Developer Tips in AI Prompt Engineering
The pay range for prompt engineering jobs runs from $212,000 to $335,000 for work that basically requires a firm grasp of English, or so you’re led to believe reading about this new technology career.
But I discovered it’s a bit more complicated than that while taking DeepLearning.AI and OpenAI’s free course on prompt engineering for developers. The course showed that developer skills, such as Python and familiarity with Jupyter Notebook, are key to fine-tuning AI prompts and ensuring the model outputs usable code.
“The power of large language models as a developer tool, that is using APIs to allow us to quickly build software applications, I think that is still very under-appreciated,” said course instructor Andrew Ng, founder of DeepLearning.AI, co-founder of Coursera, and an adjunct professor at Stanford University’s Computer Science Department. “In fact, my team at AI Fund, which is a sister company to DeepLearning.AI, has been working with many startups on applying these technologies to many different applications. And it’s been exciting to see what APIs can enable developers to very quickly build.”
Isabella Fulford, a member of OpenAI’s technical staff, joined Ng as a co-instructor in the hour-long course (tinkering time may vary). The course distinguishes between base LLMs and instruction-tuned LLMs. Base LLMs have been trained to predict the next word based on text training data, and are often trained on large amounts of data from the internet and other sources to figure out the most likely word to follow. Instruction-tuned LLMs have been trained to follow instructions and answer questions, and that is where a lot of the momentum of LLM research and practice has been going, Ng said.
Instruction-tuned LLMs (the focus of this course) are trained to be helpful, honest and harmless, Ng said, so they’re less likely to output problematic text, such as toxic outputs, compared to base LLMs.
First Principle: Specific Is Better
The first principle for AI development Ng explored was how to give instructions to an LLM.
“When you use an instruction-tuned LLM, think of giving instructions to another person — say someone who’s smart, but doesn’t know the specifics of your task,” Ng said. “When an LLM doesn’t work, sometimes it’s because the instructions weren’t clear enough. For example, if you were to say, please write me something about Alan Turing. Well, in addition to that, it can be helpful to be clear about whether you want the text to focus on his scientific work or his personal life or his role in history or something else.”
It also helps to specify what tone you want the answer in, he said. You might want a professional journalism tone or prefer a more casual tone for a friend. You can also supply any text snippets the AI should leverage to create its draft.
“You should express what you want a model to do by providing instructions that are as clear and specific as you can possibly make them,” Fulford said. “This will guide the model towards the desired output and reduce the chance that you get an irrelevant or incorrect response.”
Clear writing doesn’t necessarily mean creating a short prompt, as in many cases longer prompts actually provide more clarity and context for the model, leading to more detailed and relevant outputs, she added. She outlined several tactics to create specific prompts.
“The first tactic to help you write clear and specific instructions is to use delimiters to clearly indicate distinct parts of the input,” Fulford said, pasting an example into a Jupyter Notebook. “Delimiters can be any clear punctuation that separates specific pieces of text from the rest of the prompt — these could be backticks, you could use quotes, you could use XML tags, section titles, anything that makes it clear to the model that this is a separate section,” she said. “Using delimiters is also a helpful technique to try and avoid prompt injections; and what a prompt injection is, is if a user is allowed to add some input into your prompt, they might give conflicting instructions to the model that might make it follow the user’s instructions, rather than doing what you wanted it to do.”
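To make the delimiter tactic concrete, here is a minimal sketch in Python using OpenAI’s chat API. The get_completion helper, the gpt-3.5-turbo model choice, the <text> tags and the sample text are illustrative assumptions, not taken from the course notebook:

```python
# Minimal sketch of the delimiter tactic with OpenAI's Python client (v1+).
# The helper name, model choice and sample text are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def get_completion(prompt, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # low temperature for more predictable output
    )
    return response.choices[0].message.content

text = (
    "Prompt engineering salaries make headlines, but the course argues that "
    "developer skills still matter when building on top of LLM APIs."
)

# The <text> tags delimit the content to summarize, so anything inside them
# is treated as data rather than as new instructions, which also helps
# against prompt injection.
prompt = f"""
Summarize the text delimited by <text> tags in a single sentence.
<text>{text}</text>
"""
print(get_completion(prompt))
```

The sketches that follow reuse this get_completion helper rather than repeating the client setup.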
“The next tactic is to ask for a structured output,” she continued. “To make parsing the model outputs easier, it can be helpful to ask for a structured output like HTML or JSON.”
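As a sketch of the structured-output tactic, reusing the get_completion helper above: the made-up book-list task and the JSON keys are illustrative, and the point is simply that JSON output can be parsed directly.

```python
import json

# Sketch of asking for structured output (reuses get_completion from the
# earlier sketch). The task and the JSON keys are illustrative.
prompt = """
Generate a list of three made-up book titles along with their authors and genres.
Provide them as a JSON array of objects with the keys: book_id, title, author, genre.
Return only the JSON, with no surrounding text.
"""
raw = get_completion(prompt)
books = json.loads(raw)  # structured output can go straight into a parser
for book in books:
    print(f'{book["title"]} by {book["author"]} ({book["genre"]})')
```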
“The next tactic is to ask the model to check whether conditions are satisfied. So if the task makes assumptions that aren’t necessarily satisfied, then we can tell the model to check these assumptions first, and then if they’re not satisfied, indicate this and stop short of a full task completion attempt,” she said.
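Here is a sketch of that condition-check tactic, again reusing get_completion; the tea-making text and the fallback phrase are illustrative stand-ins:

```python
# Sketch of the "check conditions first" tactic (reuses get_completion).
# The sample text and the fallback phrase are illustrative.
text = """
Making a cup of tea is easy! First, get some water boiling. While that's
happening, grab a cup and put a tea bag in it. Once the water is hot enough,
pour it over the tea bag. After a few minutes, take out the tea bag and enjoy.
"""

prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, rewrite them as numbered steps.
If it does not contain a sequence of instructions, write "No steps provided."
and nothing else.

\"\"\"{text}\"\"\"
"""
print(get_completion(prompt))
```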
“Our final tactic for this principle is what we call few-shot prompting, and this is just providing examples of successful executions of the task you want performed before asking the model to do the actual task you want it to do,” she said.
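A sketch of few-shot prompting, reusing get_completion: two labeled examples demonstrate the task and the output format before the model sees the real input. The reviews are invented for illustration:

```python
# Sketch of few-shot prompting (reuses get_completion): the labeled examples
# show the task and the expected format before the final, unlabeled input.
prompt = """
Classify the sentiment of each review as Positive or Negative.

Review: "Setup was painless and it worked on the first try."
Sentiment: Positive

Review: "The battery died within an hour and support never replied."
Sentiment: Negative

Review: "The interface is clunky, but the export feature saved me hours."
Sentiment:
"""
print(get_completion(prompt))
```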
Second Principle: Give the Model Time to ‘Think’
It’s also important to give the LLM time to “think,” Fulford said.
“If a model is making reasoning errors by rushing to an incorrect conclusion, you should try reframing the query to request a chain or series of relevant reasoning before the model provides its final answer,” she said. “Another way to think about this is that if you give a model a task that’s too complex for it to do in a short amount of time, or in a small number of words, it may make up a guess which is likely to be incorrect.”
This error would also happen to a person if you asked a complex math question without time to work out the answer, she said — they would likely make a mistake.
“In these situations, you can instruct the model to think longer about a problem, which means it’s spending more computational effort on the task,” she said. For example, she asked the model to determine if a student’s answer to a math problem was correct and it determined it was — except that the answer wasn’t right.
A better approach, she explained, was to have the model itself solve the problem and then compare it with the student’s solution, an approach that indeed revealed the student had the incorrect answer.
“The model has just agreed with the student because it skim-read it in the same way that I just did,” Fulford said. “We can fix this by instructing the model to work out its own solution first, and then compare its solution to the student’s solution.”
That obviously requires a longer, more involved prompt than simply asking if the student’s answer is correct.
Some tactics for giving the model time to think (a prompt sketch follows the list):
- Specify the steps required to complete a task.
- Instruct the model to work out its own solution before rushing to a conclusion.
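Here is a sketch that combines both tactics, reusing get_completion; the gym-membership problem and the student’s deliberately wrong answer are invented for illustration:

```python
# Sketch of "specify the steps" plus "work out your own solution first"
# (reuses get_completion). The problem and the student's answer are invented;
# the student's total omits the signup fee, so there is an error to catch.
prompt = """
Determine whether the student's solution is correct.
Follow these steps:
1. Work out your own solution to the problem.
2. Compare your solution to the student's solution.
3. Only then state whether the student's solution is correct.

Problem: A gym membership costs $40 per month plus a one-time $50 signup fee.
How much does the first year cost in total?

Student's solution: 12 * 40 = 480, so the first year costs $480 in total.
"""
print(get_completion(prompt))
```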
Reducing Hallucinations
One LLM limitation is hallucinations, in which the AI makes up something that sounds plausible but isn’t actually correct.
“Even though the language model has been exposed to a vast amount of knowledge during its training process, it has not perfectly memorized the information […] and so it doesn’t know the boundary of its knowledge very well,” Fulford said. “This means that it might try to answer questions about obscure topics and can make things up that sound plausible but are not actually true.”
One way to reduce hallucinations is to ask the model to first find any relevant quotes from the text and then ask it to use those quotes to answer questions, she said. “Having a way to trace the answer back to [a] source document is often pretty helpful to reduce these hallucinations,” she added.
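As a sketch of that quote-first approach, reusing get_completion and an invented source document:

```python
# Sketch of grounding answers in quoted source text to reduce hallucinations
# (reuses get_completion). The document below is invented for illustration.
document = """
Base LLMs are trained to predict the next word from large amounts of text.
Instruction-tuned LLMs are further trained to follow instructions and answer
questions, and to be helpful, honest and harmless.
"""

prompt = f"""
Answer the question using only the document delimited by <document> tags.
First list the quotes from the document that are relevant to the question,
then answer using only those quotes. If the answer is not in the document,
reply "I could not find this in the document."

Question: What are instruction-tuned LLMs trained to do?

<document>{document}</document>
"""
print(get_completion(prompt))
```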
These are just a few key principles from the course, which also explores iterative prompt development; summarizing text with a focus on specific topics; inferring sentiment and topics from product reviews and news articles; and transformation tasks such as language translation, spelling and grammar checking, tone adjustment and format conversions. The final lesson shows how to create a chatbot with OpenAI.
Each video lesson is accompanied by examples and prompts that developers can try out and tinker with on their own. There’s also a community section that you can join for further help.