Techniques for Prompt Engineering
Driven by a report that education and training are among the major drivers behind organizations adopting AI, and inspired by a paper on various prompting techniques, I’m working on a presentation for my workplace about the various ways you can prompt a generative text AI to ensure you get the content you’re looking for. Here’s a preview!
First, a bit of terminology so we’re all on the same page.
Prompt
A prompt is input used to guide the output of a generative model. The concept of prompts has been around since 2018, but has really taken off with the commercialization of ChatGPT. Prompts generally consist of text, images, or sound, but can be made up of any digitized media.
Most prompts fall into one of three types, which you’ll see used mostly with text prompts: a system prompt (instructions the model should follow about formatting, style, tone, or role), assistant prompts (representing output the model has generated), and user prompts (representing input from the end user). Other systems may have additional prompt types; image generation often has “negative prompts” to indicate what users don’t want to influence the generative model’s output.
Models generally have a specific prompt format as well that you should be aware of, although most consumer-facing UIs abstract this away. We discussed prompt formatting a bit in my previous article on Semantic Kernel.
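As a concrete illustration, here’s a minimal sketch of the chat-style message format many model APIs use. The role names and structure follow the common OpenAI-style convention; your model’s exact format may differ.

```python
# A chat-style prompt: an optional system message, then alternating
# user and assistant turns. The whole history is sent on every request.
messages = [
    {"role": "system", "content": "You are a helpful travel assistant. Answer concisely."},
    {"role": "user", "content": "What's the best month to visit Kyoto?"},
    {"role": "assistant", "content": "Early April, when the cherry blossoms bloom."},
    {"role": "user", "content": "And the second best?"},
]

roles = [m["role"] for m in messages]
print(roles)  # → ['system', 'user', 'assistant', 'user']
```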
Prompt Components
When it comes to user and system prompts (the ones generally written by a person rather than generated by the model), there are a few specific components you can use to influence the results.
- Directive / Command
- This is what you want the model to do. “Write a story about a dragon” or “Summarize this conversation”. Almost every well-formed prompt will have a directive.
- Examples
- These are examples of what you expect the generative model to do. This helps guide the behaviour and output format of the model. Notice in the example below how we end the prompt with the input that we expect the model to classify.
```
Classify these statements as positive or negative: //this is the directive
"I'm happy!": Positive //these are examples
"I'm sad": Negative
"I'm hungry": //end
```
- Output formatting
- Guiding the expected output format of the generated response. “Respond with either Yes or No” or “Return a JSON object”. This is especially important when using generative AI models with programming languages, as you want the output to be easily parsable.
- Style instructions
- Different from formatting, style instructions guide the tone of the response. “Explain this in terms a 5 year old would understand”.
- Role
- Gives context to the generated response, and can also affect style. “You are an accomplished travel blogger” or “You are a research assistant”.
- Be careful with implied relevance and hidden biases here - the generative model interprets what information is relevant to the role based on its training data. Generating a response as an “accomplished travel blogger” may give weight to factors important to an affluent Caucasian American traveler and less weight to factors that would be important to others (as an example).
- Additional Information
- Extra information the generative model can use to generate its output. Retrieval Augmented Generation is an example of this.
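The components above can be combined into a single prompt. Here’s a small illustrative prompt builder in Python; the function and parameter names are my own invention, not part of any standard API.

```python
def build_prompt(directive, examples=None, output_format=None,
                 style=None, role=None, extra_info=None):
    """Assemble a prompt string from the components described above.
    All parameter names are illustrative, not a standard API."""
    parts = []
    if role:
        parts.append(f"You are {role}.")            # role
    if style:
        parts.append(style)                          # style instructions
    if extra_info:
        parts.append(f"Context:\n{extra_info}")      # additional information
    parts.append(directive)                          # directive / command
    if output_format:
        parts.append(output_format)                  # output formatting
    for example_input, example_output in (examples or []):
        parts.append(f'"{example_input}": {example_output}')  # examples
    return "\n".join(parts)

prompt = build_prompt(
    directive="Classify these statements as positive or negative:",
    output_format="Respond with a single word, Positive or Negative.",
    examples=[("I'm happy!", "Positive"), ("I'm sad", "Negative"), ("I'm hungry", "")],
)
print(prompt)
```

Note how the examples end with the unanswered input, just like the classification example earlier, so the model’s natural continuation is the answer we want.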
Prompt Parameters
Another important thing to be aware of are the parameters that are being passed in with your prompt. Most consumer-facing UIs abstract this away, but it’s an important factor to take advantage of.
- Temperature
- Influences the randomness of the next token generated. A lower temperature will result in a less random, more predictable response.
- Lower temperatures are better for controlled outputs like classification or responding with a specific value.
- Higher temperatures are better for creative outputs, like generating fictional content
- Top P
- Filters the number of options the model considers when selecting the next token. Models assign probabilities to many candidate tokens and then choose among them; Top P (also called nucleus sampling) restricts the choice to the smallest set of most-likely tokens whose cumulative probability reaches P.
- Lower Top P (closer to 0) considers fewer options
- Response Length
- How long the model is allowed to continue before finishing. Models won’t necessarily always fill the response length, but it is a hard stop, often causing models to interrupt their output mid-sentence if they hit the limit.
- Important for controlling costs, as models generally charge per output token.
- Context Length
- The maximum number of tokens the model pays attention to when generating the next token. Generally not something you can adjust, but different models have different context lengths, which may be a consideration when selecting which model to interact with.
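To make temperature and Top P concrete, here’s a toy next-token sampler in pure Python. The candidate tokens and scores are made up for illustration; real models do the same thing over a vocabulary of tens of thousands of tokens.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Toy sampler showing how temperature and Top P interact.
    `logits` maps candidate tokens to raw (made-up) scores."""
    rng = rng or random.Random(0)
    # Temperature rescales scores before softmax: low T sharpens the
    # distribution (more predictable), high T flattens it (more random).
    scaled = {tok: score / max(temperature, 1e-6) for tok, score in logits.items()}
    total = sum(math.exp(s) for s in scaled.values())
    probs = sorted(((tok, math.exp(s) / total) for tok, s in scaled.items()),
                   key=lambda kv: kv[1], reverse=True)
    # Top P (nucleus sampling): keep the smallest set of most-likely
    # tokens whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for tok, p in probs:
        nucleus.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize within the nucleus and draw one token.
    norm = sum(p for _, p in nucleus)
    r, acc = rng.random() * norm, 0.0
    for tok, p in nucleus:
        acc += p
        if acc >= r:
            return tok
    return nucleus[-1][0]

logits = {"Positive": 3.0, "Negative": 1.0, "Banana": -2.0}
# Low temperature plus low top_p: effectively deterministic.
print(sample_next_token(logits, temperature=0.2, top_p=0.1))  # → Positive
```

With temperature 0.2, the top-scoring token dominates the distribution, and top_p of 0.1 shrinks the nucleus to that single token, so the output is the predictable “Positive”.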
That ended up being a whole bunch of information! I’ll continue with part two tomorrow.