Prompting and Prompt Engineering

Providing Reference Text


Learning Objectives

  • You know that the outputs of large language models can be constrained to given input text.
  • You know that large language models can be given examples of output formatting.
  • You know of the term in-context learning.

So far, our prompting has been limited to prompts where we ask the model to generate text. We’ve done some prompt engineering, looked into contextualizing prompts, and asked for output delivered by a persona or targeted at specific audiences. We have not, however, looked at providing reference text for the model to base its output on.

Answering based on given text

Large language models can be given text on which to base the content they generate. As an example, we can ask a large language model to answer questions based on a given text and, if the information is not available in that text, to respond that the information is not available.

As an example, we can take our earlier imaginary story of “Arton kahvikuppi”, and ask the model to answer a question based on the story.

Using the text after ###, answer the question "What emerged from cosmic ether?". If there is insufficient information in the text, answer with the text "Insufficient information."

###

In the beginning, the cosmos was but a vast expanse of nothingness, until Arton Kahvikuppi emerged from the cosmic ether. Arton, a being of immense power and wisdom, willed the universe into existence with a single thought. Planets coalesced, stars ignited, and life began to flourish under its guiding influence.

Arton Kahvikuppi emerged from the cosmic ether.

We can also use the same text to ask a question that cannot be answered based on it. Although the model could potentially generate an answer from its training data, that answer would not be based on the provided text, and so the model states that the information is insufficient instead.

Using the text after ###, answer the question "What is the capital of Finland?". If there is insufficient information in the text, answer with the text "Insufficient information."

###

In the beginning, the cosmos was but a vast expanse of nothingness, until Arton Kahvikuppi emerged from the cosmic ether. Arton, a being of immense power and wisdom, willed the universe into existence with a single thought. Planets coalesced, stars ignited, and life began to flourish under its guiding influence.

Insufficient information.

In a similar fashion, we could ask the model to summarize the text provided.

Summarize the text after ### in ten words.

###

In the beginning, the cosmos was but a vast expanse of nothingness, until Arton Kahvikuppi emerged from the cosmic ether. Arton, a being of immense power and wisdom, willed the universe into existence with a single thought. Planets coalesced, stars ignited, and life began to flourish under its guiding influence.

Arton Kahvikuppi created the universe from nothing with wisdom.

The ability to constrain the output to the input text is useful, as it limits the output to existing information and avoids the generation of new information. This is helpful in many applications, such as summarization and question answering.
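To make the structure of such prompts concrete, the following sketch shows how a reference-text prompt like the ones above could be assembled programmatically. The function name and the truncated reference text are illustrative; actually sending the prompt to a model is omitted.

```python
# A minimal sketch of assembling a reference-text prompt like the ones above.
# The function only builds the prompt string; model calls are left out.
def build_reference_prompt(question: str, reference: str) -> str:
    return (
        f'Using the text after ###, answer the question "{question}". '
        'If there is insufficient information in the text, answer with '
        'the text "Insufficient information."\n\n###\n\n'
        f"{reference}"
    )

prompt = build_reference_prompt(
    "What emerged from cosmic ether?",
    "In the beginning, the cosmos was but a vast expanse of nothingness, "
    "until Arton Kahvikuppi emerged from the cosmic ether.",
)
print(prompt)
```

Keeping the instruction and the reference text separated by a fixed delimiter, as above, makes it easy to swap in different questions and texts.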

Retrieval augmented generation

Retrieval augmented generation systems also answer questions based on additional text. When a user enters a prompt, such a system first searches a database or some other source for information relevant to the prompt, adds the found information to the prompt, and then sends the augmented prompt to a large language model.
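As a minimal sketch of this idea, the following toy example retrieves the most relevant document by counting shared words and then adds it to the prompt. The documents and the word-overlap scoring are illustrative assumptions; real retrieval augmented generation systems typically use vector databases and embedding-based similarity.

```python
# A toy sketch of the retrieval step in retrieval augmented generation.
# The documents and the word-overlap scoring are illustrative assumptions.
documents = [
    "Arton Kahvikuppi emerged from the cosmic ether and willed the universe into existence.",
    "Helsinki is the capital of Finland.",
]

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(docs, key=lambda d: len(query_words & set(d.lower().split())))

def build_rag_prompt(query: str) -> str:
    context = retrieve(query, documents)
    return f'Using the text after ###, answer the question "{query}".\n\n###\n\n{context}'

print(build_rag_prompt("What is the capital of Finland?"))
```

With retrieval in place, the same question that earlier received "Insufficient information" can be answered, because the relevant text is now included in the prompt.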

Providing example outputs

Above, we constrained the output to the input text; we can also provide reference output for content generation. This is done by giving the model examples of the output we wish to see and asking it to generate text based on those examples.

Zero-shot prompting

Zero-shot prompting refers to generating text without training the model for the specific task at hand; the output is based solely on the instructions describing the task (the models are still pre-trained and fine-tuned, though). The following example outlines a zero-shot prompt, where we ask the model to translate the words given to it.

Translate the following Finnish words into English.
Words: kahvi, tee, vesi

Words:

kahvi -> coffee
tee -> tea
vesi -> water

When we try the prompt again, we get a similar but not identical output. This is because the model generates the output anew each time, and the generation is not deterministic.

Translate the following Finnish words into English.
Words: kahvi, tee, vesi

Words:

kahvi - coffee
tee - tea
vesi - water

Imagine needing the output of the large language model in a very specific format that could be used as input for another system. As an example, we might want the output formatted so that each row starts with a hashtag, the word and its translation are separated by a hashtag, and each row ends with a hashtag and a smiley.

That is, the output should be something like the following:

#kahvi#coffee# :)
#tee#tea# :)
#vesi#water# :)

The above format is not CSV or any other commonly known format; large language models are rather good at producing well-known formats based on the prompt alone. With enough contextualization, the custom format above could probably also be produced from plain instructions, but providing examples is often easier.
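If the output were indeed needed as input for another system, as imagined above, it could be parsed for example as follows. This is a sketch assuming the model follows the format exactly; real model output may need more lenient handling.

```python
# A sketch of parsing the custom '#word#translation# :)' format from above
# so that another system could consume it. Assumes the format is followed
# exactly; real model output may deviate and need more lenient handling.
import re

def parse_translations(output: str) -> dict[str, str]:
    pairs = {}
    for line in output.splitlines():
        match = re.fullmatch(r"#(\w+)#(\w+)# :\)", line.strip())
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs

model_output = "#kahvi#coffee# :)\n#tee#tea# :)\n#vesi#water# :)"
print(parse_translations(model_output))
```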

One-shot prompting

One-shot prompting refers to providing the model with a single example of the output we wish to see, effectively asking it to generate text based on that example. By providing an example, we are guiding the model on what we expect the output to look like.

In the following, we first outline the task and then the words, followed by the keyword “Output:” and an example of the translated words in the above formatting. This is followed by another set of words and the keyword “Output:”. As we can see, the model picks up the formatting from the prompt and generates the output accordingly.

Translate the following Finnish words into English.

Words: yksi, kaksi
Output:
#yksi#one# :)
#kaksi#two# :)

Words: kahvi, tee, vesi
Output:

#kahvi#coffee# :)
#tee#tea# :)
#vesi#water# :)

Depending on the capabilities of the model, we can also use it to extract information from data. In the following example, we extract descriptive statistics from a provided sentence, asking the model to extract the number of words and the longest word, and to output them in a specific format.

Provide the number of words and the longest word for the following sentences:

Sentence: I bought a hamburger.
Output:
Words: 4
Longest word: hamburger

Sentence: This makes plenty of sense.
Output:

Sentence: This makes plenty of sense.
Words: 5
Longest word: plenty

As the above output shows, the model may not always behave exactly as we wish: here, the output also contains the sentence, which we did not ask for.

Few-shot prompting

Few-shot prompting extends the idea of one-shot prompting by providing the model with a few (more than one) examples of the output we wish to see. The following example continues the previous text statistics task, providing more examples as part of the prompt.

Provide the number of words and the longest word for the following sentences:

Sentence: I bought a hamburger.
Output:
Words: 4
Longest word: hamburger

Sentence: This makes plenty of sense.
Output:
Words: 5
Longest word: plenty

Sentence: Recursion in bugs -- another day in parasite.
Output:

Words: 8
Longest word: recursion
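The few-shot prompts above follow a regular structure, so they can also be built programmatically from a list of example pairs. The helper below is a sketch; its name and exact structure are illustrative assumptions.

```python
# A sketch of building a few-shot prompt from example pairs, mirroring the
# structure used in the text statistics examples above. The helper's name
# and structure are illustrative.
def build_few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    parts = [task, ""]
    for sentence, output in examples:
        parts += [f"Sentence: {sentence}", "Output:", output, ""]
    # End with the new input and a trailing "Output:" for the model to complete.
    parts += [f"Sentence: {query}", "Output:"]
    return "\n".join(parts)

prompt = build_few_shot_prompt(
    "Provide the number of words and the longest word for the following sentences:",
    [
        ("I bought a hamburger.", "Words: 4\nLongest word: hamburger"),
        ("This makes plenty of sense.", "Words: 5\nLongest word: plenty"),
    ],
    "Recursion in bugs -- another day in parasite.",
)
print(prompt)
```

Building prompts this way makes it easy to add or swap examples, which is handy when experimenting with how many examples a task needs.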

Why the 'Output:'?

The keyword “Output:” is not necessary, but it helps the model structure the output. We could just as well use hashtags (or other tokens) in the prompt to distinguish the parts of the output.

The following demonstrates the same task, but with hashtags separating the inputs and outputs.

Provide the number of words and the longest word for the following sentences:
#####
Sentence: I bought a hamburger.
#####
Words: 4
Longest word: hamburger
#####
Sentence: This makes plenty of sense.
#####
Words: 5
Longest word: plenty
#####
Sentence: Recursion in bugs -- another day in parasite.
#####

Words: 8
Longest word: recursion

In-context learning

The above examples illustrate the use of in-context learning. In-context learning, as in one-shot and few-shot prompting, is a form of temporary learning where the model learns to align its output with the provided input. The temporariness comes from the fact that the model itself does not change: the examples influence only the outputs of the present prompt.