July 18, 2024 · Professional Writing, Work

Improve your LLM output with prompt engineering

As generative AI has become more widely available, we’ve seen how widely the quality of results can vary. In fact, there are several components that impact result quality, such as which large language model (LLM) you’re using, model size, and prompt quality.

As a technical writer, I’d say writing a quality prompt is close to a summary of a key part of my job: ask the right questions of the right people to get the best possible information to shape what I produce. I go to the subject matter expert (SME), who may be an engineer, a product manager, a UX researcher, or another technical writer.

In my day-to-day work for Google Chrome, some of my questions could be asked of an LLM, but that requires the model to have pre-existing information or to be given the context within the prompt itself.

Prompt quality is impacted by the level of detail provided, the accuracy of the initial information, the description of the expected format, and much more. Prompt engineering is the practice of asking better questions to generate the best possible response for your needs.

If you’re wondering how to be a better prompt engineer (or a better tech writer!), keep reading to learn best practices.

In the following examples, I’ve used Gemini. Your results may vary if you replicate these prompts, even with the same model.

Writing technique

Before we dive into more advanced techniques, let’s review the type of prompting you may already do: direct prompting, sometimes referred to as zero-shot prompting.

Direct prompting gives the model only an instruction, possibly some context, and no example output. You could use direct prompting to generate ideas, organize data, or engage in conversation.

For example: “Can you share a definition of software engineering for a five-year-old child?”

Or, “Create a table of the top five programming languages, average rate of adoption by engineers, and top sites using that language.”

You could copy and paste either of those prompts into Gemini or ChatGPT’s web interface. Or, you could use your own implementation of an AI model, such as Gemini on Google Cloud. For example, let’s say you want to prompt Gemini with Node.js and receive streamed responses.

// Uses the Vertex AI Node.js SDK; the project ID and location are placeholders.
const {VertexAI} = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({project: 'your-project-id', location: 'us-central1'});
const generativeModel = vertexAI.getGenerativeModel({model: 'gemini-1.5-pro'});

const role = 'user';
const promptText = 'Create a table of the top five programming languages, average rate of adoption by engineers, and top sites using that language.';

async function streamGenerateContent() {
  const request = {
    contents: [{role: role, parts: [{text: promptText}]}],
  };
  // Stream chunks of the response as they're generated.
  const streamingResult = await generativeModel.generateContentStream(request);
  for await (const item of streamingResult.stream) {
    console.log('stream chunk: ', JSON.stringify(item));
  }
  // The full response is also available once streaming completes.
  const aggregatedResponse = await streamingResult.response;
  console.log('aggregated response: ', JSON.stringify(aggregatedResponse));
}
streamGenerateContent();

Prompt with examples (multi-shot)

Examples often give an LLM the context it needs to create a result you find useful. A shot is an example; one-shot is one example, few-shot is a few examples, and so on. Zero-shot prompting is, as you’d expect, zero examples.

Multi-shot prompting is particularly helpful when you need a result to follow a certain pattern or be output in a specific format, such as styled in Markdown.

In the following one-shot prompt, we ask Gemini to rate the difficulty of three different types of engineering jobs. To ensure we get the right output, the example provided offers a specific format and structure for the ratings.

Analyze three different types of engineer (civil engineer, software engineer, and aerospace engineer). Rate these jobs for difficulty based on education requirements, job availability, and job happiness. The rating should be an integer between 1 and 5. 1 is the least difficult, and 5 is the most difficult. The output should include the title with a list of ratings. For example: 
"Chemical Engineer.
* Education requirements: 5.
* Job availability: 2.
* Job happiness: 3.
* Overall difficulty: 4"

The output from Gemini is structured as requested.

Civil Engineer
* Education requirements: 4.
* Job availability: 3.
* Job happiness: 3.
* Overall difficulty: 3
Software Engineer
* Education requirements: 3.
* Job availability: 5.
* Job happiness: 4.
* Overall difficulty: 3
Aerospace Engineer
* Education requirements: 5.
* Job availability: 2.
* Job happiness: 4.
* Overall difficulty: 4
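
The example above embeds the sample answer directly in the prompt text. If you’re calling the model programmatically, another way to supply shots is as prior user and model turns in the request. Here’s a minimal sketch, assuming the same generativeModel setup from the earlier Node.js snippet.

// A hypothetical few-shot request: the example answer is passed as a prior
// model turn instead of being embedded in the prompt string.
async function fewShotGenerateContent() {
  const request = {
    contents: [
      {role: 'user', parts: [{text: 'Rate the difficulty of this job: Chemical Engineer.'}]},
      {role: 'model', parts: [{text: 'Chemical Engineer.\n* Education requirements: 5.\n* Job availability: 2.\n* Job happiness: 3.\n* Overall difficulty: 4'}]},
      {role: 'user', parts: [{text: 'Rate the difficulty of these jobs: civil engineer, software engineer, and aerospace engineer.'}]},
    ],
  };
  const result = await generativeModel.generateContent(request);
  // The generated text lives in the first candidate's content parts.
  console.log(result.response.candidates[0].content.parts[0].text);
}
fewShotGenerateContent();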

Chain-of-thought prompting

When you use Chain of Thought (CoT) prompts, you encourage the LLM to explain why it gave the response. In combination with multi-shot prompting, you’re likely to generate better results for complicated tasks.

In the previous example, we asked for difficulty ratings for different engineering jobs. But why did our model rate the difficulty in that way? Let’s find out.

Analyze three different types of engineer (civil engineer, software engineer, and aerospace engineer). Rate these jobs for difficulty based on education requirements, job availability, and job happiness. The rating should be an integer between 1 and 5. 1 is the least difficult, and 5 is the most difficult. The output should be in list format and offer a reason for the rating. For example: 
"Chemical Engineer.
* Education requirements: 5. This job requires an advanced degree and in-depth understanding of chemistry and physics.
* Job availability: 2. Job availability can be more limited, often tied to specific industries.
* Job happiness: 3. Job satisfaction can vary based on the specific role and company.
* Overall difficulty: 4"

The output from Gemini is structured as requested, with the job title on top and a bullet list of difficulty ratings with reasons for those ratings.

Civil Engineer
* Education requirements: 4. Typically requires a bachelor's degree in civil engineering, with advanced degrees often needed for specialized roles.
* Job availability: 3. Generally good job prospects due to the constant need for infrastructure development and maintenance, but can vary by location.
* Job happiness: 3. Job satisfaction can vary based on project type and work-life balance, but there's often a sense of contributing to the community.
* Overall difficulty: 3
Software Engineer
* Education requirements: 3. While a degree is beneficial, many enter the field with coding bootcamps or self-taught skills.
* Job availability: 5. High demand for software engineers across various industries, leading to numerous job opportunities.
* Job happiness: 4. Often involves problem-solving, creativity, and continuous learning, leading to high job satisfaction.
* Overall difficulty: 3
Aerospace Engineer
* Education requirements: 5. Typically requires an advanced degree (master's or PhD) due to the complexity of the field and high-level mathematical and scientific knowledge needed.
* Job availability: 2. Can be competitive due to the specialized nature of the work and the reliance on government and aerospace industry contracts.
* Job happiness: 4. Working on cutting-edge technology and contributing to space exploration can be highly rewarding.
* Overall difficulty: 4

Customize your model

The questions you ask and the context you offer are the most important parts of generating a successful output. But they’re not the only tools at your disposal. You can also influence the model generation process itself.

Prompting occurs in a context window, which is made up of tokens. Tokens can be entire words or parts of words, images, videos, audio, or code. As the number of available tokens increases, so does the volume of information the model can process in a given prompt, leading to a more consistent, relevant, and useful response.

For example, Gemini 1.5 Pro supports a context window of up to 1 million tokens in production.
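
If you want to see how much of that window a prompt will consume before you send it, the Vertex AI Node.js SDK includes a countTokens method. A quick sketch, reusing the generativeModel and promptText from the earlier snippet:

async function countPromptTokens() {
  const request = {
    contents: [{role: 'user', parts: [{text: promptText}]}],
  };
  // Counts tokens for the request without generating a response.
  const tokenInfo = await generativeModel.countTokens(request);
  console.log('total tokens: ', tokenInfo.totalTokens);
}
countPromptTokens();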

When running the model, you can adjust the temperature and topK, which directly affect how tokens are selected and thus influence the output. With Gemini, you can do so by modifying generationConfig.

Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or less creative response, while higher temperatures can lead to more diverse or creative results. If your model returns a response that’s too generic, too short, or a default response, try increasing the temperature.

Top-K changes how the model selects tokens for the output. Specify a lower value for less random responses and a higher value for more random responses.
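
As a minimal sketch, here’s how you might set both values when creating the model with the Vertex AI Node.js SDK, reusing the vertexAI client from the earlier snippet; the numbers are illustrative, not recommendations.

const tunedModel = vertexAI.getGenerativeModel({
  model: 'gemini-1.5-pro',
  generationConfig: {
    temperature: 0.2, // lower temperature: less random token selection
    topK: 10,         // lower top-K: sample from fewer candidate tokens
  },
});
// Then call tunedModel.generateContent() or generateContentStream() as before.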

Iterate, iterate, iterate

Ultimately, you’ll write better prompts by rewriting your prompts. Like most skills, you’ll improve with practice. And it’s likely models will also improve with time and gain a better understanding of what results are expected for different types of requests.

At this point, models cannot read your mind, so if you don’t define the type of output you want, you can’t expect to always get a perfect answer!

If you’re stuck on how to iterate and want to improve your results, you may find the following tips helpful:

  • Name the preferred output format (such as Markdown, JSON, or CSV). You can also include an example formatted the way you want (see the sketch after this list).
  • Use synonyms to retrieve different results. For example, instead of asking for a summary, ask for the key takeaways.
  • Repeat yourself for emphasis. If you want to make sure the model knows some text is important, repeating a sentence may help.
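
For the first tip, naming the format in the prompt often does the job. As a rough sketch, newer Gemini models on Vertex AI can also be asked to return JSON via generationConfig, assuming your model and SDK version support responseMimeType; the field names in the prompt below are just illustrations.

const jsonPrompt = 'List the top five programming languages as a JSON array of ' +
    'objects with "language" and "averageAdoption" fields.';

const jsonModel = vertexAI.getGenerativeModel({
  model: 'gemini-1.5-pro',
  // Request JSON output directly; responseMimeType is assumed to be
  // available in your model and SDK version.
  generationConfig: {responseMimeType: 'application/json'},
});
// Then prompt as before: jsonModel.generateContent({contents: [...]}).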

If changes to your copy don’t help, you can always try a different model. It’s possible another model may provide results closer to your expected output. That doesn’t necessarily mean the results themselves are better or worse, but some models may be better able to interpret your expected output.
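
With the earlier Vertex AI setup, for example, switching models is a one-line change (gemini-1.5-flash is just one possible alternative):

const flashModel = vertexAI.getGenerativeModel({model: 'gemini-1.5-flash'});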

Next steps

At its core, prompt engineering is asking quality questions. And with those skills, you’re better equipped to do just about anything. If you’re looking for some ideas for how to use generative AI, the Vertex AI prompt gallery is a great place to start.

If you’ve found any other strategies to get better outputs or found any libraries particularly helpful, share them with me on BlueSky at @alexandrascript.com or with my team on X at @ChromiumDev.