Working with LLMs

Prompt Chaining


Learning Objectives

  • You understand what prompt chaining is and when it’s useful.
  • You can break complex tasks into sequences of simpler prompts.
  • You know how to connect prompts together effectively.

Beyond single prompts

So far, we’ve focused on crafting individual prompts — writing clear instructions, providing appropriate context, using examples effectively, and encouraging step-by-step reasoning. These techniques work well for many tasks, but some problems are simply too complex for a single prompt to handle effectively.

Consider asking a model to simultaneously analyze a customer review, extract complaints, categorize them by type, rate their severity, and generate specific actionable recommendations. Even with careful prompting, combining all these steps into one instruction often produces inconsistent results, omits important details, or generates generic responses that don’t address the specific issues identified.

Prompt chaining offers a solution: break complex tasks into a sequence of simpler prompts, where each step focuses on one clear objective and the output from one step becomes input to the next.


The basic idea

Prompt chaining works like an assembly line. Instead of asking one worker to manufacture an entire complex product alone, you have specialists at each station, each focusing on their specific task with clear inputs and outputs. Similarly, instead of one prompt trying to handle everything, you create multiple prompts, each handling a specific subtask.

A chain consists of:

  • Multiple prompts, each addressing one piece of the overall task with clear, focused instructions.

  • Data flow between prompts, where outputs from one step become inputs to the next, creating a pipeline of processing.

  • A defined sequence determining which prompt runs when, and under what conditions.

The model never sees the full complexity at once. Instead, it tackles each step individually, which generally leads to more reliable and consistent results than attempting everything simultaneously.
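
In code, this structure can be as simple as a list of prompt templates and a loop that feeds each step's output into the next prompt. The sketch below is illustrative only: the complete() function is a stand-in for whichever LLM client you use, and the templates are abbreviated.

from typing import Callable, List

def run_chain(templates: List[str], first_input: str,
              complete: Callable[[str], str]) -> str:
    """Run prompt templates in order; each output becomes the next input."""
    current = first_input
    for template in templates:
        prompt = template.format(input=current)  # splice previous output in
        current = complete(prompt)               # one focused model call per step
    return current

# Example chain: each template has an {input} slot for the previous step's output.
templates = [
    "List the specific complaints in this review, one per line:\n{input}",
    "Categorize each complaint as Delivery, Product Quality, Customer Service,"
    " Pricing, or Other:\n{input}",
]
# result = run_chain(templates, review_text, complete)  # complete() = your LLM call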


A concrete example

Let’s see how this works with customer review analysis. Suppose we have this review:

“The package arrived 3 days late, which was frustrating. When I opened it, the box was crushed and the product inside had visible scratches. I tried contacting customer service but didn’t hear back for 48 hours. The product itself works fine once I set it up, but the whole experience was disappointing.”

Attempting a single prompt

We could try asking the model to do everything at once:

Analyze this review, extract all complaints, categorize them, rate their
severity, and provide recommendations: [review text]

This often produces inconsistent results. The model might miss some complaints, mix up categories, apply severity ratings unevenly, or offer recommendations too generic to act on.

Using a chain instead

Let’s break this into four focused steps:

Step 1: Extract complaints

Read this customer review and list all specific complaints mentioned.
Focus on concrete issues, not general sentiment.

Review: [review text]

List one complaint per line.

The model responds:

- Delivery was 3 days late
- Package box was crushed
- Product had visible scratches
- Customer service didn't respond for 48 hours

Step 2: Categorize complaints

Now we take these extracted complaints and categorize them:

Categorize each complaint into one category: Delivery, Product Quality,
Customer Service, Pricing, or Other.

Complaints:
- Delivery was 3 days late
- Package box was crushed
- Product had visible scratches
- Customer service didn't respond for 48 hours

Format: [Complaint] | [Category]

The model responds:

- Delivery was 3 days late | Delivery
- Package box was crushed | Delivery
- Product had visible scratches | Product Quality
- Customer service didn't respond for 48 hours | Customer Service
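
In code, this hand-off is just string interpolation: the text returned by step 1 is pasted into the step 2 prompt before the next call. A minimal sketch, assuming a hypothetical complete() function that wraps your LLM client:

def extract_complaints(review: str, complete) -> str:
    """Step 1: ask only for complaint extraction, nothing else."""
    return complete(
        "Read this customer review and list all specific complaints mentioned.\n"
        "Focus on concrete issues, not general sentiment.\n\n"
        f"Review: {review}\n\nList one complaint per line."
    )

def categorize_complaints(complaints: str, complete) -> str:
    """Step 2: the raw output of step 1 is spliced into this prompt."""
    return complete(
        "Categorize each complaint into one category: Delivery, Product Quality,\n"
        "Customer Service, Pricing, or Other.\n\n"
        f"Complaints:\n{complaints}\n\nFormat: [Complaint] | [Category]"
    )

# complaint_list = extract_complaints(review_text, complete)
# categorized = categorize_complaints(complaint_list, complete)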

Step 3: Rate severity

Take the categorized complaints and assess their likely impact:

Rate each complaint's severity as Low, Medium, or High based on likely
impact on customer satisfaction.

Complaints:
- Delivery was 3 days late | Delivery
- Package box was crushed | Delivery
- Product had visible scratches | Product Quality
- Customer service didn't respond for 48 hours | Customer Service

Format: [Complaint] | [Category] | [Severity]

The model responds:

- Delivery was 3 days late | Delivery | Medium
- Package box was crushed | Delivery | Medium
- Product had visible scratches | Product Quality | High
- Customer service didn't respond for 48 hours | Customer Service | High

Step 4: Generate recommendations

Finally, use the analyzed data to create specific, actionable recommendations:

Based on these analyzed complaints, provide 2-3 specific recommendations
for improvement. Prioritize high-severity issues.

Analyzed complaints:
- Delivery was 3 days late | Delivery | Medium
- Package box was crushed | Delivery | Medium
- Product had visible scratches | Product Quality | High
- Customer service didn't respond for 48 hours | Customer Service | High

The model responds:

1. Implement protective packaging standards to prevent product damage,
   including corner reinforcements and interior padding
2. Establish a 24-hour response time standard for customer service, with
   automatic escalation for missed deadlines
3. Review shipping partner performance and establish backup carriers for
   critical deliveries

Each step is simpler and more focused than trying to do everything simultaneously. The chain produces more reliable results because the model can concentrate on one clear objective at each stage.
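
Put together, the whole analysis is four focused calls in a fixed order, each one consuming the previous step's output. The sketch below reuses the prompts from this chapter and, as before, assumes a hypothetical complete() function standing in for your LLM client:

def analyze_review(review: str, complete) -> str:
    """Run the four-step review-analysis chain end to end."""
    steps = [
        # Step 1: extract complaints from the raw review.
        "Read this customer review and list all specific complaints mentioned.\n"
        "Focus on concrete issues, not general sentiment.\n\n"
        "Review: {input}\n\nList one complaint per line.",
        # Step 2: categorize the extracted complaints.
        "Categorize each complaint into one category: Delivery, Product Quality,\n"
        "Customer Service, Pricing, or Other.\n\nComplaints:\n{input}\n\n"
        "Format: [Complaint] | [Category]",
        # Step 3: rate severity of the categorized complaints.
        "Rate each complaint's severity as Low, Medium, or High based on likely\n"
        "impact on customer satisfaction.\n\nComplaints:\n{input}\n\n"
        "Format: [Complaint] | [Category] | [Severity]",
        # Step 4: turn the analysis into recommendations.
        "Based on these analyzed complaints, provide 2-3 specific recommendations\n"
        "for improvement. Prioritize high-severity issues.\n\n"
        "Analyzed complaints:\n{input}",
    ]
    current = review
    for step in steps:
        current = complete(step.format(input=current))  # output feeds the next step
    return current

# recommendations = analyze_review(review_text, complete)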
