Learning Agentic AI End-to-End: Prompt Chaining
By Eder Ignatowicz (@ederign)
This post is part of the Learning Agentic AI End-to-End series.
Repository with samples: github.com/ederign/Agentic-AI-end-to-end
Note: These findings emerged from collaborative exploration with Claude (Anthropic), including hands-on implementation and documentation research.
What is Prompt Chaining?
Prompt chaining is the simplest agentic pattern: a sequence of LLM calls where the output of one step becomes the input for the next. It's the foundation for more complex patterns.
```
Input → [Prompt 1] → Output₁ → [Prompt 2] → Output₂ → ... → Final Result
```
Why Prompt Chaining?
While a single, detailed prompt can work for simpler tasks, complex workflows benefit from breaking work into sequential steps. This divide-and-conquer approach offers several advantages:
- Simpler prompts - Each step has a focused, well-defined task
- Easier debugging - You can inspect intermediate outputs between steps
- Better accuracy - Smaller, focused prompts tend to produce more reliable results
- Structured outputs - Each step can enforce specific output formats (JSON, bullet points, etc.)
Prompt chaining is particularly effective when tasks have clear sequential dependencies—where Step 2 genuinely needs the output of Step 1 to proceed.
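Before looking at any framework, it may help to see the whole pattern with none. Below is a minimal, framework-free sketch; `call_llm` is a hypothetical stand-in for whatever chat-completion call you use, and the step prompts are illustrative:

```python
from typing import Callable, List

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: replace with a real OpenAI/LlamaStack call
    return f"<model output for: {prompt[:40]}...>"

def run_chain(steps: List[Callable[[str], str]], user_input: str) -> str:
    """Run each step in order, feeding its output into the next step's prompt."""
    data = user_input
    for build_prompt in steps:
        data = call_llm(build_prompt(data))
    return data

result = run_chain(
    steps=[
        lambda text: f"Extract the key specs from:\n\n{text}",
        lambda specs: f"Format these specs as JSON:\n\n{specs}",
    ],
    user_input="The laptop has a 3.5 GHz octa-core processor, 16GB RAM...",
)
```

Every approach in this post is a variation on that loop; the frameworks differ in how the data passing and prompt templating are expressed.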
Use Cases
- Data extraction and transformation - Extract info → Clean → Format
- Multi-stage reasoning - Analyze → Critique → Refine
- Content generation pipelines - Outline → Draft → Edit → Polish
- Validation workflows - Generate → Validate → Fix if needed
Our Example: Specs Extraction
This example is adapted from the prompt chaining chapter of Antonio Gullí's Agentic Design Patterns (see References). All three implementations solve the same problem:
Input: "The laptop has a 3.5 GHz octa-core processor, 16GB RAM, and 1TB NVMe SSD"
↓
[Step 1: Extract] → Bullet points of specs
↓
[Step 2: Transform] → Structured JSON
↓
Output: {"cpu": "3.5 GHz octa-core", "memory": "16GB", "storage": "1TB NVMe SSD"}
Approach 1: OpenAI APIs (LlamaStack)
LlamaStack provides building blocks (an OpenAI-compatible API), not orchestration, so prompt chaining is implemented manually by calling the API sequentially.
Why the Responses API?
From OpenAI's blog:
"Something as approachable as Chat Completions, as powerful as Assistants, but also purpose built for multimodal and reasoning models."
LlamaStack implements the OpenAI-compatible Responses API, which means:
- You get the same powerful API for agentic workflows
- But with model freedom, data sovereignty, and no vendor lock-in
Implementation
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")
MODEL = "openai/gpt-4o-mini"

text_input = "The laptop has a 3.5 GHz octa-core processor, 16GB RAM, and 1TB NVMe SSD"

# Step 1: Extract specifications
extraction_response = client.responses.create(
    model=MODEL,
    input=(
        "Extract the technical specifications from the following text:\n\n"
        f"{text_input}\n\n"
        "Return only the extracted specs as concise bullet points."
    ),
)
specifications = extraction_response.output_text

# Step 2: Transform to JSON (using output from step 1)
transform_response = client.responses.create(
    model=MODEL,
    input=(
        "Transform the following specifications into a JSON object with "
        "'cpu', 'memory', and 'storage' as keys:\n\n"
        f"{specifications}\n\n"
        "Return only valid JSON, no markdown or extra text."
    ),
)
json_text = transform_response.output_text
```
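The transform prompt asks for raw JSON, but models occasionally wrap it in markdown fences anyway. A small defensive parse keeps the chain robust; this is a sketch on top of the code above, not part of the repo's implementation:

```python
import json

def parse_model_json(raw: str) -> dict:
    """Strip an optional markdown fence, then parse; raises on invalid JSON."""
    cleaned = raw.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence (possibly "```json") and the closing fence
        cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(cleaned)

specs = parse_model_json(json_text)
print(specs["cpu"], specs["memory"], specs["storage"])
```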
Strengths
- No framework lock-in - Standard OpenAI client
- Model portability - Swap providers via config, not code
- Data sovereignty - Run entirely in your infrastructure
- Simple code - Just API calls, no abstractions to learn
Weaknesses
- Manual orchestration - No built-in chaining abstractions
- More boilerplate - Must handle data passing explicitly
- Scaling complexity - Complex patterns require more code
Running
```bash
make llama-server  # Start LlamaStack first
make prompt-chaining-raw
```
Approach 2: LangChain + LlamaStack
LangChain uses LCEL (LangChain Expression Language) to compose chains declaratively using the | operator, with LlamaStack as the infrastructure backend.
Key Concepts
- ChatPromptTemplate - Defines prompt templates with variables
- StrOutputParser - Converts LLM output to string
- LCEL pipe (|) - Chains components together
- Dictionary mapping - Passes outputs to named variables
Implementation
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:8321/v1",  # LlamaStack
    api_key="none",
    model="openai/gpt-4o-mini",
    temperature=0,
)

# Step 1: Extract specifications
prompt_extract = ChatPromptTemplate.from_template(
    "Extract the technical specifications from the following text:\n\n{text_input}"
)

# Step 2: Transform to JSON
prompt_transform = ChatPromptTemplate.from_template(
    "Transform the following specifications into a JSON object with "
    "'cpu', 'memory', and 'storage' as keys:\n\n{specifications}"
)

# Build chain using LCEL
extraction_chain = prompt_extract | llm | StrOutputParser()
full_chain = (
    {"specifications": extraction_chain}  # Output becomes the 'specifications' variable
    | prompt_transform
    | llm
    | StrOutputParser()
)

# Execute
result = full_chain.invoke({"text_input": "The laptop has 16GB RAM..."})
```
Strengths
- Declarative syntax - Easy to read and understand flow
- Automatic data passing - LCEL handles variable mapping
- Composable - Chains can be nested and combined
- Streaming support - Built-in streaming capabilities (see the sketch after this list)
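Because LCEL chains expose the common Runnable interface, switching from a blocking call to streaming is a one-line change. A minimal sketch reusing the `full_chain` defined above:

```python
# Stream the final step's output tokens as they are generated
for chunk in full_chain.stream({"text_input": "The laptop has 16GB RAM..."}):
    print(chunk, end="", flush=True)
```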
Weaknesses
- Framework lock-in - Code tied to LangChain abstractions
- Magic behavior - Data flow can be non-obvious for complex chains
- Learning curve - Need to understand LCEL semantics
Running
```bash
make llama-server  # Start LlamaStack first
make prompt-chaining-langchain
```
Approach 3: ADK (Google Agent Development Kit)
ADK uses SequentialAgent to chain multiple LlmAgent instances, passing state between agents via a shared state dictionary. It uses LiteLLM as the model backend, so no LlamaStack server is required.
Key Concepts
- LlmAgent - Individual agent with model, instruction, and output configuration
- SequentialAgent - Runs sub-agents in order
- State dictionary - Shared context between agents
- output_key - Where agent stores its result in state
- output_schema - Pydantic model for structured output
Why Dynamic Instructions? (Lazy Evaluation)
ADK only interpolates {variables} from the initial state. Values added by previous agents aren't automatically available in template strings. The solution is to pass a function that ADK calls when the agent runs:
```python
from google.adk.agents.readonly_context import ReadonlyContext

def build_transform_instruction(ctx: ReadonlyContext) -> str:
    specs = ctx.state.get("specifications", "")  # Read at run time
    return f"Transform to JSON:\n\n{specs}"
```
Implementation
```python
from google.adk.agents import LlmAgent, SequentialAgent
from google.adk.models.lite_llm import LiteLlm
from google.adk.runners import InMemoryRunner
from pydantic import BaseModel, Field

# Wrap the model with LiteLLM so ADK can route to OpenAI-compatible backends
MODEL = LiteLlm(model="openai/gpt-4o-mini")

class SpecsJson(BaseModel):
    cpu: str = Field(description="CPU details")
    memory: str = Field(description="RAM size")
    storage: str = Field(description="Storage details")

# Step 1: Extract agent
extract_agent = LlmAgent(
    name="extract_specs_agent",
    model=MODEL,
    instruction=(
        "Extract the technical specifications from the following text:\n\n"
        "{user_text}\n\n"
        "Return only the extracted specs as concise bullet points."
    ),
    output_key="specifications",
)

# Step 2: Transform agent
def build_transform_instruction(ctx):
    specs = ctx.state.get("specifications", "")
    return f"Transform to JSON with cpu, memory, storage keys:\n\n{specs}"

transform_agent = LlmAgent(
    name="transform_specs_agent",
    model=MODEL,
    instruction=build_transform_instruction,
    output_schema=SpecsJson,
    output_key="result",
)

# Build pipeline
pipeline = SequentialAgent(
    name="specs_pipeline",
    sub_agents=[extract_agent, transform_agent],
)
```
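The pipeline above still needs a runner and a session that seeds the initial `user_text` state. Session APIs differ slightly across ADK versions, so treat the following as an outline rather than the repo's exact execution code:

```python
import asyncio
from google.genai import types

async def main() -> None:
    runner = InMemoryRunner(agent=pipeline, app_name="specs")
    # Seed session state so the extract agent's {user_text} placeholder resolves
    session = await runner.session_service.create_session(
        app_name="specs",
        user_id="user",
        state={"user_text": "The laptop has a 3.5 GHz octa-core processor, "
                            "16GB RAM, and 1TB NVMe SSD"},
    )
    async for event in runner.run_async(
        user_id="user",
        session_id=session.id,
        new_message=types.Content(role="user", parts=[types.Part(text="start")]),
    ):
        if event.is_final_response() and event.content:
            print(event.content.parts[0].text)

asyncio.run(main())
```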
Strengths
- Structured agents - Clear agent definitions with explicit configuration
- Type safety - output_schema with Pydantic validation
- State management - Built-in state passing between agents
- Reusable agents - Agents can be composed into different pipelines
Weaknesses
- More boilerplate - Async setup, sessions, runners required
- Complex execution - Need to manage sessions and async flow
- Learning curve - More concepts to understand (agents, runners, sessions)
Running
```bash
make prompt-chaining-adk  # No LlamaStack needed, uses LiteLLM
```
Comparison
Summary Table
| Aspect | OpenAI APIs | LangChain | ADK |
|---|---|---|---|
| Infrastructure | LlamaStack | LlamaStack | LiteLLM |
| Orchestration | Manual | LCEL chains | SequentialAgent |
| Data passing | Manual | Automatic via dict | State dictionary |
| Async required | No | No | Yes |
| Framework lock-in | None | High | High |
Code Comparison
```python
# OpenAI APIs - Explicit API calls
response1 = client.responses.create(model=MODEL, input=prompt1)
response2 = client.responses.create(model=MODEL, input=f"{prompt2}\n{response1.output_text}")

# LangChain - Declarative pipeline
chain = {"specifications": prompt1 | llm | parser} | prompt2 | llm | parser
result = chain.invoke({"text_input": input_text})

# ADK - Structured agents
pipeline = SequentialAgent(sub_agents=[extract_agent, transform_agent])
async for event in runner.run_async(...): ...
```
Recommendation Matrix
| Scenario | Recommended Approach |
|---|---|
| Quick prototype | LangChain |
| Production with model flexibility | LangChain + LlamaStack |
| Data sovereignty requirements | OpenAI APIs (LlamaStack) |
| Multi-agent systems | ADK |
| Minimal dependencies | OpenAI APIs (LlamaStack) |
Running All Approaches
```bash
# Start LlamaStack (required for OpenAI APIs and LangChain)
make llama-server

# Run individual approaches
make prompt-chaining-raw        # OpenAI APIs
make prompt-chaining-langchain  # LangChain
make prompt-chaining-adk        # ADK

# Compare all three
make prompt-chaining-all
```
Code Location
- packages/patterns/src/patterns/prompt_chaining/raw.py - OpenAI APIs
- packages/patterns/src/patterns/prompt_chaining/langchain.py - LangChain
- packages/patterns/src/patterns/prompt_chaining/adk.py - ADK
References
- Agentic Design Patterns by Antonio Gullí - The book that serves as the basis for this project
- Anthropic - Building Effective Agents
- OpenAI - Why We Built the Responses API
- LlamaStack GitHub
- LangChain Documentation
- Google ADK Documentation
- Red Hat - Responses API Deep Dive