Planning Agents
A Planning Agent is an AI Agent that, before taking any action, creates a structured plan of the steps needed to complete a task. Instead of deciding the next step one at a time, a planning agent works out the whole sequence up front, then executes each step in order.
Planning agents are essential for complex, multi-step tasks where getting the sequence of actions right matters.
Why Planning Is Important
Consider this task: "Research the top 5 AI companies, summarise each one's latest product, and create a comparison table."
Without Planning (Reactive Agent)
Step 1: Searches "top AI companies" → gets a list
Step 2: Randomly picks one, searches for its product
Step 3: Realises it needs all 5, so it searches again
Step 4: Gets confused about what's been done
Step 5: Misses some companies, duplicates others
Result: Incomplete, disorganised output
With Planning (Planning Agent)
Plan created upfront:
1. Search for the top 5 AI companies
2. For each company (loop):
   a. Search for their latest product
   b. Summarise the product in 2-3 sentences
3. Create a comparison table from all summaries
4. Present the final table
Execution: Each step runs in order → Complete, structured output
Two Main Planning Architectures
Architecture 1 — Plan-and-Execute
The agent creates the full plan first, then executes each step. The plan is generated before any tools are called.
Phase 1 (PLAN): LLM receives the task and generates a step-by-step plan
    Plan: ["Step 1: ...", "Step 2: ...", "Step 3: ..."]
Phase 2 (EXECUTE): Agent loops through each step in the plan
    For each step → calls appropriate tools → stores results
Phase 3 (SYNTHESIZE): Agent combines all results into a final answer
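The three phases can be sketched as a small driver function that knows nothing about LLMs. This is only an illustration of the control flow: `plan_and_execute` and the `fake_*` stand-ins are hypothetical names, and the stand-ins return canned strings instead of calling a model or tools.

```python
from typing import Callable, List


def plan_and_execute(
    task: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str, str], str],
    synthesizer: Callable[[str, List[str], List[str]], str],
) -> str:
    """Run the three phases in order: plan, execute, synthesize."""
    # Phase 1: the whole plan exists before any step runs.
    plan = planner(task)

    # Phase 2: run the steps in order, passing earlier results as context.
    results: List[str] = []
    context = ""
    for number, step in enumerate(plan, start=1):
        result = executor(step, context)
        results.append(result)
        context += f"\nStep {number} ({step}): {result}"

    # Phase 3: combine the per-step results into one answer.
    return synthesizer(task, plan, results)


# Hypothetical stand-ins (no LLM involved) that make the data flow visible.
def fake_planner(task: str) -> List[str]:
    return ["Look up A", "Look up B", "Compare A and B"]


def fake_executor(step: str, context: str) -> str:
    return f"done: {step}"


def fake_synthesizer(task: str, plan: List[str], results: List[str]) -> str:
    return " | ".join(results)


answer = plan_and_execute(
    "Compare A and B", fake_planner, fake_executor, fake_synthesizer
)
print(answer)  # done: Look up A | done: Look up B | done: Compare A and B
```

Passing the three phases in as callables keeps the loop testable without network access. In a real agent each callable would wrap an LLM call.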
Architecture 2 — Dynamic Planning (ReAct + Planning)
The agent creates an initial plan but can update it based on what it discovers during execution. This is more flexible than Plan-and-Execute, but harder to implement because the plan can change after every step.
Initial Plan: ["Step 1", "Step 2", "Step 3"]
After Step 1: Agent discovers Step 2 needs to be split into two steps
Updated Plan: ["Step 1 ✓", "Step 2a", "Step 2b", "Step 3"]
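One way to picture dynamic planning is a loop that hands the remaining steps to a "reviser" after every step. The sketch below is an assumption about how such a loop could look, not a reference implementation. `run_dynamic_plan`, `fake_execute`, and `fake_revise` are hypothetical names, and the reviser is hard-coded where a real agent would ask an LLM.

```python
from typing import Callable, List, Tuple

# A reviser sees the step just finished, its result, and the steps still
# pending, and returns the (possibly rewritten) list of pending steps.
Reviser = Callable[[str, str, List[str]], List[str]]


def run_dynamic_plan(
    plan: List[str],
    execute: Callable[[str], str],
    revise: Reviser,
    max_steps: int = 20,
) -> List[Tuple[str, str]]:
    """Execute a plan that may be rewritten after every step."""
    remaining = list(plan)
    completed: List[Tuple[str, str]] = []
    # max_steps guards against a reviser that keeps adding work forever.
    while remaining and len(completed) < max_steps:
        step = remaining.pop(0)
        result = execute(step)
        completed.append((step, result))
        remaining = revise(step, result, remaining)
    return completed


def fake_execute(step: str) -> str:
    return f"ok: {step}"


def fake_revise(step: str, result: str, remaining: List[str]) -> List[str]:
    # Mirror the example above: after Step 1, split Step 2 into two steps.
    if step == "Step 1" and remaining and remaining[0] == "Step 2":
        return ["Step 2a", "Step 2b"] + remaining[1:]
    return remaining


history = run_dynamic_plan(["Step 1", "Step 2", "Step 3"], fake_execute, fake_revise)
print([step for step, _ in history])
# ['Step 1', 'Step 2a', 'Step 2b', 'Step 3']
```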
Implementing a Plan-and-Execute Agent
# planning_agent.py
import os
import json
from dotenv import load_dotenv
import openai
from tools import TOOL_MAP, TOOL_DEFINITIONS
load_dotenv()
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# ─── Step 1: Planner ──────────────────────────────────────────────
PLANNER_PROMPT = """You are a task planner.
Break down the given task into clear, numbered steps.
Each step should be a specific, actionable instruction.
Format the plan as a JSON array of strings.
Example:
Task: Find the price of iPhone 15 in India and convert it to USD
Plan: [
"Search for iPhone 15 price in India",
"Get the current INR to USD exchange rate",
"Calculate the USD equivalent of the INR price",
"Present both prices clearly"
]
Only output the JSON array. Nothing else."""
def create_plan(task: str) -> list:
    """Use the LLM to create a step-by-step plan for the task."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": PLANNER_PROMPT},
            {"role": "user", "content": f"Task: {task}"}
        ],
        temperature=0.1,
        max_tokens=500
    )
    plan_text = response.choices[0].message.content
    # Strip markdown code fences if present
    plan_text = plan_text.replace("```json", "").replace("```", "").strip()
    plan = json.loads(plan_text)
    return plan
# ─── Step 2: Executor ─────────────────────────────────────────────
EXECUTOR_PROMPT = """You are a task executor.
Execute the given step using the available tools if needed.
If a tool is required, call it.
Return the result of executing this step clearly and concisely.
Available tools: web_search, calculate"""
def execute_step(step: str, context: str) -> str:
    """Execute a single step in the plan using tools if needed."""
    messages = [
        {"role": "system", "content": EXECUTOR_PROMPT},
        {"role": "user", "content": f"""
Context from previous steps:
{context}
Current step to execute: {step}
Execute this step now."""}
    ]
    for _ in range(3):  # Up to 3 model turns per step (each may request tools)
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=TOOL_DEFINITIONS,
            tool_choice="auto",
            temperature=0.2,
            max_tokens=600
        )
        message = response.choices[0].message
        messages.append(message)
        if message.tool_calls:
            for tool_call in message.tool_calls:
                tool_name = tool_call.function.name
                tool_args = json.loads(tool_call.function.arguments)
                if tool_name in TOOL_MAP:
                    result = TOOL_MAP[tool_name](**tool_args)
                else:
                    result = json.dumps({"error": "Unknown tool"})
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    # Tool message content must be a string
                    "content": str(result)
                })
        else:
            return message.content or ""  # Step complete
    # The model was still requesting tools after the last allowed turn
    return "Step could not be completed"
# ─── Step 3: Synthesizer ──────────────────────────────────────────
SYNTHESIZER_PROMPT = """You are a synthesis expert.
Given a task and the results from executing each step of a plan,
create a clear, well-structured final answer for the user.
Make the response concise and actionable."""
def synthesize_results(task: str, plan: list, results: list) -> str:
    """Combine all step results into a final coherent answer."""
    step_results = "\n".join([
        f"Step {i+1}: {plan[i]}\nResult: {results[i]}"
        for i in range(len(plan))
    ])
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYNTHESIZER_PROMPT},
            {"role": "user", "content": f"""
Original Task: {task}
Step-by-Step Results:
{step_results}
Please synthesise these results into a final, clear answer."""}
        ],
        temperature=0.3,
        max_tokens=800
    )
    return response.choices[0].message.content
# ─── Main Planning Agent ──────────────────────────────────────────
def run_planning_agent(task: str) -> str:
    print(f"\n{'='*55}")
    print(f"Task: {task}")
    print(f"{'='*55}")
    # Phase 1: Create plan
    print("\n📋 CREATING PLAN...")
    plan = create_plan(task)
    for i, step in enumerate(plan):
        print(f"  {i+1}. {step}")
    # Phase 2: Execute each step
    print("\n⚙️ EXECUTING STEPS...")
    results = []
    context = ""
    for i, step in enumerate(plan):
        print(f"\n  Executing Step {i+1}: {step}")
        result = execute_step(step, context)
        results.append(result)
        context += f"\nStep {i+1} ({step}): {result}"
        print(f"  Result: {result[:200]}...")
    # Phase 3: Synthesize
    print("\n✅ SYNTHESIZING FINAL ANSWER...")
    final_answer = synthesize_results(task, plan, results)
    print(f"\n{'='*55}")
    print("FINAL ANSWER:")
    print(final_answer)
    print(f"{'='*55}\n")
    return final_answer
# Test
if __name__ == "__main__":
    run_planning_agent(
        "What is LangChain and what are its three main use cases?"
    )
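The listing imports `TOOL_MAP` and `TOOL_DEFINITIONS` from a `tools` module that is not shown in this section. The sketch below is one guess at what a minimal `tools.py` could contain, so treat it as an assumption. `web_search` is a placeholder that returns a stub instead of calling a search API, and `calculate` only handles basic arithmetic. The schema follows the function-tool format that the Chat Completions `tools` parameter accepts.

```python
# tools.py (hypothetical sketch; the real module is not shown in the article)
import ast
import json
import operator

# Arithmetic operators that calculate() is allowed to evaluate.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.operand))
    raise ValueError("Unsupported expression")


def calculate(expression: str) -> str:
    """Evaluate basic arithmetic without eval(); returns a JSON string."""
    try:
        value = _evaluate(ast.parse(expression, mode="eval").body)
        return json.dumps({"expression": expression, "result": value})
    except (ValueError, SyntaxError, ZeroDivisionError) as exc:
        return json.dumps({"error": str(exc)})


def web_search(query: str) -> str:
    """Placeholder: returns a stub instead of calling a real search API."""
    return json.dumps({"query": query, "results": [], "note": "stub, no backend"})


TOOL_MAP = {"web_search": web_search, "calculate": calculate}


def _function_tool(name: str, description: str, param: str) -> dict:
    """Build a one-string-parameter function tool definition."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {param: {"type": "string"}},
                "required": [param],
            },
        },
    }


TOOL_DEFINITIONS = [
    _function_tool("web_search", "Search the web for a query.", "query"),
    _function_tool("calculate", "Evaluate an arithmetic expression.", "expression"),
]

print(calculate("2 * (3 + 4)"))
```

Both tools return JSON strings, which matches how the executor passes a tool's result back as the `content` of a `tool` message.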
Sample Output
=======================================================
Task: What is LangChain and what are its three main use cases?
=======================================================
📋 CREATING PLAN...
1. Search for what LangChain is
2. Search for LangChain main use cases
3. Summarise and combine the findings
⚙️ EXECUTING STEPS...
Executing Step 1: Search for what LangChain is
Result: LangChain is an open-source framework for building applications with LLMs...
Executing Step 2: Search for LangChain main use cases
Result: LangChain is commonly used for: RAG (retrieval-augmented generation),
AI Agents, and automated pipelines...
Executing Step 3: Summarise and combine the findings
Result: Combined summary prepared...
✅ SYNTHESIZING FINAL ANSWER...
FINAL ANSWER:
LangChain is an open-source Python framework that simplifies building applications
powered by Large Language Models (LLMs).
Its three main use cases are:
1. RAG (Retrieval-Augmented Generation): Connecting LLMs to external documents
and knowledge bases for accurate, grounded answers.
2. AI Agents: Building autonomous agents that can reason, plan, and use tools.
3. Automated Pipelines: Chaining multiple LLM calls, tools, and logic into
automated workflows with minimal code.
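The planner in the listing above trusts the model to return a clean JSON array and passes the reply straight to `json.loads`. A slightly more defensive parser, sketched here under the assumption that the reply contains exactly one array, could look like this. `parse_plan` is a hypothetical helper and is not part of the original code.

```python
import json
import re
from typing import List


def parse_plan(raw: str) -> List[str]:
    """Extract a JSON array of non-empty strings from a model reply."""
    # Take the span from the first '[' to the last ']' so that code fences
    # or stray prose around the array are ignored.
    match = re.search(r"\[.*\]", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON array found in planner output")
    plan = json.loads(match.group(0))  # JSONDecodeError is a ValueError
    if not isinstance(plan, list) or not plan:
        raise ValueError("Plan must be a non-empty JSON array")
    if not all(isinstance(step, str) and step.strip() for step in plan):
        raise ValueError("Every plan step must be a non-empty string")
    return [step.strip() for step in plan]


fence = "`" * 3  # build the markdown fence without writing it literally
reply = f'{fence}json\n["Search for X", " Summarise X "]\n{fence}'
print(parse_plan(reply))  # ['Search for X', 'Summarise X']
```

Raising `ValueError` on a malformed plan lets the caller decide whether to retry the planner call or stop. The original code would fail with an unhandled `json.JSONDecodeError` in the same situation.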
When to Use a Planning Agent
| Use Planning Agent When... | Use Simple ReAct Agent When... |
|---|---|
| Task has 4+ distinct steps | Task can be done in 1–3 tool calls |
| Order of steps matters critically | Task is open-ended exploration |
| Result quality depends on structure | Real-time responsiveness is needed |
| Task has independent steps that can run in parallel | Task is conversational |
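The table mentions parallel execution, although the agent built in this section runs its steps one after another. When a plan is known up front, steps that do not depend on each other (such as the per-company lookups in the opening example) can be dispatched concurrently. The sketch below shows one possible approach with the standard library's thread pool. `run_independent_steps` and `fake_lookup` are hypothetical names, and `fake_lookup` stands in for an LLM or tool call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_independent_steps(
    steps: List[str],
    execute: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Run steps that do not depend on each other in a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input steps,
        # regardless of which call finishes first.
        return list(pool.map(execute, steps))


def fake_lookup(step: str) -> str:
    # Stand-in for a slow, I/O-bound call such as a web search.
    return f"summary for {step}"


companies = ["Company A", "Company B", "Company C"]
summaries = run_independent_steps(companies, fake_lookup)
print(summaries)
# ['summary for Company A', 'summary for Company B', 'summary for Company C']
```

Threads suit this case because LLM and search calls spend most of their time waiting on the network. Steps that need an earlier step's result still have to run in sequence.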
Summary
Planning Agents break complex tasks into structured, executable steps before taking any action. The Plan-and-Execute architecture separates planning from execution: a dedicated Planner LLM call produces the steps, an Executor runs each step with the tools it needs, and a Synthesizer combines the results into the final output. This separation keeps multi-part tasks organised and makes their results more reliable, which is why planning is a common building block in production AI automation workflows.
