Memory in AI Agents
By default, every time a new conversation starts with an LLM, it begins completely fresh — it remembers nothing from past sessions. For a simple question-answer tool, this is fine. But for an AI Agent that needs to learn about a user over time, track ongoing tasks, or recall past interactions, memory is absolutely essential.
Memory is what gives an agent the ability to "remember" — making it feel persistent, intelligent, and context-aware.
Why Memory Matters
Consider this conversation without memory:
Session 1:
User: "My name is Arjun and I prefer responses in Hindi."
Agent: "Sure Arjun, I'll respond in Hindi from now on."

Session 2 (next day):
User: "Can you summarise yesterday's meeting notes?"
Agent: "Sure! Who are you and what notes would you like me to summarise?"
The agent forgot everything. Now with memory:
Session 2 (next day): Agent: "Welcome back, Arjun! यहाँ कल की मीटिंग का सारांश है..."
Memory makes the agent genuinely useful over time.
The Four Types of Agent Memory
Type 1 — In-Context Memory (Short-Term)
In-Context Memory is the simplest form. It is the entire conversation history stored within the LLM's context window during a single session.
Every message (user and agent) is included in the prompt sent to the LLM on each turn. The model itself keeps no state between calls, so this resent history is what gives the agent "short-term memory" for the current conversation.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "My name is Priya."},
    {"role": "assistant", "content": "Nice to meet you, Priya!"},
    {"role": "user", "content": "What is my name?"},  # ← New message
]
# The LLM sees the entire history → answers "Your name is Priya"
Strengths and Limitations
| Strengths | Limitations |
|---|---|
| Simple to implement | Lost when the session ends |
| No retrieval logic needed | Limited by context window size |
| No storage needed | Very long conversations use more tokens (higher cost) |
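The context-window limit in the table above is usually handled by trimming old messages before each call. The following is a minimal sketch of that idea, not a definitive implementation: `estimate_tokens` and `trim_history` are hypothetical helpers, and the "about 4 characters per token" rule is a rough assumption rather than a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep system messages plus the most recent messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m["content"]) for m in system)
    kept = []
    # Walk backwards so the newest messages are considered first.
    for message in reversed(rest):
        cost = estimate_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "My name is Priya."},
    {"role": "assistant", "content": "Nice to meet you, Priya!"},
    {"role": "user", "content": "What is my name?"},
]

# With a deliberately tiny budget, only the system message and the
# latest user message survive.
trimmed = trim_history(messages, max_tokens=15)
```

With this small budget, the message containing "Priya" is dropped, so the model could no longer answer the question. That is the in-context limitation in miniature, and it is the gap the other memory types are meant to fill.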
Type 2 — External Memory (Long-Term)
External Memory stores information outside the LLM — in a database, file system, or vector store. This memory persists across sessions.
When a new session starts, relevant memories are retrieved from the database and injected into the prompt, so the agent "remembers" past context.
Architecture:
[User says something important]
↓
[Agent saves it to memory database]
↓
[Next session starts]
↓
[Agent retrieves relevant memories]
↓
[Memories injected into system prompt]
↓
[Agent responds with full context]
Example
Memory Database (stored between sessions):
{
  "user_id": "arjun_001",
  "memories": [
    "Prefers responses in Hindi",
    "Works as a software engineer",
    "Interested in machine learning",
    "Asked about Python courses on 14 Jan"
  ]
}
New session system prompt:
"You are a helpful assistant. Here is what you know about this user:
- Prefers responses in Hindi
- Software engineer interested in machine learning"
Strengths and Limitations
| Strengths | Limitations |
|---|---|
| Persists across sessions | Requires extra storage infrastructure |
| Scales to large amounts of memory | Need logic to decide what to store |
| Can store user preferences, history | Retrieving the right memory at the right time is complex |
Type 3 — Episodic Memory
Episodic Memory stores specific events or experiences — like a diary. The agent records what happened, when it happened, and what the outcome was.
Episode stored:
{
  "date": "2024-01-15",
  "task": "Helped user debug a FastAPI authentication error",
  "approach": "Checked JWT token expiry, found 1-hour limit was too short",
  "outcome": "Resolved by increasing token expiry to 24 hours",
  "user_rating": 5
}
This type of memory is useful for agents that need to learn from past experience — like a support agent that gets better over time by remembering what solutions worked.
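One way to picture "remembering what solutions worked" is a small episode log that can be filtered by rating and matched against a new task. This is a sketch under stated assumptions: the `Episode` class and `recall_episodes` function are hypothetical names, the second episode is invented purely for illustration, and plain word overlap stands in for whatever retrieval a real agent would use.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    date: str
    task: str
    approach: str
    outcome: str
    user_rating: int


def recall_episodes(episodes: list[Episode], query: str, min_rating: int = 4) -> list[Episode]:
    """Return well-rated episodes whose task shares a word with the query, best first."""
    query_words = set(query.lower().split())
    matches = [
        e for e in episodes
        if e.user_rating >= min_rating and query_words & set(e.task.lower().split())
    ]
    return sorted(matches, key=lambda e: e.user_rating, reverse=True)


episodes = [
    Episode(
        date="2024-01-15",
        task="Helped user debug a FastAPI authentication error",
        approach="Checked JWT token expiry, found 1-hour limit was too short",
        outcome="Resolved by increasing token expiry to 24 hours",
        user_rating=5,
    ),
    Episode(
        date="2024-01-20",  # hypothetical second episode, for illustration only
        task="Helped user format a CSV export",
        approach="Suggested the csv module",
        outcome="User chose a different tool",
        user_rating=2,
    ),
]

for episode in recall_episodes(episodes, "authentication error in my API"):
    print(episode.outcome)
# Prints: Resolved by increasing token expiry to 24 hours
```

Filtering on `user_rating` means the agent only reuses approaches that users rated well, so poorly rated episodes do not steer future answers.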
Type 4 — Semantic Memory (Knowledge Base)
Semantic Memory stores facts, rules, and general knowledge — not tied to any specific conversation. This is typically stored in a vector database and retrieved using semantic search.
Knowledge Base Entries:
- "The return policy allows returns within 7 days of delivery."
- "Premium members get free shipping on all orders."
- "Order cancellations are only allowed within 2 hours of placing."
Agent receives question: "Can I cancel an order I placed 3 hours ago?"
Agent searches knowledge base → Finds the cancellation policy
Agent responds: "Unfortunately, order cancellations are only allowed
within 2 hours of placing the order."
Vector Databases for Agent Memory
A vector database stores memories as mathematical vectors (embeddings). When the agent needs to recall something, it converts the query to a vector and finds the most similar memories in the database.
This is called semantic search — it finds memories that are similar in meaning, not just exact keyword matches.
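The retrieval mechanics can be sketched without any external library: turn each text into a vector, then rank stored entries by cosine similarity to the query. Note the loud assumption here: `embed` is a toy bag-of-words stand-in, not a real embedding model, so it only matches on shared words. A real embedding model is what makes the search match on meaning. The function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, entries: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k entries most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        entries,
        key=lambda e: cosine_similarity(query_vec, embed(e)),
        reverse=True,
    )
    return ranked[:top_k]


knowledge_base = [
    "The return policy allows returns within 7 days of delivery.",
    "Premium members get free shipping on all orders.",
    "Order cancellations are only allowed within 2 hours of placing.",
]

print(search("Can I cancel an order I placed 3 hours ago?", knowledge_base))
# Prints: ['Order cancellations are only allowed within 2 hours of placing.']
```

The toy version finds the right entry only because "order" and "hours" appear in both texts; it cannot tell that "cancel" and "cancellations" are related. Swapping `embed` for a real embedding model keeps the same ranking logic while closing that gap.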
Popular Vector Databases
| Database | Type | Best For |
|---|---|---|
| Chroma | Open-source, local | Quick prototyping |
| Pinecone | Managed cloud | Production apps at scale |
| FAISS | Open-source library (Meta) | Fast similarity search |
| Weaviate | Open-source + cloud | Full-featured semantic memory |
| Qdrant | Open-source + cloud | High-performance production use |
Implementing Basic External Memory
import json
import os

MEMORY_FILE = "user_memory.json"


def save_memory(user_id: str, fact: str) -> None:
    """Save a new memory for a user."""
    memories = load_all_memories()
    if user_id not in memories:
        memories[user_id] = []
    memories[user_id].append(fact)
    # Rewrite the whole file with the updated memories.
    with open(MEMORY_FILE, "w") as f:
        json.dump(memories, f, indent=2)


def load_memories(user_id: str) -> list:
    """Retrieve all memories for a specific user."""
    memories = load_all_memories()
    return memories.get(user_id, [])


def load_all_memories() -> dict:
    """Load every user's memories, or an empty dict if no file exists yet."""
    if not os.path.exists(MEMORY_FILE):
        return {}
    with open(MEMORY_FILE, "r") as f:
        return json.load(f)


# Usage
save_memory("user_001", "Prefers vegetarian food recommendations")
save_memory("user_001", "Lives in Pune, India")
memories = load_memories("user_001")
print(memories)
# On a first run (no existing file):
# ['Prefers vegetarian food recommendations', 'Lives in Pune, India']
Injecting Memory into Agent Prompts
def build_system_prompt(user_id: str) -> str:
    memories = load_memories(user_id)
    memory_text = ""
    if memories:
        memory_text = "\n\nWhat you know about this user:\n"
        for memory in memories:
            memory_text += f"- {memory}\n"
    return f"You are a helpful personal assistant.{memory_text}"


# When a session starts:
system_prompt = build_system_prompt("user_001")
# System prompt now contains the user's stored memories
Deciding What to Remember
Not everything a user says should be stored as memory. Here is a simple decision framework:
| Should Remember | Should NOT Remember |
|---|---|
| User preferences ("I prefer formal language") | Casual greetings ("Hi", "Thanks") |
| Important facts ("My budget is ₹50,000") | Questions already answered |
| User's profession or location | Temporary session-specific details |
| Past decisions and their outcomes | Repetitive, low-value interactions |
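The framework above can be approximated with a simple rule-based filter. This is only a sketch: `should_remember`, the small-talk list, and the regex patterns are all hypothetical choices made to mirror the table, and real systems often ask the LLM itself to decide what is worth storing.

```python
import re

# Hypothetical lists, chosen only to mirror the table above.
SMALL_TALK = {"hi", "hello", "hey", "thanks", "thank you", "ok", "bye"}
REMEMBER_PATTERNS = [
    r"\bi prefer\b",                            # user preferences
    r"\bmy (name|budget|job|profession) is\b",  # important facts
    r"\bi (live|work) (in|as|at)\b",            # profession or location
    r"\bi decided\b",                           # past decisions
]


def should_remember(message: str) -> bool:
    """Return True if the message looks like a durable fact worth storing."""
    text = message.strip().lower().rstrip("!.")
    if text in SMALL_TALK:
        return False  # casual greetings
    if text.endswith("?"):
        return False  # questions are answered, not stored
    return any(re.search(pattern, text) for pattern in REMEMBER_PATTERNS)


for message in ["Hi", "I prefer formal language", "My budget is ₹50,000", "What is my name?"]:
    print(message, "->", should_remember(message))
```

Messages that pass the filter could then be handed to a function like `save_memory`. The default is to store nothing, which keeps the memory file small and avoids persisting details the user did not clearly state as lasting facts.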
Memory Summary: All Four Types
| Memory Type | Persists Across Sessions | Storage | Use Case |
|---|---|---|---|
| In-Context (Short-Term) | No | LLM context window | Current conversation flow |
| External (Long-Term) | Yes | Database / File | User preferences, history |
| Episodic | Yes | Database | Learning from past events |
| Semantic | Yes | Vector Database | Knowledge base, fact retrieval |
Summary
Memory is what transforms an AI Agent from a one-time responder into a persistent, intelligent assistant. In-context memory handles current conversations, external memory preserves important facts across sessions, episodic memory records experience, and semantic memory provides a searchable knowledge base. Building the right memory system for the right use case is a key skill in agent development — and this course covers all of it in detail.
