Memory in AI Agents

By default, an LLM starts every new conversation from scratch: it retains nothing from past sessions. For a simple question-answer tool, this is fine. But an AI Agent that needs to learn about a user over time, track ongoing tasks, or recall past interactions cannot work without memory.

Memory lets an agent carry information forward from earlier turns and sessions, which makes it feel persistent and context-aware.

Why Memory Matters

Consider this conversation without memory:

Session 1:
  User:  "My name is Arjun and I prefer responses in Hindi."
  Agent: "Sure Arjun, I'll respond in Hindi from now on."

Session 2 (next day):
  User:  "Can you summarise yesterday's meeting notes?"
  Agent: "Sure! Who are you and what notes would you like me to summarise?"

The agent forgot everything. Now with memory:

Session 2 (next day):
  User:  "Can you summarise yesterday's meeting notes?"
  Agent: "Welcome back, Arjun! यहाँ कल की मीटिंग का सारांश है..."

Memory makes the agent genuinely useful over time.

The Four Types of Agent Memory

Type 1 — In-Context Memory (Short-Term)

In-Context Memory is the simplest form. It is the entire conversation history stored within the LLM's context window during a single session.

Every message (user + agent) is included in the prompt sent to the LLM. This gives the agent "short-term memory" for the current conversation.

messages = [
    {"role": "system",    "content": "You are a helpful assistant."},
    {"role": "user",      "content": "My name is Priya."},
    {"role": "assistant", "content": "Nice to meet you, Priya!"},
    {"role": "user",      "content": "What is my name?"},    # ← New message
]
# LLM can see entire history → answers "Your name is Priya"

Strengths and Limitations

| Strengths | Limitations |
| --- | --- |
| Simple to implement | Lost when the session ends |
| Works automatically | Limited by context window size |
| No storage needed | Very long conversations use more tokens (higher cost) |
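
The context-window limit is usually handled by trimming or summarising older turns. The sketch below shows one minimal way to do the trimming. It assumes a hypothetical `trim_history` helper and a crude character budget (`max_chars`) in place of a real token count; neither comes from a specific library.

```python
def trim_history(messages: list[dict], max_chars: int = 2000) -> list[dict]:
    """Keep the system message(s) plus the most recent turns that fit the budget.

    max_chars is a rough stand-in for a token budget (an assumption of this
    sketch); a real agent would count tokens with the model's tokenizer.
    """
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    budget = max_chars - sum(len(m["content"]) for m in system)
    kept: list[dict] = []
    # Walk backwards so the newest turns are kept first.
    for message in reversed(turns):
        cost = len(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost

    # Restore chronological order before returning.
    return system + list(reversed(kept))


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "My name is Priya."},
    {"role": "assistant", "content": "Nice to meet you, Priya!"},
    {"role": "user", "content": "What is my name?"},
]

trimmed = trim_history(messages, max_chars=70)
print([m["content"] for m in trimmed])
```

With this small budget the oldest user turn is dropped, which is exactly how in-context memory "forgets" in long conversations.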

Type 2 — External Memory (Long-Term)

External Memory stores information outside the LLM — in a database, file system, or vector store. This memory persists across sessions.

When a new session starts, relevant memories are retrieved from the database and injected into the prompt, so the agent "remembers" past context.

Architecture:
  [User says something important]
       ↓
  [Agent saves it to memory database]
       ↓
  [Next session starts]
       ↓
  [Agent retrieves relevant memories]
       ↓
  [Memories injected into system prompt]
       ↓
  [Agent responds with full context]

Example

Memory Database (stored between sessions):
{
  "user_id": "arjun_001",
  "memories": [
    "Prefers responses in Hindi",
    "Works as a software engineer",
    "Interested in machine learning",
    "Asked about Python courses on 14 Jan"
  ]
}

New session system prompt:
"You are a helpful assistant. Here is what you know about this user:
 - Prefers responses in Hindi
 - Software engineer interested in machine learning"

Strengths and Limitations

| Strengths | Limitations |
| --- | --- |
| Persists across sessions | Requires extra storage infrastructure |
| Scales to large amounts of memory | Needs logic to decide what to store |
| Can store user preferences and history | Retrieving the right memory at the right time is complex |

Type 3 — Episodic Memory

Episodic Memory stores specific events or experiences — like a diary. The agent records what happened, when it happened, and what the outcome was.

Episode stored:
{
  "date": "2024-01-15",
  "task": "Helped user debug a FastAPI authentication error",
  "approach": "Checked JWT token expiry, found 1-hour limit was too short",
  "outcome": "Resolved by increasing token expiry to 24 hours",
  "user_rating": 5
}

This type of memory is useful for agents that need to learn from past experience — like a support agent that gets better over time by remembering what solutions worked.
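
As a rough illustration, the sketch below stores episodes like the one above and recalls the most relevant one for a new task. The `Episode` class and `recall_similar` function are hypothetical names, the second episode is made-up sample data, and the word-overlap ranking is a simplification; a production agent would more likely compare embeddings.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    date: str  # ISO date, e.g. "2024-01-15"
    task: str
    approach: str
    outcome: str
    user_rating: int


def recall_similar(episodes: list[Episode], query: str, top_k: int = 1) -> list[Episode]:
    """Return past episodes whose task shares words with the query.

    Ranked by word overlap first, then by user rating.
    """
    query_words = set(query.lower().split())

    def score(episode: Episode) -> tuple[int, int]:
        overlap = len(query_words & set(episode.task.lower().split()))
        return (overlap, episode.user_rating)

    relevant = [e for e in episodes if score(e)[0] > 0]
    return sorted(relevant, key=score, reverse=True)[:top_k]


episodes = [
    Episode(
        date="2024-01-15",
        task="Helped user debug a FastAPI authentication error",
        approach="Checked JWT token expiry, found 1-hour limit was too short",
        outcome="Resolved by increasing token expiry to 24 hours",
        user_rating=5,
    ),
    # Hypothetical second episode, added only so there is something to rank against.
    Episode(
        date="2024-01-20",
        task="Explained list comprehensions in Python",
        approach="Walked through three short examples",
        outcome="User wrote their own comprehension",
        user_rating=4,
    ),
]

matches = recall_similar(episodes, "FastAPI authentication error after login")
print(matches[0].outcome)
# Resolved by increasing token expiry to 24 hours
```

The recalled `approach` and `outcome` can then be added to the prompt so the agent tries the solution that worked before.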

Type 4 — Semantic Memory (Knowledge Base)

Semantic Memory stores facts, rules, and general knowledge — not tied to any specific conversation. This is typically stored in a vector database and retrieved using semantic search.

Knowledge Base Entries:
  - "The return policy allows returns within 7 days of delivery."
  - "Premium members get free shipping on all orders."
  - "Order cancellations are only allowed within 2 hours of placing."

Agent receives question: "Can I cancel an order I placed 3 hours ago?"
Agent searches knowledge base → Finds the cancellation policy
Agent responds: "Unfortunately, order cancellations are only allowed 
                 within 2 hours of placing the order."

Vector Databases for Agent Memory

A vector database stores memories as mathematical vectors (embeddings). When the agent needs to recall something, it converts the query to a vector and finds the most similar memories in the database.

This is called semantic search — it finds memories that are similar in meaning, not just exact keyword matches.
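
The retrieval step can be sketched without any vector database. In the example below, `embed` is a toy stand-in that builds a bag-of-words count vector, so it only matches shared words; a real embedding model would also match paraphrases. All function names here are assumptions for illustration. The ranking by cosine similarity is the part that carries over to real systems.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, memories: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k memories most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        memories,
        key=lambda m: cosine_similarity(query_vec, embed(m)),
        reverse=True,
    )
    return ranked[:top_k]


knowledge_base = [
    "The return policy allows returns within 7 days of delivery.",
    "Premium members get free shipping on all orders.",
    "Order cancellations are only allowed within 2 hours of placing.",
]

result = search("Can I cancel an order I placed 3 hours ago?", knowledge_base)
print(result)
# ['Order cancellations are only allowed within 2 hours of placing.']
```

A vector database does the same ranking, but with model-generated embeddings and an index that avoids comparing the query against every stored vector.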

Popular Vector Databases

| Database | Type | Best For |
| --- | --- | --- |
| Chroma | Open-source, local | Quick prototyping |
| Pinecone | Managed cloud | Production apps at scale |
| FAISS | Open-source library (Meta) | Fast similarity search |
| Weaviate | Open-source + cloud | Full-featured semantic memory |
| Qdrant | Open-source + cloud | High-performance production use |

Implementing Basic External Memory

import json
import os

MEMORY_FILE = "user_memory.json"

def load_all_memories() -> dict[str, list[str]]:
    """Read every user's memories from disk (empty dict if no file exists yet)."""
    if not os.path.exists(MEMORY_FILE):
        return {}
    with open(MEMORY_FILE, "r", encoding="utf-8") as f:
        return json.load(f)

def save_memory(user_id: str, fact: str) -> None:
    """Save a new memory for a user, skipping exact duplicates."""
    memories = load_all_memories()
    facts = memories.setdefault(user_id, [])
    if fact not in facts:
        facts.append(fact)
    # ensure_ascii=False keeps non-ASCII text (e.g. Hindi) readable in the file
    with open(MEMORY_FILE, "w", encoding="utf-8") as f:
        json.dump(memories, f, indent=2, ensure_ascii=False)

def load_memories(user_id: str) -> list[str]:
    """Retrieve all memories for a specific user."""
    return load_all_memories().get(user_id, [])

# Usage
save_memory("user_001", "Prefers vegetarian food recommendations")
save_memory("user_001", "Lives in Pune, India")

memories = load_memories("user_001")
print(memories)
# ['Prefers vegetarian food recommendations', 'Lives in Pune, India']
# (assuming user_memory.json held no earlier facts for user_001)
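
The JSON file is rewritten in full on every save, which is fine for a demo but slow and fragile as memory grows. As one possible alternative, the sketch below keeps the same save/load idea in SQLite using Python's built-in `sqlite3` module. The table layout and the `*_db` function names are assumptions made for this example.

```python
import sqlite3


def open_memory_db(path: str = "user_memory.db") -> sqlite3.Connection:
    """Open (or create) the memory database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS memories (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT NOT NULL,
            fact TEXT NOT NULL
        )
        """
    )
    return conn


def save_memory_db(conn: sqlite3.Connection, user_id: str, fact: str) -> None:
    """Insert one fact; only the new row is written, not the whole store."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO memories (user_id, fact) VALUES (?, ?)",
            (user_id, fact),
        )


def load_memories_db(conn: sqlite3.Connection, user_id: str) -> list[str]:
    """Return a user's facts in the order they were saved."""
    rows = conn.execute(
        "SELECT fact FROM memories WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    return [fact for (fact,) in rows]


# ":memory:" keeps this demo off disk; pass a file path to persist across sessions.
conn = open_memory_db(":memory:")
save_memory_db(conn, "user_001", "Prefers vegetarian food recommendations")
save_memory_db(conn, "user_001", "Lives in Pune, India")
stored = load_memories_db(conn, "user_001")
print(stored)
conn.close()
```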

Injecting Memory into Agent Prompts

def build_system_prompt(user_id: str) -> str:
    memories = load_memories(user_id)

    memory_text = ""
    if memories:
        memory_text = "\n\nWhat you know about this user:\n"
        for memory in memories:
            memory_text += f"- {memory}\n"

    return f"You are a helpful personal assistant.{memory_text}"

# When a session starts:
system_prompt = build_system_prompt("user_001")
# System prompt now contains the user's stored memories

Deciding What to Remember

Not everything a user says should be stored as memory. Here is a simple decision framework:

| Should Remember | Should NOT Remember |
| --- | --- |
| User preferences ("I prefer formal language") | Casual greetings ("Hi", "Thanks") |
| Important facts ("My budget is ₹50,000") | Questions already answered |
| User's profession or location | Temporary session-specific details |
| Past decisions and their outcomes | Repetitive, low-value interactions |
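
The framework above can be approximated with a simple filter that runs before anything is saved. The sketch below is a rule-of-thumb version only: the `should_remember` function, the small-talk list, and the regular-expression patterns are all hypothetical. Many agents instead ask the LLM itself to decide whether a message contains a durable fact.

```python
import re

# Hypothetical patterns for this sketch; tune or replace them for a real agent.
SMALL_TALK = {"hi", "hello", "hey", "thanks", "thank you", "ok", "okay", "bye"}
DURABLE_FACT_PATTERNS = [
    r"\bi prefer\b",
    r"\bmy (name|budget|job|profession|city|location) is\b",
    r"\bi (work|live) (as|in|at)\b",
    r"\bi (decided|chose) to\b",
]


def should_remember(message: str) -> bool:
    """Return True if the message looks like a durable fact worth storing."""
    text = message.strip().lower()
    # Greetings and acknowledgements carry no long-term value.
    if text.rstrip("!.") in SMALL_TALK:
        return False
    return any(re.search(pattern, text) for pattern in DURABLE_FACT_PATTERNS)


examples = [
    "Hi",
    "Thanks!",
    "I prefer formal language",
    "My budget is ₹50,000",
    "What time is it?",
]
for msg in examples:
    print(f"{msg!r}: {should_remember(msg)}")
```

Only the preference and the budget statement pass the filter; greetings and one-off questions are skipped.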

Memory Summary: All Four Types

| Memory Type | Persists Across Sessions | Storage | Use Case |
| --- | --- | --- | --- |
| In-Context (Short-Term) | No | LLM context window | Current conversation flow |
| External (Long-Term) | Yes | Database / File | User preferences, history |
| Episodic | Yes | Database | Learning from past events |
| Semantic | Yes | Vector Database | Knowledge base, fact retrieval |

Summary

Memory is what transforms an AI Agent from a one-time responder into a persistent, intelligent assistant. In-context memory handles current conversations, external memory preserves important facts across sessions, episodic memory records experience, and semantic memory provides a searchable knowledge base. Building the right memory system for the right use case is a key skill in agent development — and this course covers all of it in detail.
