Understanding Large Language Models (LLMs)
An AI Agent needs a brain — and that brain is a Large Language Model (LLM). LLMs are the core technology behind modern AI Agents. Without understanding what an LLM is and how it works, building effective agents would be like driving a car without knowing how the engine works.
What is a Large Language Model?
A Large Language Model is an AI system that has been trained on massive amounts of text data — books, websites, articles, code, and more — to understand and generate human language.
The word "large" refers to two things:
- Large data: Trained on hundreds of billions of words from the internet and books
- Large parameters: Contains billions of internal numerical values that store "learned knowledge"
Simple Analogy
Think of an LLM like a student who has read millions of books and can now answer questions, write essays, summarise documents, write code, and hold conversations — all because of everything it read during training.
How Does an LLM Work? (Simplified)
An LLM works by predicting the most likely next word (or token) given everything that has been written so far. It does this billions of times, very quickly, to generate a full response.
Step-by-Step: How an LLM Generates Text
```
Input (Prompt): "The capital of France is"

LLM thinks:  What token most likely comes after "The capital of France is"?
             Based on training data: "Paris" (very high probability)

Output: "Paris"
```
This is repeated token by token until a complete, coherent response is built:
"The capital of France is Paris, one of the most visited cities in the world, known for the Eiffel Tower and the Louvre Museum."
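This token-by-token loop can be sketched with a toy model. The probability table below is invented purely for illustration — a real LLM computes these probabilities from billions of learned parameters:

```python
# Toy next-token table: maps a context to candidate next tokens with
# probabilities. These numbers are made up for the demo — a real LLM
# derives them from its parameters at inference time.
NEXT_TOKEN_PROBS = {
    "The capital of France is": [("Paris", 0.95), ("Lyon", 0.05)],
    "The capital of France is Paris": [(",", 0.90), (".", 0.10)],
    "The capital of France is Paris,": [("one", 1.00)],
}

def generate(prompt: str, max_tokens: int = 3) -> str:
    """Greedy decoding: repeatedly append the most likely next token."""
    text = prompt
    for _ in range(max_tokens):
        candidates = NEXT_TOKEN_PROBS.get(text)
        if not candidates:
            break  # the toy table has no continuation for this context
        best_token, _ = max(candidates, key=lambda pair: pair[1])
        separator = "" if best_token in {",", "."} else " "
        text += separator + best_token
    return text

print(generate("The capital of France is"))
# "The capital of France is Paris, one"
```

Real models usually sample from the probability distribution rather than always taking the top token — that is what the temperature parameter (covered later) controls.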
Key LLMs Used in AI Agents
| LLM Name | Created By | Popular Use |
|---|---|---|
| GPT-4 / GPT-4o | OpenAI | Most widely used in agent development |
| Claude 3 | Anthropic | Strong reasoning, long context |
| Gemini 1.5 | Google DeepMind | Multimodal (text + image + video) |
| Llama 3 | Meta | Open-source, can run locally |
| Mistral | Mistral AI | Fast, lightweight, open-source |
Tokens — The Language of LLMs
LLMs do not read words — they read tokens. A token is a small chunk of text, usually a word or part of a word.
Token Examples
| Text | Approximate Tokens |
|---|---|
| "Hello" | 1 token |
| "Hello world" | 2 tokens |
| "Artificial Intelligence" | 3 tokens |
| 1 page of text (500 words) | ≈ 670 tokens |
| 1 book (80,000 words) | ≈ 107,000 tokens |
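As a rough sketch, these estimates follow a common rule of thumb of about 1.33 tokens per English word (equivalently, 1 token ≈ ¾ of a word). Exact counts require a real tokenizer such as OpenAI's tiktoken; this tiny estimator only captures the rule of thumb:

```python
def estimate_tokens(word_count: int) -> int:
    """Rule-of-thumb estimate: roughly 4 tokens per 3 English words."""
    return round(word_count * 4 / 3)

print(estimate_tokens(500))     # ≈ 667 tokens for one page
print(estimate_tokens(80_000))  # ≈ 107,000 tokens for one book
```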
This matters because LLMs have a context window — the maximum number of tokens they can process at once. GPT-4 Turbo and GPT-4o support up to 128,000 tokens; Claude 3 supports up to 200,000 tokens.
The Context Window
The context window is like an LLM's working memory — everything it can "see" and use when generating a response. This includes:
- The system instructions (what role the AI plays)
- The entire conversation history
- Any tool results fed back to the LLM
- Documents or data provided as input
When building AI Agents, managing the context window carefully is crucial — running out of context means the agent loses earlier parts of the conversation.
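One simple management strategy can be sketched as follows: keep the system message and drop the oldest conversation turns until everything fits a token budget. This is a minimal sketch — the word-based token estimate is a stand-in for a real tokenizer count, and production agents often summarise old turns instead of discarding them:

```python
# Minimal sketch of context-window management: keep the system message and
# drop the oldest turns until the whole history fits a token budget.

def estimate_tokens(text: str) -> int:
    return round(len(text.split()) * 4 / 3)  # rough: ~1.33 tokens per word

def trim_history(messages: list, budget: int) -> list:
    system, turns = messages[0], list(messages[1:])
    def total(msgs):
        return sum(estimate_tokens(m["content"]) for m in msgs)
    while turns and total([system, *turns]) > budget:
        turns.pop(0)  # discard the oldest turn first
    return [system, *turns]

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "first question " * 40},  # a long old turn
    {"role": "user", "content": "What is the capital of France?"},
]
trimmed = trim_history(history, budget=60)
print(len(trimmed))  # the long old turn was dropped
```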
What Can an LLM Do?
LLMs are remarkably capable and serve as the core reasoning engine of agents. They can:
| Capability | Example |
|---|---|
| Understand natural language | Interpret ambiguous, complex questions |
| Reason step-by-step | Solve a maths problem by thinking aloud |
| Write and fix code | Generate Python code from a description |
| Summarise long content | Condense a 50-page PDF into 5 bullet points |
| Translate languages | Convert English to Hindi, French, etc. |
| Decide which tool to call | Choose between search, calculator, or database |
| Format structured output | Return a JSON object with specific fields |
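The last capability — structured output — is worth a small sketch, because agents depend on it to pass data between steps. The reply string below is hand-written for the demo rather than produced by a model:

```python
import json

# Sketch: validate that an LLM's "structured output" reply is real JSON
# and contains the fields we asked for.
llm_reply = '{"city": "Paris", "country": "France", "landmark": "Eiffel Tower"}'

def parse_structured(reply: str, required_fields: set) -> dict:
    data = json.loads(reply)  # raises ValueError if the model returned invalid JSON
    missing = required_fields - data.keys()
    if missing:
        raise ValueError(f"reply missing fields: {sorted(missing)}")
    return data

record = parse_structured(llm_reply, {"city", "country"})
print(record["city"])  # Paris
```

Validating the reply like this matters because an LLM can occasionally return malformed JSON or omit a field — agent code should fail loudly (or retry) rather than silently pass bad data downstream.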
What LLMs Cannot Do (by Themselves)
Understanding the limitations of LLMs helps explain why agents are built around them with additional tools:
- Cannot access the internet — LLMs only know what was in their training data
- Training cutoff — Knowledge stops at a fixed date; anything published after training is unknown to the model
- Cannot run code — Unless given a code execution tool
- Cannot remember past conversations — Every new conversation starts fresh (unless memory is added)
- Sometimes "hallucinate" — Can generate plausible-sounding but incorrect information
This is exactly why AI Agents add tools, memory, and external data sources on top of the LLM.
How AI Agents Use LLMs
In an AI Agent, the LLM is called multiple times during a single task. Each time, it is given a prompt that includes:
1. A system message: "You are a helpful assistant with access to web search."
2. The conversation history: User: "What's the latest news about AI in India?"
3. Available tools: `web_search(query)`, `summarise_text(text)`
4. Instructions on how to respond: "Think step by step. If you need information, call a tool first."
The LLM then responds with either a tool call or a final answer — and the agent framework handles the rest.
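That loop can be sketched as follows. Everything here is a stub invented for illustration — `call_llm` stands in for a real chat-completion call, and the tool and message formats are simplified, not a real framework's API:

```python
# Sketch of an agent loop: call the LLM, run any tool it requests, feed the
# result back, and repeat until the LLM returns a final answer.

def web_search(query: str) -> str:
    return f"Top result for '{query}'"  # stub tool

TOOLS = {"web_search": web_search}

def call_llm(messages: list) -> dict:
    # Stub: pretend the model asks for a search first, then answers.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "web_search", "args": {"query": "AI news India"}}
    return {"answer": "Here is the latest AI news from India."}

def run_agent(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = call_llm(messages)
        if "answer" in reply:
            return reply["answer"]  # final answer: the loop ends
        result = TOOLS[reply["tool"]](**reply["args"])  # run the requested tool
        messages.append({"role": "tool", "content": result})

print(run_agent("What's the latest news about AI in India?"))
```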
LLM Parameters That Matter for Agents
Temperature
Controls how creative or predictable the LLM's responses are.
| Temperature | Behaviour | Best For |
|---|---|---|
| 0.0 | Very deterministic, same answer every time | Data extraction, code generation |
| 0.5 | Balanced — thoughtful but slightly varied | General agent reasoning |
| 1.0 | More creative, varied responses | Creative writing, brainstorming |
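Under the hood, temperature divides the model's raw next-token scores (logits) before the softmax: lower values sharpen the distribution toward the top token, higher values flatten it. A small sketch with invented logits shows the effect:

```python
import math

def softmax_with_temperature(logits: list, temperature: float) -> list:
    """Scale logits by 1/temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # invented scores, e.g. for "Paris", "Lyon", "Nice"
cold = softmax_with_temperature(logits, 0.2)
hot = softmax_with_temperature(logits, 1.5)
print(round(cold[0], 3))  # near 1.0: almost deterministic
print(round(hot[0], 3))   # much flatter: other tokens get real probability
```

In practice a temperature of exactly 0 is usually implemented as greedy decoding (always pick the top token) rather than a literal division by zero.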
Max Tokens
The maximum length of the LLM's response. For agents that need to explain their reasoning and call tools, setting this high enough is important.
Model Choice
Different models have different strengths. For agents that need heavy reasoning and tool use, GPT-4o or Claude 3.5 Sonnet are currently the top choices.
Calling an LLM in Python (Basic Example)
```python
import openai

client = openai.OpenAI(api_key="your-api-key")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is machine learning?"},
    ],
    temperature=0.3,
    max_tokens=500,
)

print(response.choices[0].message.content)
```
This simple call is the foundation of every AI Agent — the agent just calls this function repeatedly with updated context until the task is done.
Summary
A Large Language Model is the reasoning brain of an AI Agent. It processes text as tokens, generates responses by predicting the most likely next tokens, and can understand language, reason, write code, and decide which tools to use. While powerful, LLMs have limitations — they have no internet access, no real-time knowledge, and no memory — which is why agents extend them with tools, memory systems, and external data. Understanding LLMs is the foundation for building intelligent agents.
