AI Agents and Autonomous Systems
A basic LLM answers one question at a time. An AI agent uses an LLM as its brain and connects it to tools, memory, and a planning loop — allowing it to break down complex goals, take actions, observe results, and keep working until the task is complete. Agents represent the frontier of practical generative AI.
What Is an AI Agent?
An AI agent is a system that perceives its environment, decides what action to take, executes that action using tools, and repeats the cycle until it reaches a goal. Unlike a standard prompt-response interaction, an agent can take many steps, use many tools, and self-correct when things go wrong.
Simple LLM Interaction:
────────────────────────────────────────────────
Human: "Research the latest iPhone specs."
LLM: "I don't have real-time internet access..."
────────────────────────────────────────────────
AI Agent with Web Search Tool:
────────────────────────────────────────────────
Human: "Research the latest iPhone specs."
Agent thinks: "I need to search the web."
Agent uses: [web_search("latest iPhone specs 2025")]
Agent reads: Search results returned
Agent writes: "The iPhone 16 Pro features a 48MP camera, A18 Pro chip..."
────────────────────────────────────────────────
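The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not any particular framework's API: the `web_search` tool returns canned results instead of hitting a real search service, and the agent's "decision" to search is hard-coded where a real agent would ask an LLM.

```python
# Minimal sketch of the agent pattern above (hypothetical tool and policy).

def web_search(query: str) -> str:
    # Stand-in tool: returns canned text instead of live search results.
    return f"Search results for '{query}': iPhone 16 Pro, 48MP camera, A18 Pro chip"

def run_agent(task: str) -> str:
    # Think: the agent decides it needs external information.
    thought = f"I need to search the web to answer: {task}"
    # Act + observe: call the tool and capture the result.
    observation = web_search(task)
    # Write: produce an answer grounded in the observation.
    return f"Based on search results: {observation}"

answer = run_agent("latest iPhone specs 2025")
print(answer)
```

The key difference from a plain LLM call is the middle step: the agent can fetch fresh information before answering.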
The Four Core Components of an AI Agent
| Component | Role | Analogy |
|---|---|---|
| LLM (Brain) | Reasons, plans, and decides what to do | The thinking mind |
| Tools | Actions the agent can take (search, code, write, call API) | Hands and instruments |
| Memory | Stores context, past steps, and observations | Notepad and long-term memory |
| Planning Loop | The cycle of: think, act, observe, repeat | The work process |
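The four components map naturally onto a small data structure. The sketch below is an assumption-laden toy: the `llm` callable is a stub that a real agent would replace with an API call, and the "search" tool is invented for illustration.

```python
# Sketch mapping the four components in the table above onto code.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    llm: Callable[[str], str]               # Brain: decides the next step
    tools: dict[str, Callable[[str], str]]  # Hands: named actions
    memory: list[str] = field(default_factory=list)  # Notepad: past observations

    def step(self, prompt: str) -> str:
        # One turn of the planning loop: think, act, observe, remember.
        decision = self.llm(prompt)                  # think
        tool_name, _, arg = decision.partition(":")  # e.g. "search:Apple CEO"
        observation = self.tools[tool_name](arg)     # act
        self.memory.append(observation)              # remember
        return observation

# Stub LLM that always chooses the hypothetical search tool.
agent = Agent(
    llm=lambda p: f"search:{p}",
    tools={"search": lambda q: f"results for {q}"},
)
result = agent.step("Apple CEO")
print(result)  # -> results for Apple CEO
```

Everything an agent framework adds — retries, tool schemas, streaming — is elaboration on this loop body.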
The ReAct Loop — How Agents Think and Act
The most widely used agent pattern is called ReAct (Reasoning + Acting). The agent alternates between reasoning about the situation and taking an action.
Task: "Find the current CEO of Apple and write a one-paragraph bio."
THOUGHT 1: "I need to find the current CEO of Apple."
ACTION 1: web_search("Apple CEO 2025")
OBSERVATION 1: "Tim Cook has been CEO of Apple since 2011..."
THOUGHT 2: "Now I have the name. I need more biographical detail."
ACTION 2: web_search("Tim Cook biography early life career")
OBSERVATION 2: "Tim Cook was born in Robertsdale, Alabama in 1960..."
THOUGHT 3: "I have enough information to write the bio."
ACTION 3: write_text("Tim Cook is the CEO of Apple Inc...")
FINAL ANSWER: [One-paragraph bio of Tim Cook]
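The trace above can be replayed as an explicit loop. In this sketch the thoughts and actions are scripted (they stand in for LLM output) and `web_search` returns canned snippets; the point is the alternating thought → action → observation structure, not the content.

```python
# Sketch of the ReAct trace above as an explicit loop (scripted steps,
# canned tool results).

def web_search(query: str) -> str:
    canned = {
        "Apple CEO 2025": "Tim Cook has been CEO of Apple since 2011...",
        "Tim Cook biography early life career":
            "Tim Cook was born in Robertsdale, Alabama in 1960...",
    }
    return canned.get(query, "no results")

steps = [
    ("I need to find the current CEO of Apple.",
     "web_search", "Apple CEO 2025"),
    ("Now I have the name. I need more biographical detail.",
     "web_search", "Tim Cook biography early life career"),
]

trace = []
for thought, action, arg in steps:
    observation = web_search(arg)  # act, then observe
    trace.append({"thought": thought,
                  "action": f"{action}({arg!r})",
                  "observation": observation})

final_answer = "Tim Cook is the CEO of Apple Inc..."  # written from the trace
for step in trace:
    print(step["thought"], "->", step["observation"])
```

In a real ReAct agent, each thought and action would be generated by the LLM conditioned on the full trace so far, which is what lets it self-correct mid-task.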
Common Tools Given to AI Agents
| Tool | What It Does |
|---|---|
| Web search | Searches the internet for current information |
| Code interpreter | Writes and runs Python code, returns output |
| File reader/writer | Opens, reads, and writes files on disk |
| Database query | Queries SQL or NoSQL databases |
| API caller | Makes HTTP requests to external services |
| Email and calendar | Reads and sends emails, books meetings |
| Browser automation | Navigates websites, fills forms, clicks buttons |
| Vector search | Retrieves relevant documents from a knowledge base |
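Tools like those in the table are typically exposed to the LLM as named entries with a description and a parameter schema that the model fills in. The registry format below is a generic JSON-style sketch, not the schema of any specific provider, and both tools are toy stand-ins.

```python
# Sketch of a tool registry and dispatcher (generic format, toy tools).

import json

def code_interpreter(code: str) -> str:
    # Toy stand-in: evaluates a single arithmetic expression.
    return str(eval(code, {"__builtins__": {}}, {}))

TOOLS = {
    "web_search": {
        "description": "Searches the internet for current information",
        "parameters": {"query": "string"},
        "fn": lambda query: f"results for {query}",
    },
    "code_interpreter": {
        "description": "Runs a Python expression and returns the output",
        "parameters": {"code": "string"},
        "fn": code_interpreter,
    },
}

def dispatch(call: str) -> str:
    # `call` is a JSON string like the one an LLM emits in a tool-use turn.
    request = json.loads(call)
    tool = TOOLS[request["name"]]
    return tool["fn"](**request["arguments"])

result = dispatch('{"name": "code_interpreter", "arguments": {"code": "2 + 3"}}')
print(result)  # -> 5
```

The descriptions matter as much as the code: they are what the LLM reads when deciding which tool fits the current step.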
Types of Agent Memory
| Memory Type | What It Stores | Duration |
|---|---|---|
| In-context | Current task steps and observations | Current session only |
| External database | Past conversations, user preferences, facts | Persistent across sessions |
| Episodic | Record of past agent actions and outcomes | Long-term, retrievable |
| Semantic (RAG) | General knowledge via vector store | Persistent, searchable |
Multi-Agent Systems
Complex tasks can be split across multiple specialized agents, each handling one part of the workflow and passing its results to the next.
Task: "Produce a competitive analysis report on three companies."
ORCHESTRATOR AGENT: Plans workflow, assigns tasks
|
|--- RESEARCH AGENT A: Collects data on Company 1
|--- RESEARCH AGENT B: Collects data on Company 2
|--- RESEARCH AGENT C: Collects data on Company 3
|
v
SYNTHESIS AGENT: Combines all research
|
v
WRITER AGENT: Produces the final report
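The fan-out / fan-in workflow above can be sketched with each agent as a plain function. These are stubs standing in for LLM-backed agents; a real orchestrator would also run the research steps in parallel and handle failures.

```python
# Sketch of the orchestrator workflow above (stub agents, sequential run).

def research_agent(company: str) -> str:
    return f"data on {company}"          # stand-in for web research

def synthesis_agent(findings: list[str]) -> str:
    return "; ".join(findings)           # combine all research

def writer_agent(summary: str) -> str:
    return f"Report: {summary}"          # produce the final report

def orchestrator(companies: list[str]) -> str:
    findings = [research_agent(c) for c in companies]  # fan out
    summary = synthesis_agent(findings)                # fan in
    return writer_agent(summary)

report = orchestrator(["Company 1", "Company 2", "Company 3"])
print(report)
```

Splitting the work this way keeps each agent's prompt and tool set narrow, which tends to be more reliable than one agent juggling every role.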
Popular Agent Frameworks
- LangChain Agents: Flexible tool-using agents with ReAct loop support
- LangGraph: Graph-based agent workflows with stateful, looping architectures
- AutoGen (Microsoft): Multi-agent conversation framework for complex tasks
- CrewAI: Role-based multi-agent system with collaborative task assignment
- OpenAI Assistants API: Managed agent runtime with built-in tools
- Anthropic Claude tool use: Native function-calling for building custom agents
Agentic Challenges
| Challenge | Description |
|---|---|
| Hallucinated tool calls | Agent invents tool names or arguments that do not exist or are invalid |
| Infinite loops | Agent repeats the same action without making progress |
| Error cascades | A mistake in step 2 causes all following steps to fail |
| Cost accumulation | Many LLM calls across long tasks become expensive |
| Safety and authorization | Agent may take unintended actions if not bounded properly |
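Two of these failure modes have simple mechanical guards: a hard step budget caps cost accumulation, and repeated-action detection breaks infinite loops. The sketch below assumes a `pick_action` policy callable standing in for the LLM's next-step decision.

```python
# Sketch of two runtime guards: a step budget and a repeated-action check.

def run_with_guards(pick_action, max_steps: int = 10) -> str:
    seen_actions: list[str] = []
    for step in range(max_steps):
        action = pick_action(step)
        if action == "DONE":
            return "completed"
        # Loop guard: abort on the same action three times in a row.
        if seen_actions[-2:] == [action, action]:
            return "aborted: repeated action"
        seen_actions.append(action)
    # Budget guard: never exceed max_steps LLM calls.
    return "aborted: step budget exhausted"

# A stuck policy that keeps issuing the same search forever.
outcome = run_with_guards(lambda step: "web_search('same query')")
print(outcome)  # -> aborted: repeated action
```

Guards like these do not prevent mistakes, but they bound their cost and convert silent spinning into an explicit, inspectable failure.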
Human-in-the-Loop Design
For high-stakes tasks — such as sending emails, deleting files, or making purchases — agents pause and request human approval before executing irreversible actions. This design pattern keeps humans in control of consequential decisions while the agent handles the research and preparation work automatically.
Agent reaches a sensitive action:
Agent: "I am about to send this email to 500 customers.
Please review and approve before I proceed."
Human: Approves or edits
Agent: Continues with confirmed action
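The approval pattern above amounts to an execution gate in front of sensitive actions. In this sketch the `SENSITIVE` allowlist and the `approve` callback are hypothetical; a real deployment would wire `approve` to a review UI or a paused workflow step.

```python
# Sketch of a human-approval gate for irreversible actions
# (hypothetical action names and reviewer callback).

SENSITIVE = {"send_email", "delete_file", "make_purchase"}

def execute(action: str, payload: str, approve) -> str:
    if action in SENSITIVE:
        # Pause and ask a human before doing anything irreversible.
        if not approve(f"About to {action}: {payload}. Proceed?"):
            return "cancelled by human"
    return f"executed {action}"

# Auto-rejecting reviewer for demonstration; a real one would prompt a person.
result = execute("send_email", "campaign to 500 customers", lambda msg: False)
print(result)  # -> cancelled by human
```

Non-sensitive actions pass straight through, so the agent stays autonomous for research and preparation while humans retain the final say on consequential steps.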
Real-World Agent Applications
| Application | What the Agent Does |
|---|---|
| Software development | Reads codebase, writes new features, runs tests, fixes failures |
| Research assistant | Searches web, reads papers, synthesizes findings into a report |
| Data analysis | Loads data, writes analysis code, runs it, interprets results |
| Customer support | Checks order status via API, processes refunds, escalates complex cases |
| Personal assistant | Books meetings, drafts emails, summarizes daily news |
AI agents extend generative AI from answering questions to completing real work. Before deploying any generative AI system — agent or otherwise — it is essential to measure how well it performs. The next topic covers evaluation and benchmarking methods for generative AI.
