How RAG Works
RAG follows the same five-step sequence in every system. The first three steps prepare the documents ahead of time; the last two run each time a question comes in. Learning this sequence makes every later topic in this course easier to follow, since almost every advanced idea improves one of these five steps.
The Five-Step Sequence
| Step | Action | Plain Explanation |
|---|---|---|
| 1 | Ingest documents | Load files into a searchable store, like stocking a library shelf |
| 2 | Break into chunks | Cut long documents into small sections, like tearing a book into single pages |
| 3 | Convert to embeddings | Turn each chunk into a list of numbers (a vector) that captures its meaning |
| 4 | Search on the question | Convert the question to numbers the same way, then find the chunks whose numbers are closest to it |
| 5 | Generate the answer | Send the question and matching chunks to the model for a final response |
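The five steps in the table can be sketched in a few lines of Python. This is a toy illustration, not a production pipeline: the `embed` function counts words instead of calling a real embedding model, the documents are invented, and the function names (`chunk`, `embed`, `cosine`, `search`, `build_prompt`) are hypothetical. Step 5 stops at building the prompt, since the model call depends on which provider is used.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=12):
    """Step 2: cut a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Step 3 (toy version): word counts stand in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Similarity of two sparse vectors; higher means closer."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(question, index, top_k=1):
    """Step 4: embed the question and return the top_k closest chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, chunks):
    """Step 5: combine the question and the retrieved chunks for the model."""
    context = "\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Step 1: ingest (these two documents are made up for the example).
documents = [
    "Returns are accepted within 30 days of purchase with a receipt.",
    "Standard shipping takes five business days within the country.",
]

# Steps 2 and 3 run once, ahead of time.
index = [(c, embed(c)) for doc in documents for c in chunk(doc)]

# Steps 4 and 5 run for every question.
question = "How many days does shipping take?"
result = search(question, index)
prompt = build_prompt(question, result)
print(prompt)
```

Swapping `embed` for a real embedding model and the list `index` for a vector database would leave the overall shape of the code unchanged.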
[Figure: The Full RAG Loop]
A Restaurant Kitchen Analogy
Picture a kitchen with pre-chopped ingredients stored in labeled containers. A customer orders a dish. The chef does not run to the market; the chef grabs the containers already labeled for that dish, then cooks. Document ingestion and chunking are the chopping and labeling. The search step is the chef grabbing the right containers. Generation is the actual cooking that turns raw ingredients into a finished meal.
Step-by-Step Walkthrough With a Real Question
A user types: "What is the warranty period for the X200 blender?"
- The system already holds the product manual, chunked into small sections.
- It searches those chunks for the ones closest in meaning to the question.
- It finds the warranty section and pulls it out.
- It sends the question plus the warranty section to the model.
- The model replies with the exact warranty period, based on that section.
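The walkthrough above can be traced in code. In this minimal sketch the manual text, including the two-year figure, is invented for illustration, and word overlap is used as a crude stand-in for embedding similarity. The helper names `find_best_chunk` and `build_prompt` are hypothetical.

```python
import re

# Hypothetical manual sections; the wording and the two-year figure are invented.
manual_chunks = [
    "Cleaning: rinse the X200 jar with warm water after each use.",
    "Warranty: the X200 blender is covered for a warranty period of two years.",
    "Safety: keep hands away from the blades while the X200 is running.",
]


def words(text):
    """Lowercase a text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_best_chunk(question, chunks):
    """Pick the chunk that shares the most words with the question."""
    q = words(question)
    return max(chunks, key=lambda c: len(q & words(c)))


def build_prompt(question, context):
    """Combine the question with the retrieved section."""
    return (
        "Use only the context below to answer.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )


question = "What is the warranty period for the X200 blender?"
best = find_best_chunk(question, manual_chunks)
prompt = build_prompt(question, best)
print(prompt)
```

The printed prompt is what would be sent to the model in the final step; the model then only has to read the warranty section rather than recall the answer from memory.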
Why Order Matters
Skipping the search step forces the model to guess. Searching too broadly buries the right answer under irrelevant text. Each step exists to keep the final answer both accurate and focused, and removing any single step weakens the whole chain.
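The trade-off between skipping the search and searching too broadly can be shown with a small selection function. The chunk names and similarity scores below are hypothetical, and `select_context` is an illustrative helper rather than part of any library.

```python
# Hypothetical similarity scores, as a search step might return them.
scored_chunks = [
    ("Warranty section", 0.91),
    ("Cleaning instructions", 0.42),
    ("Recipe ideas", 0.18),
    ("Safety warnings", 0.35),
]


def select_context(scored, top_k, min_score=0.0):
    """Keep at most top_k chunks that score at least min_score, best first."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [name for name, score in ranked[:top_k] if score >= min_score]


# No search: the model receives no context and has to guess.
no_search = select_context(scored_chunks, top_k=0)

# Too broad: the right chunk is present but surrounded by irrelevant text.
too_broad = select_context(scored_chunks, top_k=4)

# Focused: a small top_k plus a score cutoff keeps only the relevant chunk.
focused = select_context(scored_chunks, top_k=2, min_score=0.5)

print(no_search, too_broad, focused, sep="\n")
```

Suitable values for `top_k` and `min_score` depend on the documents and the embedding model, so they are usually tuned by testing on real questions.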
What Happens When a Step Fails
| Failed Step | Result |
|---|---|
| Poor chunking | Search returns broken or incomplete context |
| Weak embeddings | Search matches unrelated topics by mistake |
| Sloppy search logic | Right chunk exists but never gets found |
| Careless generation prompt | Model ignores the good context it was given |
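The first row of the table, poor chunking, is easy to reproduce. In this sketch a sentence is cut by word count, once without overlap and once with overlap between neighboring chunks. The sentence is invented, and `chunk_words` is a hypothetical helper; overlap is one common mitigation, not the only one.

```python
def chunk_words(text, size, overlap=0):
    """Split text into chunks of `size` words, repeating `overlap` words
    from the end of each chunk at the start of the next."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Invented sentence for illustration.
text = ("The X200 blender is covered by a limited warranty "
        "of two years from the purchase date.")
key_fact = "warranty of two years"

no_overlap = chunk_words(text, size=10)
with_overlap = chunk_words(text, size=10, overlap=4)

# Without overlap the key fact is split across two chunks.
print(any(key_fact in c for c in no_overlap))
# With overlap at least one chunk contains the whole fact.
print(any(key_fact in c for c in with_overlap))
```

When the fact is split, search can return a chunk that mentions the warranty but not its length, which is the "broken or incomplete context" the table describes.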
Where Each Step Gets Its Own Topic
| Step | Covered In |
|---|---|
| Search using numeric meaning | Vector Databases Explained |
| Turning text into numbers | Embeddings Explained |
| Cutting documents into pieces | Chunking Strategies |
