GenAI – How It Works
Generative AI creates new content by learning patterns from large amounts of existing data. The process has three major stages: data collection and preparation, model training, and content generation. Each stage plays a specific role in making the final output possible.
Stage 1 — Collecting and Preparing Data
Every generative AI model starts with data. A text model learns from billions of sentences. An image model learns from millions of pictures. An audio model learns from thousands of hours of sound recordings.
Raw data is rarely clean. Before training begins, the data goes through a preparation process:
- Duplicate content is removed
- Irrelevant or harmful content is filtered out
- Text is broken into consistent units called tokens
- Images are resized and normalized
The quality of the data directly affects the quality of the model's output. Poor data leads to inaccurate, biased, or irrelevant generations.
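The preparation steps above can be sketched in a few lines. This is a toy pipeline, not how production systems work: real models use learned subword tokenizers (such as BPE) rather than whitespace splitting, and the blocklist here is a hypothetical stand-in for far more sophisticated content filters.

```python
# Toy data-preparation pipeline: deduplicate, filter, tokenize.
BLOCKLIST = {"spamword"}  # hypothetical filter list for illustration

def prepare(documents):
    seen = set()
    prepared = []
    for doc in documents:
        text = doc.strip().lower()
        if text in seen:              # remove exact duplicates
            continue
        seen.add(text)
        tokens = text.split()         # break text into crude tokens
        if BLOCKLIST & set(tokens):   # filter out unwanted content
            continue
        prepared.append(tokens)
    return prepared

docs = ["The sky is blue", "the sky is blue", "buy spamword now"]
print(prepare(docs))  # [['the', 'sky', 'is', 'blue']]
```

Of the three input documents, one is dropped as a duplicate and one is filtered out, leaving a single clean token list.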
Stage 2 — Training the Model
Training is the process of teaching the model to recognize and reproduce patterns. During training, the model works through the data in millions of small steps, adjusting its internal settings — called weights or parameters — each time it makes an error.
Think of it like learning to ride a bike. The first few attempts result in falls. Each fall teaches the body to adjust balance. Over many attempts, the adjustments become automatic and accurate. Model training works the same way, but with mathematics instead of muscles.
How the Model Learns — The Prediction Game
For text models, training works through a process called next-token prediction. The model looks at a sequence of words and tries to predict what comes next. It then compares its guess to the actual answer and adjusts its weights to do better next time.
Example:
Input: "The sky is very ___"
Guess: "beautiful"
Actual: "blue"
Action: Adjust weights to favor "blue" in this context
This adjustment process — called backpropagation — repeats billions of times across the entire training dataset. By the end, the model has learned enough patterns to generate coherent, relevant text.
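The prediction-and-adjustment loop can be shown at miniature scale. In this sketch, one trainable weight per (context word, next word) pair stands in for a real neural network, and the update rule is the softmax cross-entropy gradient — the same principle backpropagation applies across billions of weights. The four-word vocabulary and learning rate are illustrative choices, not values from any real system.

```python
import math

vocab = ["the", "sky", "is", "blue"]
idx = {w: i for i, w in enumerate(vocab)}
# weights[c][n]: score for token n appearing after token c
weights = [[0.0] * len(vocab) for _ in vocab]

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

corpus = ["the", "sky", "is", "blue"]
lr = 0.5
for step in range(200):                        # many passes over the data
    for ctx, nxt in zip(corpus, corpus[1:]):
        probs = softmax(weights[idx[ctx]])     # model's prediction
        for j in range(len(vocab)):            # compare to the truth and
            target = 1.0 if j == idx[nxt] else 0.0
            weights[idx[ctx]][j] -= lr * (probs[j] - target)  # adjust

# After training, the model predicts "sky" as the word after "the"
probs = softmax(weights[idx["the"]])
print(vocab[probs.index(max(probs))])  # sky
```

Each pass nudges the weights toward the correct answer; after enough repetitions the right prediction dominates, which is exactly the dynamic the bike-riding analogy describes.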
Stage 3 — Generating New Content
Once trained, the model accepts a prompt and generates a response. It does not look up the answer from a database. Instead, it uses the patterns learned during training to build the output one piece at a time.
For a text model, this means generating one word (or token) at a time, where each new word depends on all the words that came before it.
Prompt: "Explain gravity in simple words"
Model generates: "Gravity" → "is" → "a" → "force" → "that" → "pulls" → "objects" → "toward" → "each" → "other"
Each word is chosen based on the probabilities learned during training. The model weighs every candidate against the entire context of what has been written so far and typically picks one of the most likely next words.
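The word-by-word loop can be sketched with a hand-made probability table. In a real model these probabilities are recomputed from the weights at every step and cover the whole vocabulary; the tiny table below is purely illustrative, as are the `<start>` and `<end>` markers.

```python
import random

# Probability of each next token, given the previous token (illustrative).
next_probs = {
    "<start>": {"gravity": 1.0},
    "gravity": {"is": 0.9, "pulls": 0.1},
    "is":      {"a": 0.8, "the": 0.2},
    "a":       {"force": 1.0},
    "force":   {"<end>": 1.0},
}

def generate(greedy=True):
    token, out = "<start>", []
    while True:
        dist = next_probs[token]
        if greedy:   # always take the most likely continuation
            token = max(dist, key=dist.get)
        else:        # sample in proportion to probability
            token = random.choices(list(dist), weights=list(dist.values()))[0]
        if token == "<end>":
            return " ".join(out)
        out.append(token)

print(generate())  # gravity is a force
```

Note that each step conditions only on what has already been generated — the output is built left to right, never retrieved whole from a database.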
Full Process Diagram
Step 1: Collect Data
─────────────────────────────────────────────────
Billions of text documents / images / audio files
│
▼
Step 2: Prepare Data
─────────────────────────────────────────────────
Clean → Tokenize → Normalize → Format
│
▼
Step 3: Train Model
─────────────────────────────────────────────────
Model reads data → Makes predictions →
Compares to truth → Adjusts weights → Repeats
│
▼
Step 4: Receive Prompt
─────────────────────────────────────────────────
"Write a short poem about rain"
│
▼
Step 5: Generate Output
─────────────────────────────────────────────────
Model produces: Original poem about rain
What Are Weights and Parameters?
A model's weights are its memory. They are numerical values stored inside the model that encode everything it has learned. A small model might have millions of weights; large models like GPT-4 are estimated to have hundreds of billions or more.
These weights are not a list of facts. They are a compressed mathematical representation of patterns found across billions of training examples. This is why a model can answer questions about topics that were never explicitly stored as facts.
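The scale is easy to underestimate. As a back-of-the-envelope illustration (the layer size is a hypothetical example, not the architecture of any particular model), a single fully connected layer already contributes millions of weights:

```python
# Parameter count for one fully connected layer: a weight matrix
# of shape (d_in, d_out) plus one bias value per output.
d_in, d_out = 4096, 4096          # hypothetical layer size
params = d_in * d_out + d_out
print(params)  # 16781312 — nearly 17 million in a single layer
```

Stack dozens of such layers and the total climbs into the billions, which is why model sizes are usually quoted in parameters rather than bytes.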
Temperature — Controlling Creativity
When a model generates text, it does not always pick the most probable next word. A setting called temperature controls how creative or conservative the output is.
| Temperature Setting | Behavior | Best For |
|---|---|---|
| Low (0.0 – 0.3) | Picks the most likely words consistently | Facts, code, structured responses |
| Medium (0.5 – 0.7) | Balanced between safe and creative | General writing, Q&A |
| High (0.9 – 1.5) | More varied and unexpected word choices | Stories, poetry, brainstorming |
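Mechanically, temperature works by dividing the model's raw scores (logits) by the temperature value before converting them to probabilities. The sketch below shows the effect; the three logit values are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                   # e.g. scores for "blue", "clear", "grey"
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 2) for p in probs])
```

At low temperature, nearly all the probability piles onto the top-scoring word; at high temperature, the distribution flattens and lower-ranked words get a real chance of being picked — which is exactly the behavior the table describes.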
Key Concepts Summary
| Concept | What It Means in Simple Terms |
|---|---|
| Training Data | The examples used to teach the model |
| Weights / Parameters | The model's internal memory of learned patterns |
| Backpropagation | The method of correcting errors during training |
| Inference | Using a trained model to generate output |
| Temperature | A dial that controls how random or predictable the output is |
With this foundation in place, the next step is to explore the different types of generative AI models — each designed for a specific kind of content output.
