GenAI Embeddings and Vector Databases
Embeddings and vector databases are the infrastructure layer underneath RAG systems, semantic search, and recommendation engines. Understanding them is essential for building any application where AI needs to find meaning in large collections of text, images, or data.
What Is an Embedding?
An embedding is a list of numbers — called a vector — that represents the meaning of a piece of text, an image, or any other data. Similar things produce vectors that are numerically close to each other. Different things produce vectors that are far apart.
Text → Embedding Model → Vector (list of numbers)

"Dog"        → [0.82, 0.14, 0.67, 0.33, ...]
"Puppy"      → [0.80, 0.16, 0.65, 0.31, ...]  ← close to "Dog"
"Car"        → [0.12, 0.89, 0.04, 0.77, ...]  ← far from "Dog"
"Automobile" → [0.13, 0.88, 0.05, 0.78, ...]  ← close to "Car"
In practice, embedding vectors have hundreds or thousands of dimensions — not just 4. But the principle is the same: meaning maps to position in vector space.
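As a minimal sketch, "numerically close" can be checked with cosine similarity over the toy 4-dimensional vectors from the example above (real embeddings have hundreds or thousands of dimensions, but the arithmetic is identical):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

dog   = [0.82, 0.14, 0.67, 0.33]
puppy = [0.80, 0.16, 0.65, 0.31]
car   = [0.12, 0.89, 0.04, 0.77]

print(cosine_similarity(dog, puppy))  # near 1.0 — similar meaning
print(cosine_similarity(dog, car))    # much lower — different meaning
```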
Why Embeddings Enable Semantic Search
Traditional keyword search finds documents containing exact words from the query. Embedding-based semantic search finds documents with similar meaning — even if no words match exactly.
Query: "How do I fix a leaking pipe?"

Keyword search results:
Only finds documents containing the words "fix" AND "leaking" AND "pipe"

Semantic search results:
Finds documents about:
- "Repairing burst plumbing"      ← different words, same meaning
- "Stopping water drip in walls"  ← no keyword match, highly relevant
- "Plumbing repair guide"         ← relevant but no exact word match
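To make the keyword-search limitation concrete, here is a naive keyword matcher (a deliberately simple sketch, not any real search engine) run against the documents from the example — none of them match, even though all are relevant:

```python
def keyword_match(query, doc):
    """Naive keyword search: every query word must appear in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return query_words <= doc_words

docs = [
    "repairing burst plumbing",
    "stopping water drip in walls",
    "plumbing repair guide",
]
query = "fix a leaking pipe"

print([keyword_match(query, d) for d in docs])  # [False, False, False]
```

An embedding-based search would still rank all three documents highly, because their vectors sit close to the query vector despite sharing no words.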
How Embedding Models Work
An embedding model is a neural network trained to place similar texts close together in vector space. During training, the model learns by seeing millions of examples of similar and dissimilar texts. Popular embedding models include:
| Embedding Model | Creator | Notes |
|---|---|---|
| text-embedding-3-large | OpenAI | Strong general-purpose, 3072 dimensions |
| text-embedding-ada-002 | OpenAI | Older legacy model, still widely deployed, 1536 dimensions |
| all-MiniLM-L6-v2 | SBERT (open-source) | Lightweight, fast, runs locally |
| embed-english-v3.0 | Cohere | Optimized for search and retrieval tasks |
| mxbai-embed-large | MixedBread AI (open-source) | State-of-the-art open-source option |
What Is a Vector Database?
A vector database stores embedding vectors and enables fast similarity search across millions of them. When a query arrives, the database finds the vectors closest to the query vector — measured by a distance metric — and returns the corresponding documents.
Vector Database Structure
──────────────────────────────────────────────────────────────
ID  | Text (chunk)                    | Vector
──────────────────────────────────────────────────────────────
001 | "Refund policy for digital..."  | [0.81, 0.14, 0.67...]
002 | "Shipping takes 3–5 days..."    | [0.22, 0.77, 0.31...]
003 | "Cancel subscription anytime..."| [0.79, 0.18, 0.70...]
──────────────────────────────────────────────────────────────
Query: "How do I get a refund?"
Query vector: [0.80, 0.15, 0.68...]
Nearest vectors: 001 (refund policy) → 003 (cancel subscription)
Returns: chunks 001 and 003 as the most relevant results
──────────────────────────────────────────────────────────────
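The lookup above can be sketched as a brute-force nearest-neighbor search over an in-memory store (toy 3-dimensional vectors mirroring the example; a real vector database uses approximate indexes to do this at scale):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Toy store mirroring the table above: id → (chunk text, vector)
store = {
    "001": ("Refund policy for digital...",   [0.81, 0.14, 0.67]),
    "002": ("Shipping takes 3-5 days...",     [0.22, 0.77, 0.31]),
    "003": ("Cancel subscription anytime...", [0.79, 0.18, 0.70]),
}

def top_k(query_vec, k=2):
    """Rank every stored vector by similarity to the query; return top-k ids."""
    ranked = sorted(store.items(),
                    key=lambda item: cosine(query_vec, item[1][1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

print(top_k([0.80, 0.15, 0.68]))  # ['001', '003']
```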
Distance Metrics for Vector Similarity
| Metric | What It Measures | When to Use |
|---|---|---|
| Cosine Similarity | Angle between vectors — direction of meaning | Most text similarity tasks (most common) |
| Euclidean Distance | Straight-line distance between vectors | Image embeddings, physical data |
| Dot Product | Magnitude + direction combined | When scale of the vector matters |
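A quick sketch of all three metrics on the same pair of vectors shows the practical difference: the second vector points in the same direction but is twice as large, so cosine similarity ignores the scale while Euclidean distance and dot product do not:

```python
import math

def dot(a, b):
    """Dot product: rewards both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance: grows with any difference in magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_sim(a, b):
    """Angle only: identical direction gives 1.0 regardless of scale."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction, double the magnitude

print(cosine_sim(a, b))  # 1.0 — scale ignored
print(euclidean(a, b))   # ≈ 3.74 — sensitive to magnitude
print(dot(a, b))         # 28.0 — sensitive to magnitude and direction
```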
Popular Vector Databases
| Database | Type | Key Strength |
|---|---|---|
| Pinecone | Managed cloud | Fully managed, scales easily, production-ready |
| Weaviate | Open-source / cloud | Multimodal search, hybrid keyword + vector |
| Qdrant | Open-source / cloud | Fast, memory-efficient, rich filtering |
| Chroma | Open-source | Developer-friendly, great for local prototyping |
| pgvector | PostgreSQL extension | Add vector search to an existing Postgres database |
| Milvus | Open-source | High-performance, designed for billion-scale data |
Embedding and Vector Database Pipeline
INDEXING (one-time setup):
────────────────────────────────────────────────────────────────
Raw Documents
│
▼
Chunk into 200–500 word segments
│
▼
Pass each chunk through Embedding Model → Vector
│
▼
Store (chunk text + vector + metadata) in Vector Database
────────────────────────────────────────────────────────────────
QUERY (every user request):
────────────────────────────────────────────────────────────────
User Query: "How do I cancel my subscription?"
│
▼
Embed the query using the same Embedding Model → Query Vector
│
▼
Vector DB finds the stored vectors most similar to the query vector (exact or approximate nearest-neighbor search)
│
▼
Return top K most similar chunks
│
▼
Inject chunks into LLM prompt → Generate grounded response
────────────────────────────────────────────────────────────────
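Both phases of the pipeline above can be sketched end to end. The `embed` function here is a toy bag-of-words stand-in for a real embedding model (in production you would call a model such as those in the table earlier); everything else — chunk, embed, store, then embed the query with the same model and rank by similarity — follows the diagram:

```python
import math

VOCAB = ["refund", "cancel", "subscription", "shipping", "days", "policy"]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector.
    A real pipeline would call an actual embedding model here."""
    words = text.lower().replace("?", "").split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# INDEXING (one-time): chunk → embed → store (chunk text + vector)
chunks = [
    "Refund policy for digital purchases",
    "Shipping takes 3 to 5 days",
    "Cancel subscription anytime from settings",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# QUERY (per request): embed with the SAME model, rank by similarity
query_vec = embed("How do I cancel my subscription?")
best = max(index, key=lambda item: cosine(query_vec, item[1]))
print(best[0])  # → the chunk to inject into the LLM prompt
```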
Metadata Filtering
Vector databases support metadata filtering — searching within a specific subset of documents before applying similarity search. This improves precision significantly.
Without filtering:
Search all 500,000 document chunks
→ May retrieve irrelevant old documents

With metadata filtering:
Filter:      WHERE department = "HR" AND year = 2024
Then search: Top 5 most similar chunks within those filtered results
→ Precise, fast, and relevant
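The filter-then-search order can be sketched in a few lines (toy 2-dimensional vectors and invented records for illustration; real databases apply the metadata predicate inside the index rather than in Python):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Each record: (chunk text, vector, metadata) — all values are made up
records = [
    ("2024 HR leave policy",      [0.9, 0.1], {"department": "HR",  "year": 2024}),
    ("2019 HR leave policy",      [0.9, 0.2], {"department": "HR",  "year": 2019}),
    ("2024 engineering handbook", [0.1, 0.9], {"department": "ENG", "year": 2024}),
]

def filtered_search(query_vec, k=1, **where):
    # Step 1: metadata filter — keep only records matching every condition
    pool = [r for r in records
            if all(r[2].get(key) == val for key, val in where.items())]
    # Step 2: similarity search within the filtered pool only
    pool.sort(key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [r[0] for r in pool[:k]]

print(filtered_search([0.9, 0.1], department="HR", year=2024))
# ['2024 HR leave policy']
```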
Embeddings Beyond Text
Embeddings work for any type of data, not just text:
- Image embeddings: CLIP converts images into vectors — enables searching images with text descriptions
- Audio embeddings: Sound clips converted to vectors for music similarity and audio search
- Code embeddings: Represents code semantics for code search and duplicate detection
- Multimodal embeddings: Combines text and image into a shared vector space
With embeddings and vector databases understood, the next major topic explores the systems that combine retrieval, LLMs, and tools into autonomous agents — AI systems that can plan and take actions in the world without step-by-step human instruction.
