PostgreSQL Indexes

An index is a data structure that speeds up the retrieval of rows from a table. Without an index, PostgreSQL must scan every row in a table to find matching results — called a sequential scan. With an index on the searched column, PostgreSQL can locate matching rows much faster, similar to how a book index lets a reader jump directly to a topic instead of reading every page.

How Indexes Work

Think of a phone book. To find "Smith, John", reading the entire book from page one would be impractical. Instead, the alphabetical ordering acts as an index — jump straight to the "S" section. Database indexes work the same way: they store a sorted, searchable reference to column values and the physical location of their corresponding rows.

The most common index type in PostgreSQL is the B-tree (balanced tree) index, which is the default. It is efficient for equality checks (=), range queries (<, >, BETWEEN), and sorting.

Creating an Index

Basic Syntax

CREATE INDEX index_name ON table_name (column_name);

Example

CREATE INDEX idx_employees_department
ON employees (department);

Queries that filter or sort by the department column will now use this index, avoiding a full table scan.

Viewing Existing Indexes

-- In psql, use this meta-command
\d employees

The output shows the table's columns and any indexes defined on it. Indexes are also visible in pgAdmin under the table's "Indexes" node.

Types of Indexes in PostgreSQL

B-tree Index (Default)

B-tree is the default and most versatile index type. It works for equality, range, and sorting operations.

CREATE INDEX idx_salary ON employees (salary);

Unique Index

A unique index enforces that all values in the indexed column are distinct. PostgreSQL automatically creates a unique index when a UNIQUE constraint or PRIMARY KEY is defined.

CREATE UNIQUE INDEX idx_unique_email ON users (email);

Attempts to insert a duplicate email will now fail.

Partial Index

A partial index only indexes rows that satisfy a specified condition. This makes the index smaller and faster to maintain when only a subset of rows is frequently queried.

CREATE INDEX idx_active_employees
ON employees (department)
WHERE status = 'active';

Only active employees are indexed. Queries filtering by status = 'active' and department will benefit from this smaller, focused index.

Composite Index (Multi-Column Index)

An index can cover multiple columns. Queries that filter on all listed columns (or just the leading columns) will use this index.

CREATE INDEX idx_dept_salary
ON employees (department, salary);

This index helps queries that filter by department alone, or by both department and salary together. It does not help queries that filter only by salary.

GIN Index (Generalized Inverted Index)

GIN indexes are used for full-text search and JSONB columns, where a single value can contain multiple searchable elements.

CREATE INDEX idx_tags ON articles USING GIN (tags);

Hash Index

Hash indexes are optimized specifically for equality comparisons (=) but not for range queries. They are rarely preferred over B-tree in practice.

CREATE INDEX idx_hash_email ON users USING HASH (email);

Dropping an Index

DROP INDEX idx_employees_department;

Dropping an index does not affect the data — only the performance structure is removed. The table and its rows remain unchanged.

When Does PostgreSQL Use an Index?

PostgreSQL uses a query planner to decide whether using an index is faster than a sequential scan. For small tables, a full scan may actually be faster. The planner chooses automatically based on table statistics.

To check whether a query uses an index, use EXPLAIN:

EXPLAIN SELECT * FROM employees WHERE department = 'IT';

The output shows the query plan. Look for Index Scan to confirm index usage, or Seq Scan if a full table scan is used instead.

The Trade-offs of Indexes

Indexes improve read performance but come with costs:

Benefit	Cost
Faster SELECT queries	Extra disk space
Faster ORDER BY and JOIN	Slower INSERT, UPDATE, DELETE (index must be maintained)
Enforces uniqueness	Index maintenance adds overhead to write operations

For tables that are read far more often than written to, indexes are almost always beneficial on frequently queried columns.

Best Practices for Indexing

Index Columns Used in WHERE Clauses

Columns that appear frequently in WHERE conditions are strong candidates for indexing:

-- Frequently run query
SELECT * FROM orders WHERE customer_id = 42;

-- Create an index to speed it up
CREATE INDEX idx_orders_customer ON orders (customer_id);

Index Foreign Key Columns

PostgreSQL does not automatically create indexes on foreign key columns. Adding them improves JOIN performance significantly:

CREATE INDEX idx_orders_fk_customer ON orders (customer_id);

Avoid Over-Indexing

Do not add an index on every column. Each index slows down writes. Only index columns that are genuinely queried often.

Example: Measuring the Effect of an Index

-- Without index
EXPLAIN SELECT * FROM employees WHERE email = 'alice@example.com';
-- Shows: Seq Scan on employees

-- Create index
CREATE INDEX idx_email ON employees (email);

-- With index
EXPLAIN SELECT * FROM employees WHERE email = 'alice@example.com';
-- Shows: Index Scan using idx_email on employees

Key Points

Indexes speed up data retrieval by avoiding full table scans.
The default index type is B-tree, which supports equality, range, and sort operations.
Unique indexes enforce that no two rows have the same value in the indexed column.
Partial indexes cover only rows matching a condition, keeping the index smaller.
Composite indexes cover multiple columns and help queries filtering on the leading columns.
Indexes improve read performance but add overhead to write operations.
Use EXPLAIN to verify whether a query is using an index.

Previous lessons

Back to courses

Next lessons