PostgreSQL Common Table Expressions

A Common Table Expression (CTE) is a named, temporary result set defined at the beginning of a query. It behaves like a subquery but is written separately from the main query, making complex SQL much easier to read and maintain. CTEs are introduced with the WITH keyword.

Why Use CTEs?

As queries grow in complexity, nesting subqueries inside each other produces code that is difficult to follow. A CTE solves this by letting each logical step be named and written once, then referenced by name in the main query or even in other CTEs.

Consider the difference:

Without CTE (nested subqueries)

SELECT department, avg_salary
FROM (
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
) AS dept_avg
WHERE avg_salary > 60000;

With CTE (readable and named)

WITH dept_avg AS (
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
)
SELECT department, avg_salary
FROM dept_avg
WHERE avg_salary > 60000;

Both produce identical results. The CTE version names the intermediate step, making the logic self-documenting.

CTE Syntax

WITH cte_name AS (
    SELECT ...
)
SELECT ... FROM cte_name;

The CTE is defined inside the WITH block. The main query follows and can reference the CTE by its name just like a table.

Multiple CTEs

Multiple CTEs can be chained together, each separated by a comma. Later CTEs can reference earlier ones.

WITH
active_employees AS (
    SELECT id, name, department, salary
    FROM employees
    WHERE status = 'active'
),
dept_stats AS (
    SELECT department,
           COUNT(*) AS headcount,
           ROUND(AVG(salary), 2) AS avg_salary
    FROM active_employees
    GROUP BY department
)
SELECT *
FROM dept_stats
WHERE headcount >= 3
ORDER BY avg_salary DESC;

dept_stats references active_employees — which was defined in the same WITH clause. This chaining makes multi-step logic clean and sequential.

CTEs with INSERT, UPDATE, and DELETE

CTEs are not limited to SELECT. They can be used with data-modification statements:

WITH old_orders AS (
    DELETE FROM orders
    WHERE order_date < '2023-01-01'
    RETURNING *
)
INSERT INTO archived_orders
SELECT * FROM old_orders;

This deletes old orders and simultaneously inserts them into an archive table in a single atomic operation. The CTE captures the deleted rows via RETURNING and passes them to the INSERT.

Recursive CTEs

A recursive CTE calls itself. It is the standard way to traverse hierarchical or tree-structured data — such as organizational charts, category trees, or folder structures.

Structure of a Recursive CTE

WITH RECURSIVE cte_name AS (
    -- Base case: starting point
    SELECT ...

    UNION ALL

    -- Recursive case: references cte_name
    SELECT ... FROM source_table
    INNER JOIN cte_name ON ...
)
SELECT * FROM cte_name;

The base case runs once and provides the starting rows. The recursive case joins back to the CTE itself, adding more rows each iteration. The process continues until no new rows are produced.

Example: Organizational Hierarchy

CREATE TABLE staff (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT REFERENCES staff(id)
);

INSERT INTO staff VALUES
    (1, 'CEO', NULL),
    (2, 'VP Engineering', 1),
    (3, 'VP Marketing', 1),
    (4, 'Engineering Manager', 2),
    (5, 'Developer A', 4),
    (6, 'Developer B', 4),
    (7, 'Marketing Lead', 3);

WITH RECURSIVE org_chart AS (
    -- Base case: top-level employee (no manager)
    SELECT id, name, manager_id, 0 AS level
    FROM staff
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive case: find direct reports
    SELECT s.id, s.name, s.manager_id, oc.level + 1
    FROM staff s
    INNER JOIN org_chart oc ON s.manager_id = oc.id
)
SELECT level, name
FROM org_chart
ORDER BY level, name;

Result shows the entire hierarchy with a level column indicating depth: 0 = CEO, 1 = VPs, 2 = Managers, 3 = Developers/Leads.

Example: Generating a Number Series

WITH RECURSIVE counter AS (
    SELECT 1 AS n

    UNION ALL

    SELECT n + 1 FROM counter WHERE n < 10
)
SELECT n FROM counter;

This generates numbers 1 through 10 using a recursive CTE. Note: PostgreSQL also has the built-in generate_series(1, 10) function that does the same thing more efficiently — but this illustrates the recursive mechanism clearly.

CTE Materialization

In PostgreSQL 12 and later, CTEs are not materialized by default — the query planner may inline them (treat them like subqueries) for better optimization. To force materialization (which evaluates the CTE once and caches the result):

WITH expensive_query AS MATERIALIZED (
    SELECT ... -- complex query run only once
)
SELECT * FROM expensive_query
UNION ALL
SELECT * FROM expensive_query WHERE condition;

Before PostgreSQL 12, all CTEs were automatically materialized. Use NOT MATERIALIZED to explicitly opt out if needed.

Key Points

  • A CTE is a named, temporary result set defined with the WITH keyword before the main query.
  • CTEs make complex queries readable by breaking them into named logical steps.
  • Multiple CTEs can be defined in one WITH block, separated by commas; later CTEs can reference earlier ones.
  • CTEs can be used with INSERT, UPDATE, and DELETE — not just SELECT.
  • Recursive CTEs use UNION ALL to reference themselves and are essential for traversing hierarchical data.
  • In PostgreSQL 12+, CTEs are not materialized by default; use the MATERIALIZED keyword to force caching of the result.

Leave a Comment

Your email address will not be published. Required fields are marked *