PostgreSQL Subqueries
A subquery is a query nested inside another query. The inner query runs first, and its result is used by the outer query. Subqueries make it possible to use the result of one query as a filter, a value, or a source of rows for another query — all in a single statement.
Where Subqueries Can Appear
Subqueries can be placed in three locations within an outer query:
- In the WHERE clause — to filter rows based on a computed result
- In the FROM clause — to use the subquery result as a temporary table
- In the SELECT list — to compute a value for each row
Subquery in the WHERE Clause
This is the most common use of subqueries. The inner query produces a value or set of values that the outer WHERE condition uses.
Subquery Returning a Single Value
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);The inner query calculates the average salary across all employees. The outer query then returns employees who earn more than that average. The inner query runs once and produces a single number.
Subquery Returning Multiple Values with IN
SELECT name FROM customers
WHERE id IN (
SELECT customer_id FROM orders WHERE amount > 1000
);The inner query returns a list of customer IDs who have placed orders over 1000. The outer query finds the names of those customers. This achieves the same result as a JOIN but is often easier to read for straightforward filtering.
Subquery with NOT IN
SELECT name FROM customers
WHERE id NOT IN (
SELECT customer_id FROM orders
);This returns customers who have never placed any order. Caution: if the subquery returns any NULL values, NOT IN may produce unexpected results. Using NOT EXISTS (covered below) is safer in such cases.
Subquery in the FROM Clause (Derived Table)
A subquery in the FROM clause acts as a temporary table, often called a derived table. The result of the subquery is given an alias and treated like a regular table.
SELECT dept_summary.department, dept_summary.avg_salary
FROM (
SELECT department, ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
) AS dept_summary
WHERE dept_summary.avg_salary > 60000;The inner query groups employees and calculates average salary per department. The outer query then filters that summary to show only high-paying departments. The alias dept_summary is required — derived tables must always be given a name.
Subquery in the SELECT List (Scalar Subquery)
A subquery in the SELECT list must return exactly one row and one column. It runs once for each row in the outer query.
SELECT
name,
salary,
(SELECT ROUND(AVG(salary), 2) FROM employees) AS company_avg
FROM employees;Each row in the result shows the employee's salary alongside the company-wide average, making it easy to compare. The subquery runs once and returns the same value for every row.
Correlated Subqueries
A correlated subquery references a column from the outer query. It runs once for each row processed by the outer query — unlike a regular subquery that runs only once.
SELECT name, salary, department
FROM employees e1
WHERE salary > (
SELECT AVG(salary)
FROM employees e2
WHERE e2.department = e1.department
);For each employee (e1), the inner query calculates the average salary for that employee's specific department (e2.department = e1.department). The employee is included only if their salary exceeds their own department's average. This is a powerful pattern but can be slow on large tables because the subquery re-runs for every row.
EXISTS and NOT EXISTS
EXISTS checks whether a subquery returns any rows at all. It does not care about the values returned — only whether at least one row exists.
SELECT name FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.id
);For each customer, the inner query checks if any order exists for that customer. SELECT 1 is used because the actual value does not matter — only the existence of a row matters. Customers with at least one order are included.
NOT EXISTS
SELECT name FROM customers c
WHERE NOT EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.id
);This returns customers who have placed no orders. Unlike NOT IN, NOT EXISTS handles NULL values correctly and is generally preferred for this type of check.
Subquery vs JOIN — When to Use Which
| Scenario | Preferred Approach |
|---|---|
| Filtering based on aggregated results | Subquery in WHERE |
| Need columns from both tables | JOIN |
| Checking existence without needing values | EXISTS / NOT EXISTS |
| Reusing an aggregated result as a table | Subquery in FROM |
| Adding a single computed value per row | Scalar subquery in SELECT |
Example: Subquery Workflow
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
category VARCHAR(50),
price NUMERIC(8,2)
);
INSERT INTO products (name, category, price) VALUES
('Laptop', 'Electronics', 1200.00),
('Phone', 'Electronics', 800.00),
('Tablet', 'Electronics', 600.00),
('Desk', 'Furniture', 350.00),
('Chair', 'Furniture', 150.00),
('Pen', 'Stationery', 1.50),
('Notebook', 'Stationery', 5.00);
-- Products priced above the overall average
SELECT name, price
FROM products
WHERE price > (SELECT AVG(price) FROM products);
-- Products in categories that have more than 2 items
SELECT name, category
FROM products
WHERE category IN (
SELECT category FROM products
GROUP BY category
HAVING COUNT(*) > 2
);
-- Each product with its category's average price
SELECT name, price,
(SELECT ROUND(AVG(price), 2)
FROM products p2
WHERE p2.category = p1.category) AS category_avg
FROM products p1;Key Points
- A subquery is a query nested inside another query, enclosed in parentheses.
- Subqueries in WHERE filter rows based on computed values or lists.
- Subqueries in FROM create temporary derived tables that must be given an alias.
- Scalar subqueries in SELECT return one value per row.
- Correlated subqueries reference the outer query's columns and re-run for each outer row.
- EXISTS and NOT EXISTS check for the presence or absence of matching rows and handle NULL safely.
