PostgreSQL GROUP BY and HAVING

Aggregate functions like COUNT and SUM produce one result for the entire table. GROUP BY changes this by dividing rows into groups and applying the aggregate function to each group separately. The HAVING clause then filters those groups — the same way WHERE filters individual rows.

Understanding GROUP BY

Imagine a table with 100 employees spread across 5 departments. Without GROUP BY, COUNT(*) returns the total of all 100 employees. With GROUP BY on the department column, COUNT returns the employee count for each department separately — five rows, one per department.

Basic Syntax

SELECT column, AGGREGATE_FUNCTION(other_column)
FROM table_name
GROUP BY column;

Every column in the SELECT list that is not inside an aggregate function must appear in the GROUP BY clause. This is a firm rule in PostgreSQL.

GROUP BY with COUNT

SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

This splits employees into groups by department and counts the rows in each group. The result will have one row per unique department value.

GROUP BY with SUM and AVG

SELECT department,
       SUM(salary) AS total_salary,
       ROUND(AVG(salary), 2) AS average_salary
FROM employees
GROUP BY department;

For each department, this calculates the total salary spend and the average salary.

GROUP BY with Multiple Columns

Grouping can be done on more than one column. Each unique combination of grouped columns forms its own group:

SELECT department, job_title, COUNT(*) AS headcount
FROM employees
GROUP BY department, job_title
ORDER BY department, job_title;

This produces one row for each unique department-and-job_title pair.

GROUP BY with WHERE

WHERE is applied before grouping, so it filters individual rows before the groups are formed:

SELECT department, AVG(salary) AS avg_salary
FROM employees
WHERE status = 'active'
GROUP BY department;

Only active employees are considered. Inactive employees are excluded before the groups are created.

The HAVING Clause

HAVING filters groups after aggregation, whereas WHERE filters rows before aggregation. A comparison: WHERE says "only include rows where salary > 50000", while HAVING says "only include groups where the average salary > 50000".

Basic Syntax

SELECT column, AGGREGATE_FUNCTION(other_column)
FROM table_name
GROUP BY column
HAVING AGGREGATE_FUNCTION(other_column) condition;

Example: Departments with More Than 2 Employees

SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 2;

Groups (departments) with 2 or fewer employees are excluded from the result. Only departments with more than 2 employees are shown.

Example: Departments Where Average Salary Exceeds 60,000

SELECT department, ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000
ORDER BY avg_salary DESC;

WHERE vs HAVING — Side by Side

Feature	WHERE	HAVING
Filters	Individual rows	Groups of rows
Applied	Before grouping	After grouping
Can use aggregates	No	Yes
Required with GROUP BY	No	No

Using Both WHERE and HAVING Together

SELECT department, COUNT(*) AS active_count
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING COUNT(*) >= 3
ORDER BY active_count DESC;

Step-by-step execution:

WHERE filters to include only active employees.
GROUP BY groups the remaining rows by department.
COUNT counts employees in each department group.
HAVING keeps only departments with 3 or more active employees.
ORDER BY sorts results by count, highest first.

Example: Full GROUP BY and HAVING Workflow

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    salesperson VARCHAR(100),
    region VARCHAR(50),
    amount NUMERIC(10, 2),
    sale_date DATE
);

INSERT INTO sales (salesperson, region, amount, sale_date)
VALUES
    ('Alice', 'North', 1500.00, '2024-01-10'),
    ('Bob',   'South', 2200.00, '2024-01-12'),
    ('Alice', 'North', 1800.00, '2024-01-15'),
    ('Carol', 'East',  900.00,  '2024-01-18'),
    ('Bob',   'South', 3100.00, '2024-01-20'),
    ('Carol', 'East',  1200.00, '2024-01-22'),
    ('Alice', 'North', 2500.00, '2024-01-25'),
    ('Dan',   'West',  500.00,  '2024-01-28');

-- Total sales per salesperson
SELECT salesperson, SUM(amount) AS total_sales
FROM sales
GROUP BY salesperson
ORDER BY total_sales DESC;

-- Regions with more than one sale
SELECT region, COUNT(*) AS number_of_sales
FROM sales
GROUP BY region
HAVING COUNT(*) > 1;

-- Salespersons with total sales above 4000
SELECT salesperson, SUM(amount) AS total
FROM sales
GROUP BY salesperson
HAVING SUM(amount) > 4000;

-- Average sale per region, only for regions with avg sale above 1000
SELECT region, ROUND(AVG(amount), 2) AS avg_sale
FROM sales
GROUP BY region
HAVING AVG(amount) > 1000
ORDER BY avg_sale DESC;

Key Points

GROUP BY divides rows into groups and applies aggregate functions to each group.
Every non-aggregate column in SELECT must appear in GROUP BY.
WHERE filters rows before grouping; HAVING filters groups after aggregation.
Aggregate functions cannot be used in a WHERE clause — use HAVING instead.
WHERE and HAVING can be used together in the same query.
The execution order is: WHERE → GROUP BY → HAVING → ORDER BY.

Previous lessons

Back to courses

Next lessons

PostgreSQL GROUP BY and HAVING

Understanding GROUP BY

Basic Syntax

GROUP BY with COUNT

GROUP BY with SUM and AVG

GROUP BY with Multiple Columns

GROUP BY with WHERE

The HAVING Clause

Basic Syntax

Example: Departments with More Than 2 Employees

Example: Departments Where Average Salary Exceeds 60,000

WHERE vs HAVING — Side by Side

Using Both WHERE and HAVING Together

Example: Full GROUP BY and HAVING Workflow

Key Points

Leave a Comment Cancel reply