SQL GROUP BY Clause
The GROUP BY clause is used to arrange rows that share a common value in a specific column into groups. It is almost always used together with aggregate functions like COUNT(), SUM(), AVG(), MIN(), and MAX() to produce a summary for each group.
Think of it like organizing a class roster by city and then counting how many students are from each city. Each city becomes a group, and the count is the aggregated result for that group.
The Reference Table
| StudentID | StudentName | Age | City | Course | Fees |
|---|---|---|---|---|---|
| 1 | Ravi Sharma | 20 | Delhi | Science | 45000 |
| 2 | Priya Mehta | 22 | Mumbai | Commerce | 52000 |
| 3 | Arjun Nair | 19 | Chennai | Science | 38000 |
| 4 | Sneha Kapoor | 21 | Delhi | Arts | 42000 |
| 5 | Rohit Das | 23 | Kolkata | Commerce | 60000 |
| 6 | Nisha Verma | 20 | Mumbai | Science | 48000 |
| 7 | Karan Joshi | 22 | Delhi | Science | 47000 |
Syntax
SELECT column_name, aggregate_function(column)
FROM table_name
WHERE condition
GROUP BY column_name
ORDER BY column_name;Example 1: Count Students Per City
SELECT City, COUNT(*) AS TotalStudents
FROM Students
GROUP BY City;Result:
| City | TotalStudents |
|---|---|
| Delhi | 3 |
| Mumbai | 2 |
| Chennai | 1 |
| Kolkata | 1 |
The table is grouped by City. All Delhi students are counted as a group (3), Mumbai as a group (2), and so on.
Example 2: Total Fees Collected Per City
SELECT City, SUM(Fees) AS TotalFees
FROM Students
GROUP BY City;Result:
| City | TotalFees |
|---|---|
| Delhi | 134000 |
| Mumbai | 100000 |
| Chennai | 38000 |
| Kolkata | 60000 |
Example 3: Average Age Per Course
SELECT Course, AVG(Age) AS AverageAge, COUNT(*) AS StudentCount
FROM Students
GROUP BY Course;Result:
| Course | AverageAge | StudentCount |
|---|---|---|
| Science | 20.25 | 4 |
| Commerce | 22.50 | 2 |
| Arts | 21.00 | 1 |
Example 4: Minimum and Maximum Fees Per Course
SELECT Course, MIN(Fees) AS MinFees, MAX(Fees) AS MaxFees
FROM Students
GROUP BY Course;Result:
| Course | MinFees | MaxFees |
|---|---|---|
| Science | 38000 | 48000 |
| Commerce | 52000 | 60000 |
| Arts | 42000 | 42000 |
GROUP BY With WHERE
The WHERE clause filters rows before grouping happens. Only the rows that pass the filter are included in the groups.
SELECT City, COUNT(*) AS TotalStudents
FROM Students
WHERE Course = 'Science'
GROUP BY City;This first filters only Science students, and then counts them per city.
Result:
| City | TotalStudents |
|---|---|
| Delhi | 1 |
| Chennai | 1 |
| Mumbai | 1 |
GROUP BY With Multiple Columns
Grouping by more than one column creates groups based on the unique combination of those columns.
SELECT City, Course, COUNT(*) AS Count
FROM Students
GROUP BY City, Course;This groups by each unique City + Course combination — for example, "Delhi + Science" is one group, "Delhi + Arts" is another.
GROUP BY With ORDER BY
SELECT City, COUNT(*) AS TotalStudents
FROM Students
GROUP BY City
ORDER BY TotalStudents DESC;This counts students per city and sorts from the city with the most students to the fewest.
Important Rule: Columns in SELECT Must Be in GROUP BY
Any column in the SELECT clause that is not inside an aggregate function must appear in the GROUP BY clause.
-- This is CORRECT
SELECT City, COUNT(*) FROM Students GROUP BY City;
-- This will cause an ERROR in strict mode
SELECT City, StudentName, COUNT(*) FROM Students GROUP BY City;StudentName cannot be selected without being in GROUP BY, because there are multiple student names per city and SQL does not know which one to show.
Key Points to Remember
GROUP BYarranges rows into groups based on one or more column values.- It is almost always used with aggregate functions:
COUNT(),SUM(),AVG(),MIN(),MAX(). WHEREfilters rows before grouping;HAVINGfilters after grouping (covered in the next topic).- Every non-aggregated column in
SELECTmust be listed inGROUP BY. - Multiple columns can be grouped together for a more specific breakdown.
Summary
GROUP BY is a powerful clause that summarizes data by organizing rows into meaningful groups. When paired with aggregate functions, it enables counting, summing, averaging, and finding minimum/maximum values for each group. It is the foundation of all reporting and data analysis queries in SQL.
