The GROUP BY
clause in SQL is used to organize data into groups based on one or more columns. It’s typically paired with aggregate functions (e.g., SUM
, COUNT
, AVG
, MAX
, MIN
) to perform calculations for each group.
Think of it as sorting data into buckets, where each bucket represents a unique group defined by the values in the specified column(s).
Key Points
Grouping Data:
It groups rows with the same value in one or more columns into a single group.Works with Aggregate Functions:
You use aggregate functions likeCOUNT
,SUM
, orAVG
to perform calculations on each group.Order Matters:
TheGROUP BY
clause must come after theWHERE
clause but before theORDER BY
clause in a query.
How to Explain to a Beginner
Imagine you have a box of candies with different flavors: chocolate, strawberry, and vanilla. You want to count how many candies there are for each flavor.
Columns in a table = Candies
GROUP BY = Organizing candies by flavor
COUNT = Counting candies in each flavor group
Syntax
Basic Syntax
SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1;
With Multiple Columns
SELECT column1, column2, aggregate_function(column3)
FROM table_name
GROUP BY column1, column2;
Example
Simple Example
SELECT department, COUNT(*)
FROM employees
GROUP BY department;
- Groups employees by their
department
and counts the number of employees in each department.
Using Multiple Columns
SELECT department, job_title, AVG(salary)
FROM employees
GROUP BY department, job_title;
- Groups employees by both
department
andjob_title
and calculates the average salary for each group.
Combining with ORDER BY
SELECT department, SUM(salary)
FROM employees
GROUP BY department
ORDER BY SUM(salary) DESC;
- Groups employees by
department
, calculates the total salary for each department, and sorts the result in descending order.
When to Use GROUP BY
To summarize data, like counting or averaging values for specific groups.
To identify patterns or trends within a dataset.
To prepare data for reports or dashboards.