GROUP BY in SQL

GROUP BY in SQL

The GROUP BY clause in SQL is used to organize data into groups based on one or more columns. It’s typically paired with aggregate functions (e.g., SUM, COUNT, AVG, MAX, MIN) to perform calculations for each group.

Think of it as sorting data into buckets, where each bucket represents a unique group defined by the values in the specified column(s).

Key Points

  1. Grouping Data:
    It groups rows with the same value in one or more columns into a single group.

  2. Works with Aggregate Functions:
    You use aggregate functions like COUNT, SUM, or AVG to perform calculations on each group.

  3. Order Matters:
    The GROUP BY clause must come after the WHERE clause but before the ORDER BY clause in a query.

How to Explain to a Beginner

Imagine you have a box of candies with different flavors: chocolate, strawberry, and vanilla. You want to count how many candies there are for each flavor.

  • Columns in a table = Candies

  • GROUP BY = Organizing candies by flavor

  • COUNT = Counting candies in each flavor group


Syntax

Basic Syntax

SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1;

With Multiple Columns

SELECT column1, column2, aggregate_function(column3)
FROM table_name
GROUP BY column1, column2;

Example

Simple Example

SELECT department, COUNT(*)
FROM employees
GROUP BY department;
  • Groups employees by their department and counts the number of employees in each department.

Using Multiple Columns

SELECT department, job_title, AVG(salary)
FROM employees
GROUP BY department, job_title;
  • Groups employees by both department and job_title and calculates the average salary for each group.

Combining with ORDER BY

SELECT department, SUM(salary)
FROM employees
GROUP BY department
ORDER BY SUM(salary) DESC;
  • Groups employees by department, calculates the total salary for each department, and sorts the result in descending order.

When to Use GROUP BY

  • To summarize data, like counting or averaging values for specific groups.

  • To identify patterns or trends within a dataset.

  • To prepare data for reports or dashboards.