Subquery in SQL

A subquery is a query nested inside another query. Subqueries allow you to perform an intermediate query and use its result within the main (outer) query. They are a powerful tool for breaking down complex SQL problems into smaller, more manageable parts.


Key Points

  1. Nested Query:
    Subqueries are written inside parentheses and can be used in SELECT, WHERE, FROM, or HAVING clauses.

  2. Intermediate Results:
    The result of the subquery is passed to the outer query for further processing.

  3. Can Be Correlated or Uncorrelated:

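    • Uncorrelated Subqueries: Reference no columns from the outer query, so they can run on their own and only need to be evaluated once.

    • Correlated Subqueries: Reference columns from the outer query, so they are logically re-evaluated for each row the outer query processes.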

  4. Improves Readability:
    Breaks down complex queries for better understanding.


How to Explain to a Beginner

Think of a subquery like solving a multi-step math problem. You first calculate an intermediate value (subquery) and then use that result in the main solution (outer query). For instance, if you need to find the employees who earn more than the average salary, the subquery calculates the average salary, and the outer query finds employees with salaries exceeding it.


Syntax

General Syntax

SELECT columns
FROM table_name
WHERE column_name [operator] (SELECT column_name FROM another_table WHERE condition);

Examples

1. Subquery in the WHERE Clause

Find employees earning more than the average salary:

SELECT employee_id, name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

  • Inner Query: Calculates the average salary.

  • Outer Query: Fetches employees with salaries greater than the result of the subquery.
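To try this end to end, here is a minimal sketch that runs the query through Python's built-in sqlite3 module against an in-memory database. The four sample rows are made up for illustration, and the ORDER BY is added only to make the output order predictable; the table and column names are the ones used in the query above.

```python
import sqlite3

# Hypothetical sample data: four employees in two departments.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 20000, 1), (2, "Ben", 40000, 1),
     (3, "Cy", 60000, 2), (4, "Dee", 80000, 2)],
)

# Run the inner query on its own to see the intermediate value.
avg_salary = conn.execute("SELECT AVG(salary) FROM employees").fetchone()[0]

# Run the full query; the subquery supplies the value compared against salary.
above_avg = conn.execute("""
    SELECT employee_id, name, salary
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY employee_id
""").fetchall()

print(avg_salary)  # 50000.0
print(above_avg)   # [(3, 'Cy', 60000), (4, 'Dee', 80000)]
```

Running the inner query separately first is a quick way to confirm that the outer filter compares against the value you expect.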


2. Subquery in the SELECT Clause

Show each employee’s salary as a percentage of the total salary:

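SELECT employee_id, name,
       100.0 * salary / (SELECT SUM(salary) FROM employees) AS salary_percentage
FROM employees;

  • The subquery calculates the total salary.

  • The outer query divides each salary by that total. Multiplying by 100.0 first keeps the division from truncating to 0 in databases that use integer division when salary is an integer column.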
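One detail worth checking when computing percentages is integer division. The sketch below assumes SQLite and an INTEGER salary column (both are assumptions, along with the made-up sample rows) and compares dividing first, as in (salary / total) * 100, with multiplying by 100.0 first.

```python
import sqlite3

# Hypothetical sample data; salary is stored as an integer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 20000, 1), (2, "Ben", 40000, 1),
     (3, "Cy", 60000, 2), (4, "Dee", 80000, 2)],
)

# Dividing first: integer / integer truncates to 0 in SQLite.
divide_first = conn.execute("""
    SELECT name, (salary / (SELECT SUM(salary) FROM employees)) * 100
    FROM employees
    ORDER BY employee_id
""").fetchall()

# Multiplying by 100.0 first makes the arithmetic floating-point.
multiply_first = conn.execute("""
    SELECT name, 100.0 * salary / (SELECT SUM(salary) FROM employees)
    FROM employees
    ORDER BY employee_id
""").fetchall()

print(divide_first)    # [('Ana', 0), ('Ben', 0), ('Cy', 0), ('Dee', 0)]
print(multiply_first)  # [('Ana', 10.0), ('Ben', 20.0), ('Cy', 30.0), ('Dee', 40.0)]
```

Engines differ here: PostgreSQL and SQL Server also truncate integer division, while others return a decimal result, so it is worth testing on the database you actually use.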


3. Subquery in the FROM Clause (Derived Table)

Find departments with an average salary greater than 50,000:

SELECT department_id, avg_salary
FROM (SELECT department_id, AVG(salary) AS avg_salary
      FROM employees
      GROUP BY department_id) AS dept_avg
WHERE avg_salary > 50000;

  • The subquery calculates the average salary for each department.

  • The outer query filters departments with an average salary exceeding 50,000.
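As a rough check, the sketch below runs the derived-table query with Python's sqlite3 module and compares it with an equivalent form that filters the groups directly with HAVING. The sample rows are hypothetical, and the HAVING version is shown only as an alternative, not as the preferred form.

```python
import sqlite3

# Hypothetical sample data: department 1 averages 30000, department 2 averages 70000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 20000, 1), (2, "Ben", 40000, 1),
     (3, "Cy", 60000, 2), (4, "Dee", 80000, 2)],
)

# Derived table: aggregate first, then filter the aggregated rows.
derived = conn.execute("""
    SELECT department_id, avg_salary
    FROM (SELECT department_id, AVG(salary) AS avg_salary
          FROM employees
          GROUP BY department_id) AS dept_avg
    WHERE avg_salary > 50000
""").fetchall()

# Same result without a subquery: HAVING filters groups after aggregation.
having = conn.execute("""
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
    HAVING AVG(salary) > 50000
""").fetchall()

print(derived)  # [(2, 70000.0)]
print(having)   # [(2, 70000.0)]
```

The derived table becomes more useful when the aggregated rows need further work, such as joining them back to another table.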


4. Correlated Subquery

Find employees who earn more than the average salary of their department:

SELECT e1.name, e1.salary
FROM employees e1
WHERE e1.salary > (SELECT AVG(e2.salary)
                   FROM employees e2
                   WHERE e1.department_id = e2.department_id);

  • The inner query calculates the average salary of the current employee's department, identified by e1.department_id.

  • The outer query compares each employee's salary to the corresponding department average.
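To see the correlation at work, here is a small sketch using Python's sqlite3 module. The sample rows are hypothetical, and the ORDER BY is added only to fix the output order.

```python
import sqlite3

# Hypothetical sample data: department 1 averages 30000, department 2 averages 70000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 20000, 1), (2, "Ben", 40000, 1),
     (3, "Cy", 60000, 2), (4, "Dee", 80000, 2)],
)

# The inner query refers to e1.department_id, so its result depends on
# which outer row is being tested.
above_dept_avg = conn.execute("""
    SELECT e1.name, e1.salary
    FROM employees e1
    WHERE e1.salary > (SELECT AVG(e2.salary)
                       FROM employees e2
                       WHERE e1.department_id = e2.department_id)
    ORDER BY e1.name
""").fetchall()

print(above_dept_avg)  # [('Ben', 40000), ('Dee', 80000)]
```

Note that Cy earns more than Ben but is not returned, because each salary is compared only with the average of that employee's own department.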


5. Subquery with EXISTS

List the departments that have at least one employee:

SELECT department_id, department_name
FROM departments d
WHERE EXISTS (SELECT 1
              FROM employees e
              WHERE e.department_id = d.department_id);
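
  • The subquery checks, for each department, whether at least one matching row exists in the employees table. EXISTS only tests for existence, so the value selected (here 1) is never used.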
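The sketch below runs the EXISTS query with Python's sqlite3 module and adds the NOT EXISTS counterpart for comparison. The department names and employee rows are invented for illustration, and the ORDER BY is added only to fix the output order.

```python
import sqlite3

# Hypothetical sample data: department 3 has no employees.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (department_id INTEGER, department_name TEXT)")
conn.execute("CREATE TABLE employees (employee_id INTEGER, name TEXT, department_id INTEGER)")
conn.executemany(
    "INSERT INTO departments VALUES (?, ?)",
    [(1, "Sales"), (2, "Engineering"), (3, "Legal")],
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", 1), (2, "Ben", 1), (3, "Cy", 2)],
)

# Departments with at least one employee.
with_staff = conn.execute("""
    SELECT department_id, department_name
    FROM departments d
    WHERE EXISTS (SELECT 1
                  FROM employees e
                  WHERE e.department_id = d.department_id)
    ORDER BY department_id
""").fetchall()

# Departments with no employees: negate the same test.
without_staff = conn.execute("""
    SELECT department_id, department_name
    FROM departments d
    WHERE NOT EXISTS (SELECT 1
                      FROM employees e
                      WHERE e.department_id = d.department_id)
    ORDER BY department_id
""").fetchall()

print(with_staff)     # [(1, 'Sales'), (2, 'Engineering')]
print(without_staff)  # [(3, 'Legal')]
```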

When to Use Subqueries

  • To simplify complex queries by breaking them into smaller, logical steps.

  • When intermediate results are needed for filtering, calculations, or conditions.

  • To compare a value with a dynamically calculated result.
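Comparing against a dynamically calculated result also works in the HAVING clause, which the Key Points mention but the examples above do not illustrate. As a sketch under assumed sample data, the query below finds departments whose average salary is above the company-wide average, run here through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical sample data: the company-wide average salary is 50000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 20000, 1), (2, "Ben", 40000, 1),
     (3, "Cy", 60000, 2), (4, "Dee", 80000, 2)],
)

# The uncorrelated subquery computes the overall average; HAVING then
# compares each department's average against it.
above_company_avg = conn.execute("""
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
    HAVING AVG(salary) > (SELECT AVG(salary) FROM employees)
""").fetchall()

print(above_company_avg)  # [(2, 70000.0)]
```

The threshold is never hard-coded: if salaries change, the subquery recomputes the average each time the query runs.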


Common Mistakes to Avoid

  1. Returning Multiple Rows Where One is Expected:

     SELECT employee_id
     FROM employees
     WHERE salary = (SELECT salary FROM employees WHERE department_id = 1);
    
    • If the subquery returns more than one row, most databases raise an error because = expects a single value (some, such as SQLite, silently use only the first row). Use IN instead:

     WHERE salary IN (SELECT salary FROM employees WHERE department_id = 1);

  2. Performance Issues with Correlated Subqueries:

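    • Correlated subqueries can be slow because they are logically evaluated once for each row in the outer query, although the query optimizer may rewrite them internally.

    • If one proves slow, consider rewriting it as a JOIN against a derived table and comparing the two versions.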
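As an illustration of that rewrite, the sketch below takes the "employees above their department average" query and expresses it both ways, then checks that the results match. It uses Python's sqlite3 module with hypothetical sample rows; whether the JOIN form is actually faster depends on the database and the data, so treat it as something to measure rather than assume.

```python
import sqlite3

# Hypothetical sample data: department 1 averages 30000, department 2 averages 70000.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER, name TEXT, salary INTEGER, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ana", 20000, 1), (2, "Ben", 40000, 1),
     (3, "Cy", 60000, 2), (4, "Dee", 80000, 2)],
)

# Correlated form: the average is looked up per outer row.
correlated = conn.execute("""
    SELECT e1.name, e1.salary
    FROM employees e1
    WHERE e1.salary > (SELECT AVG(e2.salary)
                       FROM employees e2
                       WHERE e1.department_id = e2.department_id)
    ORDER BY e1.name
""").fetchall()

# JOIN form: compute every department average once, then join to it.
joined = conn.execute("""
    SELECT e.name, e.salary
    FROM employees e
    JOIN (SELECT department_id, AVG(salary) AS avg_salary
          FROM employees
          GROUP BY department_id) AS dept_avg
      ON dept_avg.department_id = e.department_id
    WHERE e.salary > dept_avg.avg_salary
    ORDER BY e.name
""").fetchall()

print(correlated)  # [('Ben', 40000), ('Dee', 80000)]
print(joined)      # [('Ben', 40000), ('Dee', 80000)]
```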


Tips for Beginners

  • Start Simple: Begin with uncorrelated subqueries in the WHERE clause.

  • Visualize the Steps: Break down the problem and write the subquery first.

  • Test Separately: Test the subquery independently to ensure it returns the expected result.