PostgreSQL: Filtering Groups with the HAVING Clause
Filter grouped data in PostgreSQL using the HAVING clause. This guide explains how to apply conditions to aggregated data.
Filtering Groups in PostgreSQL with the `HAVING` Clause
Understanding the `HAVING` Clause
In SQL, the `HAVING` clause filters groups of rows in a query's result set. It's typically used with the `GROUP BY` clause, applying conditions to aggregate functions (like `SUM()`, `COUNT()`, `AVG()`, etc.). The `HAVING` clause allows you to filter the groups that are created by the `GROUP BY` clause, making it a powerful way to analyze data at an aggregate level.
`HAVING` Clause Syntax
The basic syntax is:
SELECT column1, aggregate_function(column2) FROM table1 [JOIN table2 ON join_condition] WHERE condition GROUP BY column1 HAVING condition ORDER BY column1;
Where:
SELECT
: The columns to include in the results.FROM
: The tables being queried.WHERE
(optional): Filters individual rows *before* grouping.GROUP BY
: Groups rows based on specified columns.HAVING
: Filters the resulting groups.ORDER BY
(optional): Sorts the results.
Note: Column aliases from the `SELECT` clause cannot be used in the `HAVING` clause because the aliases aren't defined until after the `HAVING` clause is evaluated.
`HAVING` vs. `WHERE` Clauses
Clause | Description |
---|---|
WHERE |
Filters individual rows before grouping. |
HAVING |
Filters groups of rows after grouping (using aggregate functions). |
Examples: Using `HAVING` with Aggregate Functions
These examples assume an existing table named `employee` with columns like `emp_id`, `first_name`, and `salary`. You'd need to replace this with your own table and columns. The examples use `SUM()` and `COUNT()` aggregate functions with the `HAVING` clause to filter groups. The `ORDER BY` clause is used for sorting.
Example 1: Filtering by `SUM()`
This example calculates the total salary for each employee and then filters those employees whose total salary is above a specific threshold.
SQL Query
SELECT
emp_id,
first_name,
SUM(salary) AS total_salary
FROM
employee
GROUP BY
first_name, emp_id
HAVING
SUM(salary) > 25000
ORDER BY
first_name DESC;
Example 2: Filtering by `COUNT()`
This example counts the number of employees with each first name and then filters the names that appear less than twice.
SQL Query
SELECT
first_name,
COUNT(emp_id) AS employee_count
FROM
employee
GROUP BY
first_name
HAVING
COUNT(emp_id) < 2;
Conclusion
The `HAVING` clause in PostgreSQL is a powerful tool for filtering groups of rows based on aggregate functions. It's essential for data analysis and reporting that requires filtering data at the group level. Understanding how to use `HAVING` effectively enhances your ability to extract valuable insights from your database.