PostgreSQL INTERSECT
Introduction
When working with databases, you'll often need to find data that exists in multiple tables or query results. The INTERSECT
operator in PostgreSQL is a powerful tool that helps you identify common records between two or more query results. It returns only the rows that appear in all of the specified queries.
Think of INTERSECT
as the database equivalent of a Venn diagram intersection - it gives you the overlapping data between different sets of results.
In this tutorial, we'll explore the PostgreSQL INTERSECT
operator, understand its syntax, and see practical examples of how it can be used in real-world scenarios.
Basic Syntax
The basic syntax of the INTERSECT
operator is:
SELECT column1, column2, ... FROM table1
INTERSECT
SELECT column1, column2, ... FROM table2;
Important rules to remember:
- The number and order of columns must be the same in all queries
- The data types of corresponding columns must be compatible
- By default, duplicate rows are eliminated
Basic INTERSECT Examples
Let's set up some sample tables to demonstrate how INTERSECT
works:
-- Create and populate employees_department_a table
CREATE TABLE employees_department_a (
employee_id INT PRIMARY KEY,
name VARCHAR(50),
position VARCHAR(50)
);
INSERT INTO employees_department_a VALUES
(1, 'John Smith', 'Developer'),
(2, 'Jane Doe', 'Designer'),
(3, 'Mike Johnson', 'Manager'),
(4, 'Sarah Brown', 'Developer');
-- Create and populate employees_department_b table
CREATE TABLE employees_department_b (
employee_id INT PRIMARY KEY,
name VARCHAR(50),
position VARCHAR(50)
);
INSERT INTO employees_department_b VALUES
(3, 'Mike Johnson', 'Manager'),
(4, 'Sarah Brown', 'Developer'),
(5, 'Emily Wilson', 'Designer'),
(6, 'David Lee', 'Analyst');
Now, let's find employees who work in both departments:
SELECT employee_id, name, position FROM employees_department_a
INTERSECT
SELECT employee_id, name, position FROM employees_department_b;
Result:
employee_id | name | position
-------------|--------------|-----------
3 | Mike Johnson | Manager
4 | Sarah Brown | Developer
The result shows that Mike Johnson and Sarah Brown work in both departments.
INTERSECT with ORDER BY
You can use the ORDER BY
clause with INTERSECT
to sort the results:
SELECT employee_id, name, position FROM employees_department_a
INTERSECT
SELECT employee_id, name, position FROM employees_department_b
ORDER BY name;
Result:
employee_id | name | position
-------------|--------------|-----------
3 | Mike Johnson | Manager
4 | Sarah Brown | Developer
Note that the ORDER BY
clause should be placed at the end of the entire query, not within individual SELECT
statements.
INTERSECT ALL
By default, INTERSECT
removes duplicate rows from the result. If you want to keep duplicates, you can use INTERSECT ALL
:
-- Create tables with duplicate rows
CREATE TABLE set_a (id INT);
CREATE TABLE set_b (id INT);
INSERT INTO set_a VALUES (1), (2), (2), (3), (4);
INSERT INTO set_b VALUES (2), (2), (3), (5), (6);
-- Using INTERSECT (removes duplicates)
SELECT id FROM set_a
INTERSECT
SELECT id FROM set_b;
-- Result:
-- 2
-- 3
-- Using INTERSECT ALL (preserves duplicates)
SELECT id FROM set_a
INTERSECT ALL
SELECT id FROM set_b;
-- Result:
-- 2
-- 2
-- 3
In the INTERSECT ALL
result, the value 2
appears twice because it occurs at least twice in both tables.
Multiple INTERSECT Operations
You can chain multiple INTERSECT
operations to find rows common to more than two queries:
-- Create a third table
CREATE TABLE employees_department_c (
employee_id INT PRIMARY KEY,
name VARCHAR(50),
position VARCHAR(50)
);
INSERT INTO employees_department_c VALUES
(3, 'Mike Johnson', 'Manager'),
(6, 'David Lee', 'Analyst'),
(7, 'Lisa Chen', 'Developer');
-- Find employees who work in all three departments
SELECT employee_id, name, position FROM employees_department_a
INTERSECT
SELECT employee_id, name, position FROM employees_department_b
INTERSECT
SELECT employee_id, name, position FROM employees_department_c;
Result:
employee_id | name | position
-------------|--------------|----------
3 | Mike Johnson | Manager
Mike Johnson is the only employee who works in all three departments.
Real-World Applications
Let's explore some practical applications of INTERSECT
in real-world scenarios:
Finding Common Customers
Suppose you want to identify customers who have purchased both Product A and Product B:
-- Customers who purchased Product A
SELECT customer_id FROM orders WHERE product_id = 'A'
INTERSECT
-- Customers who purchased Product B
SELECT customer_id FROM orders WHERE product_id = 'B';
Identifying Shared Skills
Find candidates who have all the required skills for a job position:
-- Required skills for the job
SELECT skill_id FROM job_requirements WHERE job_id = 123
INTERSECT
-- Skills possessed by the candidate
SELECT skill_id FROM candidate_skills WHERE candidate_id = 456;
Finding Overlapping Time Periods
Identify dates when two teams were both available:
-- Dates when Team A is available
SELECT available_date FROM team_a_availability
INTERSECT
-- Dates when Team B is available
SELECT available_date FROM team_b_availability;
Combining INTERSECT with Other Set Operations
You can combine INTERSECT
with other set operations like UNION
and EXCEPT
in the same query:
-- Find employees who work in both Department A and B, or only in Department C
(SELECT employee_id FROM employees_department_a
INTERSECT
SELECT employee_id FROM employees_department_b)
UNION
SELECT employee_id FROM employees_department_c;
Note that when combining set operations, it's important to use parentheses to establish the order of operations.
Performance Considerations
When using INTERSECT
, keep these performance tips in mind:
- Indexes: Ensure that columns used in the
INTERSECT
operation are properly indexed - Query Optimization: PostgreSQL might rewrite your query to use other constructs that are more efficient
- Alternative Approaches: For some queries, using
JOIN
orIN
might be more efficient thanINTERSECT
For example, the following queries are equivalent:
-- Using INTERSECT
SELECT employee_id FROM employees_department_a
INTERSECT
SELECT employee_id FROM employees_department_b;
-- Alternative using JOIN
SELECT DISTINCT a.employee_id
FROM employees_department_a a
JOIN employees_department_b b ON a.employee_id = b.employee_id;
-- Alternative using IN
SELECT DISTINCT employee_id
FROM employees_department_a
WHERE employee_id IN (SELECT employee_id FROM employees_department_b);
Choose the approach that works best for your specific scenario and data volume.
Common Mistakes and Troubleshooting
Here are some common issues when working with INTERSECT
:
-
Column count mismatch: Ensure that both queries return the same number of columns
sql-- This will fail
SELECT id, name FROM table_a
INTERSECT
SELECT id FROM table_b; -
Data type incompatibility: The data types of corresponding columns must be compatible
sql-- This might fail if the data types cannot be implicitly converted
SELECT id::TEXT FROM table_a
INTERSECT
SELECT id FROM table_b; -
Forgetting about case sensitivity: Remember that string comparisons are case-sensitive
sql-- 'John' and 'JOHN' are considered different values
SELECT name FROM table_a
INTERSECT
SELECT name FROM table_b;
Summary
The PostgreSQL INTERSECT
operator is a powerful tool for finding common rows between multiple queries. It helps you identify overlapping data sets, making it useful for a wide range of applications from business analysis to data validation.
Key points to remember:
INTERSECT
returns only rows that appear in all of the specified queries- Column count and data types must match between queries
- Use
INTERSECT ALL
to preserve duplicates - Consider performance implications and alternative approaches for complex queries
INTERSECT
can be combined with other set operations for more complex data analysis
Exercises
- Create two tables of your choice and practice using the
INTERSECT
operator to find common records. - Experiment with
INTERSECT ALL
to see how it differs from regularINTERSECT
. - Try combining
INTERSECT
withUNION
andEXCEPT
to solve a complex data retrieval problem. - Compare the performance of
INTERSECT
versus equivalent queries usingJOIN
orIN
on large datasets. - Find a real-world problem in your own projects where
INTERSECT
could be applied.
Additional Resources
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)