SQL Joins Overview
Introduction
SQL joins are powerful features that allow you to combine rows from two or more tables based on a related column between them. Joins are essential for working with relational databases, as they enable you to retrieve data from multiple tables in a single query, establishing relationships between different datasets.
In real-world applications, data is often distributed across multiple tables to maintain database normalization principles. For example, an e-commerce database might have separate tables for customers, orders, and products. Joins allow you to connect these tables to answer questions like "What products did customer X order?" or "Which customers purchased product Y?"
Types of SQL Joins
There are four main types of SQL joins:
- INNER JOIN: Returns records that have matching values in both tables
- LEFT JOIN (or LEFT OUTER JOIN): Returns all records from the left table and matched records from the right table
- RIGHT JOIN (or RIGHT OUTER JOIN): Returns all records from the right table and matched records from the left table
- FULL JOIN (or FULL OUTER JOIN): Returns all records when there is a match in either the left or right table
Let's visualize these joins using Venn diagrams:
Setting Up Example Tables
Before diving into the specifics of each join type, let's create two sample tables to work with throughout this tutorial.
-- Create Customers table
CREATE TABLE Customers (
customer_id INT PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100)
);
-- Create Orders table
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10, 2)
);
-- Insert sample data into Customers
INSERT INTO Customers VALUES
(1, 'John', 'Doe', '[email protected]'),
(2, 'Jane', 'Smith', '[email protected]'),
(3, 'Robert', 'Johnson', '[email protected]'),
(4, 'Emily', 'Williams', '[email protected]');
-- Insert sample data into Orders
INSERT INTO Orders VALUES
(101, 1, '2023-01-15', 150.75),
(102, 2, '2023-01-16', 89.99),
(103, 1, '2023-01-20', 45.50),
(104, 3, '2023-01-25', 210.25),
(105, NULL, '2023-01-28', 55.00);
Let's examine what our tables look like:
Customers Table:
customer_id | first_name | last_name | |
---|---|---|---|
1 | John | Doe | [email protected] |
2 | Jane | Smith | [email protected] |
3 | Robert | Johnson | [email protected] |
4 | Emily | Williams | [email protected] |
Orders Table:
order_id | customer_id | order_date | total_amount |
---|---|---|---|
101 | 1 | 2023-01-15 | 150.75 |
102 | 2 | 2023-01-16 | 89.99 |
103 | 1 | 2023-01-20 | 45.50 |
104 | 3 | 2023-01-25 | 210.25 |
105 | NULL | 2023-01-28 | 55.00 |
Notice that:
- Customer with ID 4 (Emily) has no orders
- Order with ID 105 has no associated customer (NULL customer_id)
These scenarios will help demonstrate the differences between join types.
INNER JOIN
An INNER JOIN returns only the rows where there is a match in both tables based on the specified condition. This is the most common type of join.
Syntax
SELECT columns
FROM table1
INNER JOIN table2
ON table1.column = table2.column;
Example
Let's retrieve a list of customers and their orders:
SELECT
c.customer_id,
c.first_name,
c.last_name,
o.order_id,
o.order_date,
o.total_amount
FROM
Customers c
INNER JOIN
Orders o
ON
c.customer_id = o.customer_id;
Result
customer_id | first_name | last_name | order_id | order_date | total_amount |
---|---|---|---|---|---|
1 | John | Doe | 101 | 2023-01-15 | 150.75 |
1 | John | Doe | 103 | 2023-01-20 | 45.50 |
2 | Jane | Smith | 102 | 2023-01-16 | 89.99 |
3 | Robert | Johnson | 104 | 2023-01-25 | 210.25 |
Explanation
Notice that:
- Only matching rows are returned
- Emily (customer_id = 4) doesn't appear because she has no orders
- Order 105 doesn't appear because it has no associated customer
- John Doe appears twice because he has two separate orders
LEFT JOIN
A LEFT JOIN (or LEFT OUTER JOIN) returns all records from the left table and matching records from the right table. If there's no match in the right table, NULL values are returned for right table columns.
Syntax
SELECT columns
FROM table1
LEFT JOIN table2
ON table1.column = table2.column;
Example
Let's retrieve all customers and their orders (if any):
SELECT
c.customer_id,
c.first_name,
c.last_name,
o.order_id,
o.order_date,
o.total_amount
FROM
Customers c
LEFT JOIN
Orders o
ON
c.customer_id = o.customer_id;
Result
customer_id | first_name | last_name | order_id | order_date | total_amount |
---|---|---|---|---|---|
1 | John | Doe | 101 | 2023-01-15 | 150.75 |
1 | John | Doe | 103 | 2023-01-20 | 45.50 |
2 | Jane | Smith | 102 | 2023-01-16 | 89.99 |
3 | Robert | Johnson | 104 | 2023-01-25 | 210.25 |
4 | Emily | Williams | NULL | NULL | NULL |
Explanation
Notice that:
- All customers are included in the result
- Emily appears with NULL values for order information since she has no orders
- The "left" table in this case is Customers
RIGHT JOIN
A RIGHT JOIN (or RIGHT OUTER JOIN) returns all records from the right table and matching records from the left table. If there's no match in the left table, NULL values are returned for left table columns.
Syntax
SELECT columns
FROM table1
RIGHT JOIN table2
ON table1.column = table2.column;
Example
Let's retrieve all orders and their associated customers (if any):
SELECT
c.customer_id,
c.first_name,
c.last_name,
o.order_id,
o.order_date,
o.total_amount
FROM
Customers c
RIGHT JOIN
Orders o
ON
c.customer_id = o.customer_id;
Result
customer_id | first_name | last_name | order_id | order_date | total_amount |
---|---|---|---|---|---|
1 | John | Doe | 101 | 2023-01-15 | 150.75 |
2 | Jane | Smith | 102 | 2023-01-16 | 89.99 |
1 | John | Doe | 103 | 2023-01-20 | 45.50 |
3 | Robert | Johnson | 104 | 2023-01-25 | 210.25 |
NULL | NULL | NULL | 105 | 2023-01-28 | 55.00 |
Explanation
Notice that:
- All orders are included in the result
- Order 105 appears with NULL values for customer information since it has no associated customer
- The "right" table in this case is Orders
FULL JOIN
A FULL JOIN (or FULL OUTER JOIN) returns all records when there is a match in either the left or right table. If there's no match, NULL values are returned for columns from the table without a match.
FULL JOIN is not supported in all database systems. For example, MySQL doesn't directly support FULL JOIN, but you can simulate it using a combination of LEFT JOIN, UNION, and RIGHT JOIN.
Syntax
SELECT columns
FROM table1
FULL JOIN table2
ON table1.column = table2.column;
Example
Let's retrieve all customers and all orders, regardless of whether there's a match:
SELECT
c.customer_id,
c.first_name,
c.last_name,
o.order_id,
o.order_date,
o.total_amount
FROM
Customers c
FULL JOIN
Orders o
ON
c.customer_id = o.customer_id;
Result
customer_id | first_name | last_name | order_id | order_date | total_amount |
---|---|---|---|---|---|
1 | John | Doe | 101 | 2023-01-15 | 150.75 |
1 | John | Doe | 103 | 2023-01-20 | 45.50 |
2 | Jane | Smith | 102 | 2023-01-16 | 89.99 |
3 | Robert | Johnson | 104 | 2023-01-25 | 210.25 |
4 | Emily | Williams | NULL | NULL | NULL |
NULL | NULL | NULL | 105 | 2023-01-28 | 55.00 |
Explanation
Notice that:
- All customers and all orders are included in the result
- Emily appears with NULL values for order information
- Order 105 appears with NULL values for customer information