MySQL UNION
Introduction
The MySQL UNION
operator is a powerful feature that allows you to combine the results from multiple SELECT
statements into a single result set. This capability is essential when you need to merge data from different tables that have similar structures or when you want to create comprehensive reports that pull information from various parts of your database.
In this tutorial, you'll learn:
- What the
UNION
operator is and how it works - The syntax and rules for using
UNION
- The difference between
UNION
andUNION ALL
- Practical examples of
UNION
in real-world scenarios - Best practices and common pitfalls
Understanding UNION
The UNION
operator works by combining the results of two or more SELECT
statements into a single result set. Think of it as stacking the results on top of each other (vertically), as opposed to joining tables side by side (horizontally) like with JOIN
operations.
Key Rules for UNION
- The number of columns in each
SELECT
statement must be the same - The corresponding columns must have compatible data types
- The column names from the first
SELECT
statement will be used as the column names in the result - By default,
UNION
eliminates duplicate rows (useUNION ALL
to keep duplicates)
Basic Syntax
SELECT column1, column2, ... FROM table1
UNION
SELECT column1, column2, ... FROM table2;
UNION vs. UNION ALL
UNION
The UNION
operator removes duplicate records from the combined result:
SELECT column1, column2 FROM table1
UNION
SELECT column1, column2 FROM table2;
UNION ALL
The UNION ALL
operator includes all rows, including duplicates, which makes it faster than UNION
since it doesn't need to check for duplicates:
SELECT column1, column2 FROM table1
UNION ALL
SELECT column1, column2 FROM table2;
Examples of UNION in Action
Let's work with some practical examples to understand how UNION
works.
Example 1: Combining Customer Lists
Imagine you have two tables: online_customers
and retail_customers
. You want to create a single list of all customers:
-- Create sample tables
CREATE TABLE online_customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
signup_date DATE
);
CREATE TABLE retail_customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
first_purchase_date DATE
);
-- Insert sample data
INSERT INTO online_customers VALUES
(1, 'John Doe', '[email protected]', '2023-01-15'),
(2, 'Jane Smith', '[email protected]', '2023-02-20'),
(3, 'Robert Johnson', '[email protected]', '2023-03-10');
INSERT INTO retail_customers VALUES
(4, 'Mary Williams', '[email protected]', '2023-01-05'),
(5, 'James Brown', '[email protected]', '2023-02-12'),
(3, 'Robert Johnson', '[email protected]', '2023-04-15');
-- Using UNION to combine customer lists
SELECT customer_id, name, email, signup_date AS registration_date, 'Online' AS customer_type
FROM online_customers
UNION
SELECT customer_id, name, email, first_purchase_date, 'Retail'
FROM retail_customers
ORDER BY name;
Output:
+-------------+----------------+--------------------+------------------+---------------+
| customer_id | name | email | registration_date| customer_type |
+-------------+----------------+--------------------+------------------+---------------+
| 5 | James Brown | [email protected] | 2023-02-12 | Retail |
| 2 | Jane Smith | [email protected] | 2023-02-20 | Online |
| 1 | John Doe | [email protected] | 2023-01-15 | Online |
| 4 | Mary Williams | [email protected] | 2023-01-05 | Retail |
| 3 | Robert Johnson | [email protected] | 2023-03-10 | Online |
+-------------+----------------+--------------------+------------------+---------------+
Notice that Robert Johnson appears only once in the results, even though he exists in both tables. That's because UNION
automatically removes duplicates.
Example 2: Using UNION ALL to Keep Duplicates
If we want to see all records including duplicates:
-- Using UNION ALL to keep duplicates
SELECT customer_id, name, email, signup_date AS registration_date, 'Online' AS customer_type
FROM online_customers
UNION ALL
SELECT customer_id, name, email, first_purchase_date, 'Retail'
FROM retail_customers
ORDER BY name;
Output:
+-------------+----------------+--------------------+------------------+---------------+
| customer_id | name | email | registration_date| customer_type |
+-------------+----------------+--------------------+------------------+---------------+
| 5 | James Brown | [email protected] | 2023-02-12 | Retail |
| 2 | Jane Smith | [email protected] | 2023-02-20 | Online |
| 1 | John Doe | [email protected] | 2023-01-15 | Online |
| 4 | Mary Williams | [email protected] | 2023-01-05 | Retail |
| 3 | Robert Johnson | [email protected] | 2023-03-10 | Online |
| 3 | Robert Johnson | [email protected] | 2023-04-15 | Retail |
+-------------+----------------+--------------------+------------------+---------------+
Now Robert Johnson appears twice because we used UNION ALL
.
Example 3: Combining Sales Data from Different Periods
Let's see how we can use UNION
to create useful reports:
-- Create sales tables
CREATE TABLE q1_sales (
order_id INT PRIMARY KEY,
product_name VARCHAR(100),
quantity INT,
amount DECIMAL(10,2),
sale_date DATE
);
CREATE TABLE q2_sales (
order_id INT PRIMARY KEY,
product_name VARCHAR(100),
quantity INT,
amount DECIMAL(10,2),
sale_date DATE
);
-- Insert sample data
INSERT INTO q1_sales VALUES
(101, 'Laptop', 2, 2400.00, '2023-01-15'),
(102, 'Mouse', 10, 250.00, '2023-02-20'),
(103, 'Keyboard', 5, 300.00, '2023-03-10');
INSERT INTO q2_sales VALUES
(201, 'Laptop', 3, 3600.00, '2023-04-05'),
(202, 'Monitor', 4, 800.00, '2023-05-12'),
(203, 'Headphones', 8, 560.00, '2023-06-22');
-- Generate a half-year sales report
SELECT product_name, SUM(quantity) AS total_quantity, SUM(amount) AS total_sales, 'Q1' AS quarter
FROM q1_sales
GROUP BY product_name
UNION
SELECT product_name, SUM(quantity) AS total_quantity, SUM(amount) AS total_sales, 'Q2' AS quarter
FROM q2_sales
GROUP BY product_name
ORDER BY product_name, quarter;
Output:
+-------------+----------------+-------------+---------+
| product_name| total_quantity | total_sales | quarter |
+-------------+----------------+-------------+---------+
| Headphones | 8 | 560.00 | Q2 |
| Keyboard | 5 | 300.00 | Q1 |
| Laptop | 2 | 2400.00 | Q1 |
| Laptop | 3 | 3600.00 | Q2 |
| Monitor | 4 | 800.00 | Q2 |
| Mouse | 10 | 250.00 | Q1 |
+-------------+----------------+-------------+---------+
Real-world Applications
Application 1: Compiling Reports Across Multiple Regions
-- Sales data for different regions in different tables
SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'North' AS region
FROM north_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name
UNION
SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'South' AS region
FROM south_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name
UNION
SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'East' AS region
FROM east_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name
UNION
SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'West' AS region
FROM west_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name
ORDER BY total_sales DESC, region;
Application 2: Search Functionality Across Multiple Tables
-- Search for a term across customers, products, and suppliers tables
SELECT 'Customer' AS source, customer_id AS id, company_name AS name, city, country
FROM customers
WHERE company_name LIKE '%Computer%' OR contact_name LIKE '%Computer%'
UNION
SELECT 'Product' AS source, product_id AS id, product_name AS name, category_name AS city, supplier_name AS country
FROM products
JOIN categories ON products.category_id = categories.category_id
JOIN suppliers ON products.supplier_id = suppliers.supplier_id
WHERE product_name LIKE '%Computer%' OR description LIKE '%Computer%'
UNION
SELECT 'Supplier' AS source, supplier_id AS id, company_name AS name, city, country
FROM suppliers
WHERE company_name LIKE '%Computer%' OR contact_name LIKE '%Computer%'
ORDER BY source, name;
Best Practices and Tips
1. Column Compatibility
When using UNION
, ensure that corresponding columns have compatible data types. MySQL will attempt to perform implicit type conversion when needed, but it's better to explicitly convert data types to avoid unexpected results:
SELECT id, CAST(value AS CHAR) AS value_str FROM table1
UNION
SELECT id, name AS value_str FROM table2;
2. Performance Considerations
- Use
UNION ALL
instead ofUNION
when you don't need to eliminate duplicates, as it's significantly faster - Consider creating indexes on columns used in the
WHERE
clauses of yourSELECT
statements - For large result sets, consider using temporary tables or materialized views
3. Column Naming and ORDER BY
The column names in a UNION
result come from the first SELECT
statement. You can use column aliases to make the output more readable:
SELECT emp_id AS id, first_name AS name, 'Employee' AS type FROM employees
UNION
SELECT cust_id AS id, company_name AS name, 'Customer' AS type FROM customers
ORDER BY type, name;
Remember that the ORDER BY
clause applies to the entire combined result set and should be placed at the end of the query.
Common Errors and Troubleshooting
1. Different Number of Columns
If you get the error "The used SELECT statements have a different number of columns," ensure that each SELECT
statement in your UNION
has the same number of columns:
-- Wrong
SELECT id, name FROM employees
UNION
SELECT id, name, department FROM departments;
-- Correct
SELECT id, name, NULL AS department FROM employees
UNION
SELECT id, name, department FROM departments;
2. Incompatible Column Data Types
If you encounter issues with data types, use explicit conversion:
SELECT id, CAST(price AS CHAR) AS price_info FROM products
UNION
SELECT id, description AS price_info FROM product_details;
Summary
The MySQL UNION
operator is a powerful tool for combining results from multiple SELECT
statements into a single result set. It's particularly useful when you need to:
- Combine data from tables with similar structures
- Create comprehensive reports from different data sources
- Implement search functionality across multiple tables
- Generate normalized output from denormalized data
Remember the key differences between UNION
(removes duplicates) and UNION ALL
(keeps all rows including duplicates), and ensure that your SELECT
statements have the same number of columns with compatible data types.
Exercises
-
Create two tables:
current_employees
andformer_employees
with columns for employee ID, name, position, and department. UseUNION
to create a complete employee history report. -
Write a query that combines sales data from three different years (stored in separate tables) into a single report showing year-over-year growth.
-
Implement a search feature that looks for a user-provided term in product names, descriptions, and categories, returning a unified list of matches.
-
Create a query that combines active and archived customer records, but makes sure to flag duplicate entries where a customer appears in both tables.
Additional Resources
- MySQL Documentation on UNION
- SQL Joins vs. UNION: Understanding the Difference
- Performance Optimization for UNION Queries
Happy coding!
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)