Skip to main content

MySQL UNION

Introduction

The MySQL UNION operator is a powerful feature that allows you to combine the results from multiple SELECT statements into a single result set. This capability is essential when you need to merge data from different tables that have similar structures or when you want to create comprehensive reports that pull information from various parts of your database.

In this tutorial, you'll learn:

  • What the UNION operator is and how it works
  • The syntax and rules for using UNION
  • The difference between UNION and UNION ALL
  • Practical examples of UNION in real-world scenarios
  • Best practices and common pitfalls

Understanding UNION

The UNION operator works by combining the results of two or more SELECT statements into a single result set. Think of it as stacking the results on top of each other (vertically), as opposed to joining tables side by side (horizontally) like with JOIN operations.

Key Rules for UNION

  1. The number of columns in each SELECT statement must be the same
  2. The corresponding columns must have compatible data types
  3. The column names from the first SELECT statement will be used as the column names in the result
  4. By default, UNION eliminates duplicate rows (use UNION ALL to keep duplicates)

Basic Syntax

sql
SELECT column1, column2, ... FROM table1
UNION
SELECT column1, column2, ... FROM table2;

UNION vs. UNION ALL

UNION

The UNION operator removes duplicate records from the combined result:

sql
SELECT column1, column2 FROM table1
UNION
SELECT column1, column2 FROM table2;

UNION ALL

The UNION ALL operator includes all rows, including duplicates, which makes it faster than UNION since it doesn't need to check for duplicates:

sql
SELECT column1, column2 FROM table1
UNION ALL
SELECT column1, column2 FROM table2;

Examples of UNION in Action

Let's work with some practical examples to understand how UNION works.

Example 1: Combining Customer Lists

Imagine you have two tables: online_customers and retail_customers. You want to create a single list of all customers:

sql
-- Create sample tables
CREATE TABLE online_customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
signup_date DATE
);

CREATE TABLE retail_customers (
customer_id INT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100),
first_purchase_date DATE
);

-- Insert sample data
INSERT INTO online_customers VALUES
(1, 'John Doe', '[email protected]', '2023-01-15'),
(2, 'Jane Smith', '[email protected]', '2023-02-20'),
(3, 'Robert Johnson', '[email protected]', '2023-03-10');

INSERT INTO retail_customers VALUES
(4, 'Mary Williams', '[email protected]', '2023-01-05'),
(5, 'James Brown', '[email protected]', '2023-02-12'),
(3, 'Robert Johnson', '[email protected]', '2023-04-15');

-- Using UNION to combine customer lists
SELECT customer_id, name, email, signup_date AS registration_date, 'Online' AS customer_type
FROM online_customers
UNION
SELECT customer_id, name, email, first_purchase_date, 'Retail'
FROM retail_customers
ORDER BY name;

Output:

+-------------+----------------+--------------------+------------------+---------------+
| customer_id | name | email | registration_date| customer_type |
+-------------+----------------+--------------------+------------------+---------------+
| 5 | James Brown | [email protected] | 2023-02-12 | Retail |
| 2 | Jane Smith | [email protected] | 2023-02-20 | Online |
| 1 | John Doe | [email protected] | 2023-01-15 | Online |
| 4 | Mary Williams | [email protected] | 2023-01-05 | Retail |
| 3 | Robert Johnson | [email protected] | 2023-03-10 | Online |
+-------------+----------------+--------------------+------------------+---------------+

Notice that Robert Johnson appears only once in the results, even though he exists in both tables. That's because UNION automatically removes duplicates.

Example 2: Using UNION ALL to Keep Duplicates

If we want to see all records including duplicates:

sql
-- Using UNION ALL to keep duplicates
SELECT customer_id, name, email, signup_date AS registration_date, 'Online' AS customer_type
FROM online_customers
UNION ALL
SELECT customer_id, name, email, first_purchase_date, 'Retail'
FROM retail_customers
ORDER BY name;

Output:

+-------------+----------------+--------------------+------------------+---------------+
| customer_id | name | email | registration_date| customer_type |
+-------------+----------------+--------------------+------------------+---------------+
| 5 | James Brown | [email protected] | 2023-02-12 | Retail |
| 2 | Jane Smith | [email protected] | 2023-02-20 | Online |
| 1 | John Doe | [email protected] | 2023-01-15 | Online |
| 4 | Mary Williams | [email protected] | 2023-01-05 | Retail |
| 3 | Robert Johnson | [email protected] | 2023-03-10 | Online |
| 3 | Robert Johnson | [email protected] | 2023-04-15 | Retail |
+-------------+----------------+--------------------+------------------+---------------+

Now Robert Johnson appears twice because we used UNION ALL.

Example 3: Combining Sales Data from Different Periods

Let's see how we can use UNION to create useful reports:

sql
-- Create sales tables
CREATE TABLE q1_sales (
order_id INT PRIMARY KEY,
product_name VARCHAR(100),
quantity INT,
amount DECIMAL(10,2),
sale_date DATE
);

CREATE TABLE q2_sales (
order_id INT PRIMARY KEY,
product_name VARCHAR(100),
quantity INT,
amount DECIMAL(10,2),
sale_date DATE
);

-- Insert sample data
INSERT INTO q1_sales VALUES
(101, 'Laptop', 2, 2400.00, '2023-01-15'),
(102, 'Mouse', 10, 250.00, '2023-02-20'),
(103, 'Keyboard', 5, 300.00, '2023-03-10');

INSERT INTO q2_sales VALUES
(201, 'Laptop', 3, 3600.00, '2023-04-05'),
(202, 'Monitor', 4, 800.00, '2023-05-12'),
(203, 'Headphones', 8, 560.00, '2023-06-22');

-- Generate a half-year sales report
SELECT product_name, SUM(quantity) AS total_quantity, SUM(amount) AS total_sales, 'Q1' AS quarter
FROM q1_sales
GROUP BY product_name
UNION
SELECT product_name, SUM(quantity) AS total_quantity, SUM(amount) AS total_sales, 'Q2' AS quarter
FROM q2_sales
GROUP BY product_name
ORDER BY product_name, quarter;

Output:

+-------------+----------------+-------------+---------+
| product_name| total_quantity | total_sales | quarter |
+-------------+----------------+-------------+---------+
| Headphones | 8 | 560.00 | Q2 |
| Keyboard | 5 | 300.00 | Q1 |
| Laptop | 2 | 2400.00 | Q1 |
| Laptop | 3 | 3600.00 | Q2 |
| Monitor | 4 | 800.00 | Q2 |
| Mouse | 10 | 250.00 | Q1 |
+-------------+----------------+-------------+---------+

Real-world Applications

Application 1: Compiling Reports Across Multiple Regions

sql
-- Sales data for different regions in different tables
SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'North' AS region
FROM north_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name

UNION

SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'South' AS region
FROM south_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name

UNION

SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'East' AS region
FROM east_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name

UNION

SELECT product_id, product_name, SUM(sales_amount) AS total_sales, 'West' AS region
FROM west_region_sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-06-30'
GROUP BY product_id, product_name

ORDER BY total_sales DESC, region;

Application 2: Search Functionality Across Multiple Tables

sql
-- Search for a term across customers, products, and suppliers tables
SELECT 'Customer' AS source, customer_id AS id, company_name AS name, city, country
FROM customers
WHERE company_name LIKE '%Computer%' OR contact_name LIKE '%Computer%'

UNION

SELECT 'Product' AS source, product_id AS id, product_name AS name, category_name AS city, supplier_name AS country
FROM products
JOIN categories ON products.category_id = categories.category_id
JOIN suppliers ON products.supplier_id = suppliers.supplier_id
WHERE product_name LIKE '%Computer%' OR description LIKE '%Computer%'

UNION

SELECT 'Supplier' AS source, supplier_id AS id, company_name AS name, city, country
FROM suppliers
WHERE company_name LIKE '%Computer%' OR contact_name LIKE '%Computer%'

ORDER BY source, name;

Best Practices and Tips

1. Column Compatibility

When using UNION, ensure that corresponding columns have compatible data types. MySQL will attempt to perform implicit type conversion when needed, but it's better to explicitly convert data types to avoid unexpected results:

sql
SELECT id, CAST(value AS CHAR) AS value_str FROM table1
UNION
SELECT id, name AS value_str FROM table2;

2. Performance Considerations

  • Use UNION ALL instead of UNION when you don't need to eliminate duplicates, as it's significantly faster
  • Consider creating indexes on columns used in the WHERE clauses of your SELECT statements
  • For large result sets, consider using temporary tables or materialized views

3. Column Naming and ORDER BY

The column names in a UNION result come from the first SELECT statement. You can use column aliases to make the output more readable:

sql
SELECT emp_id AS id, first_name AS name, 'Employee' AS type FROM employees
UNION
SELECT cust_id AS id, company_name AS name, 'Customer' AS type FROM customers
ORDER BY type, name;

Remember that the ORDER BY clause applies to the entire combined result set and should be placed at the end of the query.

Common Errors and Troubleshooting

1. Different Number of Columns

If you get the error "The used SELECT statements have a different number of columns," ensure that each SELECT statement in your UNION has the same number of columns:

sql
-- Wrong
SELECT id, name FROM employees
UNION
SELECT id, name, department FROM departments;

-- Correct
SELECT id, name, NULL AS department FROM employees
UNION
SELECT id, name, department FROM departments;

2. Incompatible Column Data Types

If you encounter issues with data types, use explicit conversion:

sql
SELECT id, CAST(price AS CHAR) AS price_info FROM products
UNION
SELECT id, description AS price_info FROM product_details;

Summary

The MySQL UNION operator is a powerful tool for combining results from multiple SELECT statements into a single result set. It's particularly useful when you need to:

  • Combine data from tables with similar structures
  • Create comprehensive reports from different data sources
  • Implement search functionality across multiple tables
  • Generate normalized output from denormalized data

Remember the key differences between UNION (removes duplicates) and UNION ALL (keeps all rows including duplicates), and ensure that your SELECT statements have the same number of columns with compatible data types.

Exercises

  1. Create two tables: current_employees and former_employees with columns for employee ID, name, position, and department. Use UNION to create a complete employee history report.

  2. Write a query that combines sales data from three different years (stored in separate tables) into a single report showing year-over-year growth.

  3. Implement a search feature that looks for a user-provided term in product names, descriptions, and categories, returning a unified list of matches.

  4. Create a query that combines active and archived customer records, but makes sure to flag duplicate entries where a customer appears in both tables.

Additional Resources

Happy coding!



If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)