MySQL DISTINCT
Introduction
When querying a database, you'll often encounter situations where your result set contains duplicate values that you want to filter out. The DISTINCT
keyword in MySQL is designed specifically for this purpose - it allows you to retrieve only unique values from your query results.
Understanding how to use DISTINCT
effectively is an important skill for database manipulation, as it helps you get cleaner, more meaningful data from your queries without writing complex filtering logic.
Basic Syntax
The basic syntax for using the DISTINCT
keyword is:
SELECT DISTINCT column_name(s)
FROM table_name;
This simple addition to your SELECT
statement instructs MySQL to return only unique values for the specified column(s).
Using DISTINCT with a Single Column
Let's start with a simple example. Suppose we have a students
table with information about students and their departments:
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(50),
department VARCHAR(50),
age INT
);
INSERT INTO students VALUES
(1, 'John', 'Computer Science', 21),
(2, 'Sarah', 'Biology', 22),
(3, 'Mike', 'Computer Science', 20),
(4, 'Lisa', 'Mathematics', 21),
(5, 'David', 'Biology', 23),
(6, 'Anna', 'Computer Science', 22);
If we want to see all the departments without duplicates, we can use:
SELECT DISTINCT department FROM students;
Output:
+-----------------+
| department |
+-----------------+
| Computer Science|
| Biology |
| Mathematics |
+-----------------+
Instead of showing 'Computer Science' three times (as it appears for three different students), DISTINCT
ensures each department is listed only once.
Using DISTINCT with Multiple Columns
The DISTINCT
keyword can also be used with multiple columns to get unique combinations of values across those columns:
SELECT DISTINCT department, age FROM students;
Output:
+-----------------+-----+
| department | age |
+-----------------+-----+
| Computer Science| 21 |
| Biology | 22 |
| Computer Science| 20 |
| Mathematics | 21 |
| Biology | 23 |
| Computer Science| 22 |
+-----------------+-----+
This returns unique combinations of department and age. Even though we have duplicate values in both the department and age columns individually, the combination makes each row unique (except for any exact duplicates in the original data).
DISTINCT with NULL Values
It's important to understand how DISTINCT
treats NULL values. In MySQL, NULL values are considered equal for the purpose of DISTINCT
, meaning all NULL values are grouped together:
-- Adding some NULL values to our table
INSERT INTO students VALUES
(7, 'James', NULL, 23),
(8, 'Emma', NULL, 22);
SELECT DISTINCT department FROM students;
Output:
+-----------------+
| department |
+-----------------+
| Computer Science|
| Biology |
| Mathematics |
| NULL |
+-----------------+
DISTINCT with COUNT Function
One common use case for DISTINCT
is counting unique values using the COUNT()
function:
SELECT COUNT(DISTINCT department) AS unique_departments
FROM students;
Output:
+-------------------+
| unique_departments|
+-------------------+
| 4 |
+-------------------+
This tells us there are 4 unique departments (including NULL) in our students table.
DISTINCT vs. GROUP BY
Sometimes, beginners confuse DISTINCT
with GROUP BY
as both can eliminate duplicates. Let's clarify the difference:
-- Using DISTINCT
SELECT DISTINCT department FROM students;
-- Using GROUP BY
SELECT department FROM students GROUP BY department;
Both queries produce the same result in this simple case. However, GROUP BY
is designed for aggregation operations, while DISTINCT
is simply for removing duplicates. If you need to perform calculations like COUNT, SUM, AVG, etc., GROUP BY
is generally more appropriate.
Practical Examples
Example 1: Finding Unique Customer Countries
For an e-commerce database, you might want to know which countries your customers are from:
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(50),
country VARCHAR(50),
purchase_amount DECIMAL(10, 2)
);
INSERT INTO customers VALUES
(1, 'John Smith', 'USA', 120.50),
(2, 'Maria Garcia', 'Mexico', 89.99),
(3, 'Liu Wei', 'China', 45.00),
(4, 'Raj Patel', 'India', 75.25),
(5, 'Emma Johnson', 'USA', 55.75),
(6, 'Sophie Martin', 'France', 99.99),
(7, 'Ali Hassan', 'India', 150.00);
SELECT DISTINCT country FROM customers ORDER BY country;
Output:
+--------+
| country|
+--------+
| China |
| France |
| India |
| Mexico |
| USA |
+--------+
Example 2: Finding Unique Product Categories and Brands
In a product inventory system:
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(50),
category VARCHAR(50),
brand VARCHAR(50),
price DECIMAL(10, 2)
);
INSERT INTO products VALUES
(1, 'Laptop Pro', 'Electronics', 'TechBrand', 1299.99),
(2, 'Coffee Maker', 'Home Appliances', 'HomeLife', 89.50),
(3, 'Gaming Mouse', 'Electronics', 'GamerGear', 45.99),
(4, 'Desk Chair', 'Furniture', 'ComfortPlus', 199.99),
(5, 'Smartphone X', 'Electronics', 'TechBrand', 899.99),
(6, 'Blender', 'Home Appliances', 'KitchenPro', 79.95);
-- Find unique category-brand combinations
SELECT DISTINCT category, brand FROM products ORDER BY category, brand;
Output:
+-----------------+------------+
| category | brand |
+-----------------+------------+
| Electronics | GamerGear |
| Electronics | TechBrand |
| Furniture | ComfortPlus|
| Home Appliances | HomeLife |
| Home Appliances | KitchenPro |
+-----------------+------------+
Example 3: DISTINCT with WHERE Clause
You can combine DISTINCT
with other SQL clauses like WHERE
to filter before removing duplicates:
-- Find unique departments for students over 21
SELECT DISTINCT department
FROM students
WHERE age > 21;
Output:
+-----------------+
| department |
+-----------------+
| Biology |
| Computer Science|
| NULL |
+-----------------+
Performance Considerations
While DISTINCT
is useful, it comes with performance implications:
-
Sorting overhead: Behind the scenes, MySQL typically needs to sort data to identify duplicates, which can be resource-intensive for large datasets.
-
Alternative approaches: In some cases, using
GROUP BY
or creating an index might be more efficient. -
Use only when needed: Only include
DISTINCT
when you actually need to eliminate duplicates, as it adds processing overhead.
For large tables, you might want to examine the EXPLAIN
output to understand how MySQL is processing your DISTINCT
query:
EXPLAIN SELECT DISTINCT department FROM students;
Common Mistakes
Mistake 1: Using DISTINCT unnecessarily
-- Unnecessary use of DISTINCT with a PRIMARY KEY column
SELECT DISTINCT id FROM students; -- The 'id' column is already unique
Mistake 2: Expecting DISTINCT to work on expressions
In standard MySQL, you cannot use DISTINCT
directly with expressions like:
-- This won't work as expected in some MySQL versions
SELECT DISTINCT CONCAT(department, ' - ', name) FROM students;
-- Instead, use a subquery or derived table:
SELECT DISTINCT combined FROM
(SELECT CONCAT(department, ' - ', name) AS combined FROM students) AS temp;
Summary
The DISTINCT
keyword is a powerful feature in MySQL that allows you to:
- Eliminate duplicate values from query results
- Work with single or multiple columns
- Combine with other SQL features like aggregate functions
- Get a clearer view of your data by removing noise from redundant entries
Remember these key points:
DISTINCT
affects all columns in the SELECT list- For multiple columns, it looks for unique combinations
- It treats NULL values as equals
- There's a performance cost to using
DISTINCT
, so use it judiciously
Practice Exercises
-
Create a table of employees with departments and job titles, then use
DISTINCT
to find unique job titles. -
Use
DISTINCT
with multiple columns to find unique combinations of department and job title. -
Count how many unique departments exist in your employees table.
-
Compare the performance of a query with
DISTINCT
versus the same query usingGROUP BY
. -
Create a query that finds unique salary ranges using
DISTINCT
on a calculated column.
Additional Resources
Master the DISTINCT
clause, and you'll be able to extract cleaner, more focused data from your MySQL databases with ease!
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)