Skip to main content

MySQL DISTINCT

Introduction

When querying a database, you'll often encounter situations where your result set contains duplicate values that you want to filter out. The DISTINCT keyword in MySQL is designed specifically for this purpose - it allows you to retrieve only unique values from your query results.

Understanding how to use DISTINCT effectively is an important skill for database manipulation, as it helps you get cleaner, more meaningful data from your queries without writing complex filtering logic.

Basic Syntax

The basic syntax for using the DISTINCT keyword is:

sql
SELECT DISTINCT column_name(s)
FROM table_name;

This simple addition to your SELECT statement instructs MySQL to return only unique values for the specified column(s).

Using DISTINCT with a Single Column

Let's start with a simple example. Suppose we have a students table with information about students and their departments:

sql
CREATE TABLE students (
id INT PRIMARY KEY,
name VARCHAR(50),
department VARCHAR(50),
age INT
);

INSERT INTO students VALUES
(1, 'John', 'Computer Science', 21),
(2, 'Sarah', 'Biology', 22),
(3, 'Mike', 'Computer Science', 20),
(4, 'Lisa', 'Mathematics', 21),
(5, 'David', 'Biology', 23),
(6, 'Anna', 'Computer Science', 22);

If we want to see all the departments without duplicates, we can use:

sql
SELECT DISTINCT department FROM students;

Output:

+-----------------+
| department |
+-----------------+
| Computer Science|
| Biology |
| Mathematics |
+-----------------+

Instead of showing 'Computer Science' three times (as it appears for three different students), DISTINCT ensures each department is listed only once.

Using DISTINCT with Multiple Columns

The DISTINCT keyword can also be used with multiple columns to get unique combinations of values across those columns:

sql
SELECT DISTINCT department, age FROM students;

Output:

+-----------------+-----+
| department | age |
+-----------------+-----+
| Computer Science| 21 |
| Biology | 22 |
| Computer Science| 20 |
| Mathematics | 21 |
| Biology | 23 |
| Computer Science| 22 |
+-----------------+-----+

This returns unique combinations of department and age. Even though we have duplicate values in both the department and age columns individually, the combination makes each row unique (except for any exact duplicates in the original data).

DISTINCT with NULL Values

It's important to understand how DISTINCT treats NULL values. In MySQL, NULL values are considered equal for the purpose of DISTINCT, meaning all NULL values are grouped together:

sql
-- Adding some NULL values to our table
INSERT INTO students VALUES
(7, 'James', NULL, 23),
(8, 'Emma', NULL, 22);

SELECT DISTINCT department FROM students;

Output:

+-----------------+
| department |
+-----------------+
| Computer Science|
| Biology |
| Mathematics |
| NULL |
+-----------------+

DISTINCT with COUNT Function

One common use case for DISTINCT is counting unique values using the COUNT() function:

sql
SELECT COUNT(DISTINCT department) AS unique_departments 
FROM students;

Output:

+-------------------+
| unique_departments|
+-------------------+
| 4 |
+-------------------+

This tells us there are 4 unique departments (including NULL) in our students table.

DISTINCT vs. GROUP BY

Sometimes, beginners confuse DISTINCT with GROUP BY as both can eliminate duplicates. Let's clarify the difference:

sql
-- Using DISTINCT
SELECT DISTINCT department FROM students;

-- Using GROUP BY
SELECT department FROM students GROUP BY department;

Both queries produce the same result in this simple case. However, GROUP BY is designed for aggregation operations, while DISTINCT is simply for removing duplicates. If you need to perform calculations like COUNT, SUM, AVG, etc., GROUP BY is generally more appropriate.

Practical Examples

Example 1: Finding Unique Customer Countries

For an e-commerce database, you might want to know which countries your customers are from:

sql
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(50),
country VARCHAR(50),
purchase_amount DECIMAL(10, 2)
);

INSERT INTO customers VALUES
(1, 'John Smith', 'USA', 120.50),
(2, 'Maria Garcia', 'Mexico', 89.99),
(3, 'Liu Wei', 'China', 45.00),
(4, 'Raj Patel', 'India', 75.25),
(5, 'Emma Johnson', 'USA', 55.75),
(6, 'Sophie Martin', 'France', 99.99),
(7, 'Ali Hassan', 'India', 150.00);

SELECT DISTINCT country FROM customers ORDER BY country;

Output:

+--------+
| country|
+--------+
| China |
| France |
| India |
| Mexico |
| USA |
+--------+

Example 2: Finding Unique Product Categories and Brands

In a product inventory system:

sql
CREATE TABLE products (
id INT PRIMARY KEY,
name VARCHAR(50),
category VARCHAR(50),
brand VARCHAR(50),
price DECIMAL(10, 2)
);

INSERT INTO products VALUES
(1, 'Laptop Pro', 'Electronics', 'TechBrand', 1299.99),
(2, 'Coffee Maker', 'Home Appliances', 'HomeLife', 89.50),
(3, 'Gaming Mouse', 'Electronics', 'GamerGear', 45.99),
(4, 'Desk Chair', 'Furniture', 'ComfortPlus', 199.99),
(5, 'Smartphone X', 'Electronics', 'TechBrand', 899.99),
(6, 'Blender', 'Home Appliances', 'KitchenPro', 79.95);

-- Find unique category-brand combinations
SELECT DISTINCT category, brand FROM products ORDER BY category, brand;

Output:

+-----------------+------------+
| category | brand |
+-----------------+------------+
| Electronics | GamerGear |
| Electronics | TechBrand |
| Furniture | ComfortPlus|
| Home Appliances | HomeLife |
| Home Appliances | KitchenPro |
+-----------------+------------+

Example 3: DISTINCT with WHERE Clause

You can combine DISTINCT with other SQL clauses like WHERE to filter before removing duplicates:

sql
-- Find unique departments for students over 21
SELECT DISTINCT department
FROM students
WHERE age > 21;

Output:

+-----------------+
| department |
+-----------------+
| Biology |
| Computer Science|
| NULL |
+-----------------+

Performance Considerations

While DISTINCT is useful, it comes with performance implications:

  1. Sorting overhead: Behind the scenes, MySQL typically needs to sort data to identify duplicates, which can be resource-intensive for large datasets.

  2. Alternative approaches: In some cases, using GROUP BY or creating an index might be more efficient.

  3. Use only when needed: Only include DISTINCT when you actually need to eliminate duplicates, as it adds processing overhead.

For large tables, you might want to examine the EXPLAIN output to understand how MySQL is processing your DISTINCT query:

sql
EXPLAIN SELECT DISTINCT department FROM students;

Common Mistakes

Mistake 1: Using DISTINCT unnecessarily

sql
-- Unnecessary use of DISTINCT with a PRIMARY KEY column
SELECT DISTINCT id FROM students; -- The 'id' column is already unique

Mistake 2: Expecting DISTINCT to work on expressions

In standard MySQL, you cannot use DISTINCT directly with expressions like:

sql
-- This won't work as expected in some MySQL versions
SELECT DISTINCT CONCAT(department, ' - ', name) FROM students;

-- Instead, use a subquery or derived table:
SELECT DISTINCT combined FROM
(SELECT CONCAT(department, ' - ', name) AS combined FROM students) AS temp;

Summary

The DISTINCT keyword is a powerful feature in MySQL that allows you to:

  • Eliminate duplicate values from query results
  • Work with single or multiple columns
  • Combine with other SQL features like aggregate functions
  • Get a clearer view of your data by removing noise from redundant entries

Remember these key points:

  • DISTINCT affects all columns in the SELECT list
  • For multiple columns, it looks for unique combinations
  • It treats NULL values as equals
  • There's a performance cost to using DISTINCT, so use it judiciously

Practice Exercises

  1. Create a table of employees with departments and job titles, then use DISTINCT to find unique job titles.

  2. Use DISTINCT with multiple columns to find unique combinations of department and job title.

  3. Count how many unique departments exist in your employees table.

  4. Compare the performance of a query with DISTINCT versus the same query using GROUP BY.

  5. Create a query that finds unique salary ranges using DISTINCT on a calculated column.

Additional Resources

Master the DISTINCT clause, and you'll be able to extract cleaner, more focused data from your MySQL databases with ease!



If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)