PostgreSQL Wildcards
Introduction
When working with databases, you often need to search for data that follows certain patterns rather than exact matches. For example, you might want to find all customers whose names start with "A" or all products that contain the word "deluxe." This is where wildcards come into play.
Wildcards are special characters that represent one or more unknown characters in a pattern matching operation. PostgreSQL provides several ways to perform pattern matching using wildcards, making your queries more flexible and powerful.
In this tutorial, we'll explore the different wildcard options in PostgreSQL and how to use them effectively in your queries.
LIKE Operator
The most common way to use wildcards in PostgreSQL is with the LIKE
operator. The LIKE
operator is case-sensitive and supports two wildcard characters:
%
(percent sign): Matches any sequence of zero or more characters_
(underscore): Matches exactly one character
Basic Syntax
SELECT column1, column2, ...
FROM table_name
WHERE columnN LIKE pattern;
Examples with the %
Wildcard
Let's assume we have a customers
table with a name
column:
Example 1: Find all customers whose names start with 'A'
SELECT name
FROM customers
WHERE name LIKE 'A%';
Output:
name
-------------
Anna Smith
Alex Johnson
Andrea Davis
Example 2: Find all customers whose names end with 'son'
SELECT name
FROM customers
WHERE name LIKE '%son';
Output:
name
----------------
Alex Johnson
James Peterson
Wilson Thompson
Example 3: Find all customers whose names contain 'am'
SELECT name
FROM customers
WHERE name LIKE '%am%';
Output:
name
-----------------
James Peterson
Samuel Roberts
Pamela Williams
Examples with the _
Wildcard
Example 1: Find all customers whose names have exactly 4 characters
SELECT name
FROM customers
WHERE name LIKE '____';
Output:
name
--------
Anna
Alex
Jane
Example 2: Find all products with a specific pattern
Let's say we have a products
table with product codes that follow a pattern like 'A-123':
SELECT product_code, product_name
FROM products
WHERE product_code LIKE '__-___';
Output:
product_code | product_name
--------------+--------------
AB-123 | Keyboard
CD-456 | Mouse
EF-789 | Headphones
Combining Wildcards
You can combine both wildcards in a single pattern:
SELECT name
FROM customers
WHERE name LIKE '_a%';
This will find all customers whose name has 'a' as the second letter.
Output:
name
-------------
James Peterson
Sarah Johnson
David Wilson
ILIKE Operator (Case-Insensitive)
If you want to perform case-insensitive pattern matching, PostgreSQL provides the ILIKE
operator, which works exactly like LIKE
but ignores case differences.
SELECT name
FROM customers
WHERE name ILIKE 'a%';
This will match names starting with either 'a' or 'A'.
Output:
name
-------------
Anna Smith
Alex Johnson
Andrea Davis
andrew brown
Escaping Wildcards
If you need to match the actual %
or _
characters in your data (rather than using them as wildcards), you can escape them using the backslash character \
.
SELECT product_name
FROM products
WHERE product_name LIKE '50\% discount%';
This will find products whose names start with '50% discount'.
Alternatively, you can specify a custom escape character:
SELECT product_name
FROM products
WHERE product_name LIKE '50#% discount%' ESCAPE '#';
NOT LIKE
To find records that don't match a pattern, use the NOT LIKE
operator:
SELECT name
FROM customers
WHERE name NOT LIKE 'A%';
This will find all customers whose names don't start with 'A'.
SIMILAR TO Operator
PostgreSQL also provides the SIMILAR TO
operator, which follows SQL standard pattern matching. It's more powerful than LIKE
because it supports additional metacharacters from regular expressions:
|
for alternation (either of two alternatives)*
for repetition of the previous item zero or more times+
for repetition of the previous item one or more times?
for repetition of the previous item zero or one time{m}
for repetition of the previous item exactly m times{m,}
for repetition of the previous item m or more times{m,n}
for repetition of the previous item at least m and not more than n times- Parentheses
()
for grouping items into a single logical item
SELECT name
FROM customers
WHERE name SIMILAR TO '(A|B)%';
This will find all customers whose names start with either 'A' or 'B'.
Output:
name
-------------
Anna Smith
Alex Johnson
Bob Williams
Barbara Brown
Regular Expressions (POSIX)
For the most powerful pattern matching, PostgreSQL supports POSIX regular expressions with the ~
(case-sensitive) and ~*
(case-insensitive) operators.
SELECT name
FROM customers
WHERE name ~ '^[AB].*$';
This achieves the same result as the previous example using regular expressions.
You can use !~
and !~*
for negation (records that don't match the pattern).
Performance Considerations
While wildcards are powerful, pattern matching operations can be slower than exact matches, especially on large tables. Consider these tips for better performance:
- Avoid patterns that start with a wildcard (like
'%text'
) as they prevent the use of standard B-tree indexes - For prefix searches (like
'text%'
), regular B-tree indexes can be used efficiently - For complex pattern matching needs, consider using full-text search features instead
- For frequently used patterns, especially with leading wildcards, consider creating specialized indexes like trigram indexes:
CREATE EXTENSION pg_trgm;
CREATE INDEX idx_trgm_name ON customers USING gin (name gin_trgm_ops);
Practical Examples
Example 1: Email Validation
Find all users with a valid email format:
SELECT user_id, email
FROM users
WHERE email SIMILAR TO '%@%.%';
Example 2: Finding Phone Numbers
Find all customers with phone numbers in a specific format:
SELECT customer_id, phone_number
FROM customers
WHERE phone_number SIMILAR TO '[0-9]{3}-[0-9]{3}-[0-9]{4}';
Example 3: Filtering Product Categories
Find products in certain categories:
SELECT product_name
FROM products
WHERE category LIKE 'Electronic%' OR category LIKE 'Computer%';
Example 4: Data Cleaning
Find potentially incorrect data entries:
SELECT product_id, product_name
FROM products
WHERE product_name ~ '[0-9]';
This finds product names containing digits, which might indicate data entry issues.
Summary
Wildcards in PostgreSQL provide powerful pattern matching capabilities that allow you to create flexible queries. The key points to remember are:
- Use
LIKE
with%
(multiple characters) and_
(single character) for basic pattern matching - Use
ILIKE
for case-insensitive matching - Use
SIMILAR TO
for SQL standard pattern matching with additional metacharacters - Use regular expression operators (
~
,~*
) for advanced pattern matching - Consider performance implications, especially with large datasets
Mastering wildcards will significantly enhance your ability to query and filter data in PostgreSQL, making your applications more powerful and user-friendly.
Exercises
-
Create a table with sample customer data and practice finding all customers whose:
- Names start with a vowel
- Email addresses are from gmail.com
- Phone numbers follow the format XXX-XXX-XXXX
-
Write a query to find all products in a database whose:
- Names contain numbers
- Descriptions mention either "premium" or "deluxe"
- SKU codes follow a specific pattern
-
Challenge: Write a query using regular expressions to validate email addresses properly (checking for @ symbol, domain part, etc.)
Additional Resources
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)