JavaScript Regular Expressions
Regular expressions (regex or regexp) are powerful patterns used to match character combinations in strings. In JavaScript, regular expressions are objects that provide a flexible way to search, extract, and manipulate text. They might look intimidating at first, but once you understand the basics, they become an invaluable tool in your programming toolkit.
What Are Regular Expressions?
A regular expression is a sequence of characters that forms a search pattern. You can use this pattern to match, search, or replace text. In JavaScript, regular expressions are created using forward slashes (/) as delimiters:
const pattern = /hello/;
This creates a regular expression that looks for the text "hello".
Creating Regular Expressions
There are two ways to create a regular expression in JavaScript:
1. Regular Expression Literal
const regex = /pattern/flags;
2. RegExp Constructor
const regex = new RegExp('pattern', 'flags');
Both methods create the same type of object, but the literal notation is preferred when the pattern is known at development time.
Example:
// Using literal notation
const regexLiteral = /cat/;
// Using constructor
const regexConstructor = new RegExp('cat');
// Both will match the string "cat"
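The constructor form is most useful when the pattern is only known at runtime, for example when it comes from user input. The sketch below assumes a hypothetical helper named escapeRegExp (it is not defined anywhere in this tutorial) that escapes characters with special meaning before the text is handed to new RegExp. Note that backslashes must be doubled inside a string passed to the constructor.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a case-insensitive pattern from a value that is only known at runtime.
function buildSearchPattern(term) {
  return new RegExp(escapeRegExp(term), "i");
}

const pricePattern = buildSearchPattern("$5.00");
console.log(pricePattern.source);               // \$5\.00
console.log(pricePattern.test("Total: $5.00")); // true
console.log(pricePattern.test("Total: $5x00")); // false

// In a string, "\\d" is needed to produce the pattern \d.
console.log(new RegExp("\\d+").source === /\d+/.source); // true
```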
Basic Pattern Matching
Let's start with simple pattern matching using the test() and exec() methods:
The test() Method
The test() method checks if a pattern exists in a string and returns true or false:
const str = "The cat sat on the mat";
const regex = /cat/;
console.log(regex.test(str)); // Output: true
const regex2 = /dog/;
console.log(regex2.test(str)); // Output: false
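One behavior worth knowing: when a regex has the g flag, test() remembers where the last match ended in its lastIndex property, so repeated calls on the same string can give different answers. A small sketch (the variable names are only illustrative):

```javascript
const sentence = "The cat sat on the mat";

// Without the g flag, every call starts searching from the beginning.
const plainPattern = /cat/;
console.log(plainPattern.test(sentence)); // true
console.log(plainPattern.test(sentence)); // true

// With the g flag, the regex keeps state between calls.
const globalPattern = /cat/g;
console.log(globalPattern.test(sentence)); // true  (match found at index 4)
console.log(globalPattern.lastIndex);      // 7     (the next search starts after "cat")
console.log(globalPattern.test(sentence)); // false (no "cat" after index 7)
console.log(globalPattern.lastIndex);      // 0     (reset after a failed match)
```

For a simple yes/no check, leaving out the g flag avoids this surprise.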
The exec() Method
The exec() method returns an array containing details about the match, or null if no match is found:
const str = "The cat sat on the mat";
const regex = /cat/;
const result = regex.exec(str);
console.log(result);
// Output: ["cat", index: 4, input: "The cat sat on the mat", groups: undefined]
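With the g flag, exec() can be called repeatedly to walk through every match, because each call continues from where the previous one stopped. A minimal sketch, using an illustrative pattern that finds three-letter words ending in "at":

```javascript
const text = "The cat sat on the mat";
const atWords = /[a-z]at/g;

const found = [];
let current;
// exec() returns null once there are no more matches, which ends the loop.
while ((current = atWords.exec(text)) !== null) {
  found.push({ text: current[0], index: current.index });
}

console.log(found);
// [ { text: 'cat', index: 4 }, { text: 'sat', index: 8 }, { text: 'mat', index: 19 } ]
```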
String Methods That Use Regular Expressions
JavaScript strings have several methods that accept regular expressions:
match()
Returns an array of matches (all of them when the g flag is set), or null if there are none:
const str = "The rain in Spain falls mainly in the plain";
const matches = str.match(/ain/g);
console.log(matches); // Output: ["ain", "ain", "ain"]
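The shape of the result depends on the g flag. The sketch below compares match() with and without it, and also shows matchAll(), which requires the g flag and yields one detailed match object per occurrence. The sample string is only illustrative.

```javascript
const phrase = "The rain in Spain";

// Without g: only the first match, with index and input details attached.
const first = phrase.match(/ain/);
console.log(first[0], first.index); // ain 5

// With g: every matched substring, but no position information.
console.log(phrase.match(/ain/g)); // [ 'ain', 'ain' ]

// matchAll() returns an iterator of full match objects.
const positions = [...phrase.matchAll(/ain/g)].map((m) => m.index);
console.log(positions); // [ 5, 14 ]

// When nothing matches, match() returns null rather than an empty array.
console.log(phrase.match(/snow/g)); // null
```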
search()
Returns the position of the first match:
const str = "The cat sat on the mat";
const position = str.search(/sat/);
console.log(position); // Output: 8
replace()
Replaces matches with a new string:
const str = "The cat sat on the mat";
const newStr = str.replace(/cat/g, "dog");
console.log(newStr); // Output: "The dog sat on the mat"
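Two details of replace() are easy to miss: without the g flag only the first match is replaced, and the replacement can be a function that computes a new value from each match. A short sketch with illustrative strings:

```javascript
// Without g, only the first occurrence is replaced.
console.log("cat cat cat".replace(/cat/, "dog"));  // dog cat cat

// With g, every occurrence is replaced.
console.log("cat cat cat".replace(/cat/g, "dog")); // dog dog dog

// A function receives each match and returns its replacement.
const capitalized = "the cat sat".replace(
  /[a-z]+/g,
  (word) => word[0].toUpperCase() + word.slice(1)
);
console.log(capitalized); // The Cat Sat
```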
split()
Splits the string at matches:
const str = "apple,banana,orange";
const fruits = str.split(/,/);
console.log(fruits); // Output: ["apple", "banana", "orange"]
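A regular expression is more useful than a plain string separator when the input uses several delimiters or inconsistent spacing. A minimal sketch, assuming a made-up input format:

```javascript
const rawList = "apple, banana;orange  |  kiwi";

// Split on a comma, semicolon, or pipe, ignoring whitespace around it.
const items = rawList.split(/\s*[,;|]\s*/);
console.log(items); // [ 'apple', 'banana', 'orange', 'kiwi' ]

// An optional second argument limits how many pieces are returned.
console.log("a,b,c".split(/,/, 2)); // [ 'a', 'b' ]
```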
Regular Expression Flags
Flags modify how the pattern matching works:
- g (global) - Find all matches rather than stopping after the first match
- i (case-insensitive) - Ignore case when matching
- m (multi-line) - Make the anchors ^ and $ match at the start and end of each line, not only of the whole string
- s (dot-all) - Allow . to match newline characters
- u (unicode) - Treat the pattern as a sequence of Unicode code points
- y (sticky) - Match only at the position given by the regex's lastIndex property
Example with flags:
const str = "The Cat sat on THE mat";
// Case-insensitive search
const regex1 = /cat/i;
console.log(regex1.test(str)); // Output: true
// Find all occurrences of 'the' (case-insensitive)
const regex2 = /the/ig;
console.log(str.match(regex2)); // Output: ["The", "THE"]
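The m, s, and y flags are easier to understand side by side with the same pattern run without them. The following is a small sketch on an illustrative two-line string:

```javascript
const lines = "first line\nsecond line";

// m: ^ also matches right after a line break.
console.log(lines.match(/^\w+/g));  // [ 'first' ]
console.log(lines.match(/^\w+/gm)); // [ 'first', 'second' ]

// s: the dot also matches newline characters.
console.log(/first.*second/.test(lines));  // false
console.log(/first.*second/s.test(lines)); // true

// y: the match must start exactly at lastIndex.
const sticky = /line/y;
sticky.lastIndex = 6;
console.log(sticky.test(lines)); // true  ("line" starts at index 6)
sticky.lastIndex = 0;
console.log(sticky.test(lines)); // false (index 0 holds "first", not "line")
```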
Special Characters in Regular Expressions
Regular expressions include special characters that have unique meanings:
Character Classes
- . - Matches any character except newlines
- \d - Matches any digit (0-9)
- \D - Matches any non-digit
- \w - Matches any word character (alphanumeric + underscore)
- \W - Matches any non-word character
- \s - Matches any whitespace character (space, tab, newline)
- \S - Matches any non-whitespace character
Example:
const str = "Year: 2023, Month: 05";
// Match all digits
const digits = str.match(/\d/g);
console.log(digits); // Output: ["2", "0", "2", "3", "0", "5"]
// Match all non-digits
const nonDigits = str.match(/\D/g);
console.log(nonDigits); // Output: ["Y", "e", "a", "r", ":", " ", ",", " ", "M", "o", "n", "t", "h", ":", " "]
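The other character classes work the same way, and they are usually combined with the + quantifier (one or more) to grab whole runs of characters. A brief sketch on a made-up log entry:

```javascript
const entry = "user_42 logged in\tat 09:15";

// \w+ grabs runs of letters, digits, and underscores.
console.log(entry.match(/\w+/g)); // [ 'user_42', 'logged', 'in', 'at', '09', '15' ]

// \s+ treats any run of whitespace (including the tab) as one separator.
console.log(entry.split(/\s+/));  // [ 'user_42', 'logged', 'in', 'at', '09:15' ]

// \d+ grabs runs of digits.
console.log(entry.match(/\d+/g)); // [ '42', '09', '15' ]

// The dot matches any single character except a newline.
console.log(/a.c/.test("abc"));   // true
console.log(/a.c/.test("a\nc"));  // false
```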
Character Sets
- [abc] - Matches any character in the brackets
- [^abc] - Matches any character not in the brackets
- [a-z] - Matches any character between a and z
Example:
const str = "The quick brown fox jumps over the lazy dog";
// Match any vowel
const vowels = str.match(/[aeiou]/g);
console.log(vowels);
// Output: ["e", "u", "i", "o", "o", "u", "o", "e", "e", "a", "o"]
// Match any character except lowercase vowels and whitespace
const nonVowels = str.match(/[^aeiou\s]/g);
console.log(nonVowels);
// Output: ["T", "h", "q", "c", "k", "b", "r", "w", "n", "f", "x", "j", "m", "p", "s", "v", "r", "t", "h", "l", "z", "y", "d", "g"]
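Ranges can be combined inside a single set, and the ^ character only negates when it comes first. A short sketch with an illustrative label string:

```javascript
const label = "Room B12, Floor 3";

// A single range.
console.log(label.match(/[A-Z]/g)); // [ 'R', 'B', 'F' ]
console.log(label.match(/[0-9]/g)); // [ '1', '2', '3' ]

// Several ranges in one set: any ASCII letter or digit.
console.log(label.match(/[A-Za-z0-9]/g).length); // 13

// Negated set: everything that is not an ASCII letter or digit.
console.log(label.match(/[^A-Za-z0-9]/g)); // [ ' ', ',', ' ', ' ' ]

// ^ is a literal character when it is not the first item in the set.
console.log("2^3".match(/[a^]/)[0]); // ^
```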
Quantifiers
- * - Matches 0 or more occurrences
- + - Matches 1 or more occurrences
- ? - Matches 0 or 1 occurrence
- {n} - Matches exactly n occurrences
- {n,} - Matches at least n occurrences
- {n,m} - Matches between n and m occurrences
Example:
const str = "hello, helloooo world!";
// Match 'hell' followed by 1 or more 'o's
const pattern = /hello+/g;
console.log(str.match(pattern));
// Output: ["hello", "helloooo"]
// Match 'he' followed by exactly 2 'l's
const exactPattern = /hel{2}/g;
console.log(str.match(exactPattern));
// Output: ["hell", "hell"]
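A quantifier applies only to the single item right before it, which is a common source of confusion. The sketch below runs through the remaining quantifiers on illustrative strings and shows how parentheses make a quantifier apply to several characters at once.

```javascript
// ? makes the preceding character optional.
console.log(/colou?r/.test("color"));  // true
console.log(/colou?r/.test("colour")); // true

// * allows zero or more of the preceding character.
console.log("a ab abbb".match(/ab*/g)); // [ 'a', 'ab', 'abbb' ]

// {n,m} and {n,} set a range of repetitions.
const sample = "a1 b22 c333 d4444";
console.log(sample.match(/\d{2,3}/g)); // [ '22', '333', '444' ]
console.log(sample.match(/\d{3,}/g));  // [ '333', '4444' ]

// The quantifier binds to one item: /ha{2}/ means "haa", /(ha){2}/ means "haha".
console.log(/ha{2}/.test("haha"));   // false
console.log(/(ha){2}/.test("haha")); // true
```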
Anchors
- ^ - Matches the start of a string
- $ - Matches the end of a string
Example:
// Match strings that start with 'Hello'
const startsWithHello = /^Hello/;
console.log(startsWithHello.test("Hello world")); // Output: true
console.log(startsWithHello.test("Say Hello")); // Output: false
// Match strings that end with an exclamation mark
const endsWithExclamation = /!$/;
console.log(endsWithExclamation.test("Hello!")); // Output: true
console.log(endsWithExclamation.test("Hello")); // Output: false
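Using both anchors together forces the pattern to describe the entire string, which is what most validation checks need. A minimal sketch (the pattern names are illustrative):

```javascript
// Unanchored: true if digits appear anywhere in the string.
const containsDigits = /\d+/;
// Anchored at both ends: true only if the whole string is digits.
const digitsOnly = /^\d+$/;

console.log(containsDigits.test("123a5")); // true
console.log(digitsOnly.test("123a5"));     // false
console.log(digitsOnly.test("12345"));     // true
console.log(digitsOnly.test(""));          // false (+ requires at least one digit)
```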
Grouping and Capturing
Parentheses () can be used to group parts of a pattern together and capture matched portions:
const str = "Today is 2023-05-21";
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = str.match(datePattern);
console.log(match);
/* Output:
[
  "2023-05-21", // Full match
  "2023", // First captured group (year)
  "05", // Second captured group (month)
  "21", // Third captured group (day)
  index: 9,
  input: "Today is 2023-05-21",
  groups: undefined
]
*/
// Accessing captured groups
console.log("Year:", match[1]); // Output: Year: 2023
console.log("Month:", match[2]); // Output: Month: 05
console.log("Day:", match[3]); // Output: Day: 21
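The groups property in the output above is undefined because the pattern has no named groups. As a sketch of the alternative, the version below gives each group a name with the (?<name>...) syntax, so the parts can be read by name instead of by position. It also shows a non-capturing group, (?:...), which groups without capturing. The variable names are illustrative.

```javascript
const note = "Today is 2023-05-21";
const namedDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;

// Named groups are collected on the groups property of the match.
const parsed = note.match(namedDate);
const { year, month, day } = parsed.groups;
console.log(year, month, day); // 2023 05 21

// In a replacement string, $<name> refers to a named group.
console.log(note.replace(namedDate, "$<day>/$<month>/$<year>"));
// Today is 21/05/2023

// (?:...) groups without capturing, so the month is group 1 here.
const monthOnly = "2023-05-21".match(/(?:\d{4})-(\d{2})/);
console.log(monthOnly[1]); // 05
```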
Real-World Examples
Let's look at some practical applications of regular expressions:
Example 1: Email Validation
function isValidEmail(email) {
  const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  return emailRegex.test(email);
}
console.log(isValidEmail("user@example.com")); // Output: true
console.log(isValidEmail("invalid-email")); // Output: false
console.log(isValidEmail("user@example")); // Output: false
Example 2: Password Strength Checker
function checkPasswordStrength(password) {
  // Check if password has at least 8 characters, one uppercase, one lowercase, one number, and one special character
  const strongPasswordRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*])[a-zA-Z\d!@#$%^&*]{8,}$/;
  if (strongPasswordRegex.test(password)) {
    return "Strong password";
  } else {
    return "Weak password - should be at least 8 characters with uppercase, lowercase, number, and special character";
  }
}
console.log(checkPasswordStrength("weakpass"));
// Output: "Weak password - should be at least 8 characters with uppercase, lowercase, number, and special character"
console.log(checkPasswordStrength("StrongP@ss1"));
// Output: "Strong password"
Example 3: URL Extraction
function extractURLs(text) {
  const urlRegex = /https?:\/\/[^\s]+/g;
  return text.match(urlRegex) || [];
}
const paragraph = "Visit our website at https://example.com or http://another-site.org for more information.";
console.log(extractURLs(paragraph));
// Output: ["https://example.com", "http://another-site.org"]
Example 4: Formatting Phone Numbers
function formatPhoneNumber(number) {
  // Remove all non-digits first
  const cleaned = number.replace(/\D/g, '');
  // Format as (XXX) XXX-XXXX
  const match = cleaned.match(/^(\d{3})(\d{3})(\d{4})$/);
  if (match) {
    return `(${match[1]}) ${match[2]}-${match[3]}`;
  }
  return "Invalid phone number";
}
console.log(formatPhoneNumber("1234567890")); // Output: "(123) 456-7890"
console.log(formatPhoneNumber("123-456-7890")); // Output: "(123) 456-7890"
console.log(formatPhoneNumber("(123) 456-7890")); // Output: "(123) 456-7890"
Common Mistakes and Pitfalls
- Greedy vs. Lazy Matching: By default, quantifiers are "greedy" - they match as much as possible. Adding a ? after a quantifier makes it "lazy" - matching as little as possible.
const str = "<div>Content</div>";
// Greedy (matches from first '<' to last '>')
const greedyRegex = /<.*>/;
console.log(str.match(greedyRegex)[0]); // Output: "<div>Content</div>"
// Lazy (matches minimal content)
const lazyRegex = /<.*?>/;
console.log(str.match(lazyRegex)[0]); // Output: "<div>"
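The difference matters most when collecting every match with the g flag. The sketch below uses an illustrative snippet of markup and also shows a negated character set, which is a common alternative to a lazy quantifier for this kind of pattern.

```javascript
const markup = "<b>bold</b> and <i>italic</i>";

// Greedy: one match that runs from the first '<' to the last '>'.
console.log(markup.match(/<.*>/g));
// [ '<b>bold</b> and <i>italic</i>' ]

// Lazy: each tag is matched on its own.
console.log(markup.match(/<.*?>/g));
// [ '<b>', '</b>', '<i>', '</i>' ]

// Negated set: "anything except '>'" gives the same result here.
console.log(markup.match(/<[^>]*>/g));
// [ '<b>', '</b>', '<i>', '</i>' ]
```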
- Escaping Special Characters: Special characters need to be escaped with a backslash if you want to match them literally:
// To match a literal dot
const regex = /example\.com/;
// To match a literal backslash
const filePath = /C:\\Program Files\\/;
- Performance Issues: Complex patterns or patterns with excessive backtracking can lead to performance problems:
// Nested quantifiers: this can be very slow on a long run of 'a's that ends in another character (catastrophic backtracking)
const badRegex = /^(a+)+$/;
// This is more efficient
const betterRegex = /^a+$/;
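Before swapping a slow pattern for a simpler one, it helps to confirm that both accept and reject the same inputs. The sketch below checks the two patterns above against a few short, illustrative strings. The inputs are kept short on purpose, so this only demonstrates agreement, not the slowdown itself.

```javascript
const nested = /^(a+)+$/;
const simple = /^a+$/;

// Short sample inputs: matching and non-matching cases.
const samples = ["a", "aaaa", "aab", "", "baa"];

// For each sample, check that both patterns give the same answer.
const agreement = samples.map((s) => nested.test(s) === simple.test(s));
console.log(agreement); // [ true, true, true, true, true ]
```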
Summary
Regular expressions provide a powerful way to work with strings in JavaScript. In this tutorial, we've covered:
- Creating regular expressions using literals and constructors
- Basic pattern matching methods like test() and exec()
- String methods that work with regular expressions: match(), search(), replace(), and split()
- Using flags to modify matching behavior
- Special characters and character classes
- Quantifiers and anchors
- Grouping and capturing
- Real-world examples showing practical applications
While regular expressions might seem complex at first, they become easier to understand with practice. Start with simple patterns and gradually build your understanding.
Exercises
- Write a regular expression that validates a username (3-16 characters, alphanumeric and underscores only).
- Create a function that extracts all hashtags from a social media post.
- Write a regular expression to validate a date in the format YYYY-MM-DD.
- Create a function that censors offensive words in a text (replace them with asterisks).
- Write a regular expression that checks if a string is a valid hexadecimal color code (e.g., #FFF or #F3A2B7).
Additional Resources
- MDN Web Docs: Regular Expressions
- RegExr - An online tool to learn, build, & test regular expressions
- Regular Expressions 101 - Another great testing and debugging tool
- JavaScript RegExp Reference - W3Schools reference guide
Regular expressions can be challenging, but they're incredibly powerful once mastered. Keep practicing to improve your pattern-matching skills!