JavaScript Regular Expressions

Regular expressions (regex or regexp) are powerful patterns used to match character combinations in strings. In JavaScript, regular expressions are objects that provide a flexible way to search, extract, and manipulate text. They might look intimidating at first, but once you understand the basics, they become an invaluable tool in your programming toolkit.

What Are Regular Expressions?

A regular expression is a sequence of characters that forms a search pattern. You can use this pattern to match, search, or replace text. In JavaScript, the most common way to write one is as a literal between forward slashes (/):

javascript
const pattern = /hello/;

This creates a regular expression that looks for the text "hello".

Creating Regular Expressions

There are two ways to create a regular expression in JavaScript:

1. Regular Expression Literal

javascript
const regex = /pattern/flags;

2. RegExp Constructor

javascript
const regex = new RegExp('pattern', 'flags');

Both methods create the same type of object, but the literal notation is preferred when the pattern is known at development time.

Example:

javascript
// Using literal notation
const regexLiteral = /cat/;

// Using constructor
const regexConstructor = new RegExp('cat');

// Both will match the string "cat"
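
The constructor form is mostly useful when the pattern is only known at runtime, for example when it comes from user input. The sketch below illustrates that case; `escapeRegExp` and `buildWordMatcher` are hypothetical helper names introduced here, not built-in functions.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex
// so that the input text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a case-insensitive matcher from a runtime string.
function buildWordMatcher(word) {
  return new RegExp(escapeRegExp(word), 'i');
}

const matcher = buildWordMatcher('c++');
console.log(matcher.test('I like C++ and Rust')); // Output: true
console.log(matcher.test('I like C and Rust'));   // Output: false
```

Without the escaping step, the `+` characters in the input would be read as quantifiers and the constructor would throw a SyntaxError.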

Basic Pattern Matching

Let's start with simple pattern matching using the test() and exec() methods:

The test() Method

The test() method checks if a pattern exists in a string and returns true or false:

javascript
const str = "The cat sat on the mat";
const regex = /cat/;

console.log(regex.test(str)); // Output: true

const regex2 = /dog/;
console.log(regex2.test(str)); // Output: false

The exec() Method

The exec() method returns an array containing details about the match, or null if no match is found:

javascript
const str = "The cat sat on the mat";
const regex = /cat/;

const result = regex.exec(str);
console.log(result);
// Output: ["cat", index: 4, input: "The cat sat on the mat", groups: undefined]
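
When the regex has the g flag, exec() remembers where the previous match ended (in the regex's lastIndex property), so calling it in a loop walks through every match. A minimal sketch, with variable names chosen for illustration:

```javascript
const text = "The cat sat on the mat";
const atWords = /[a-z]at/g; // any lowercase letter followed by "at"

const found = [];
let current;
// Each call resumes from atWords.lastIndex; exec() returns null when done.
while ((current = atWords.exec(text)) !== null) {
  found.push({ word: current[0], index: current.index });
}

console.log(found);
// Output: [ { word: "cat", index: 4 }, { word: "sat", index: 8 }, { word: "mat", index: 19 } ]
```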

String Methods That Use Regular Expressions

JavaScript strings have several methods that accept regular expressions:

match()

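Returns an array of matches, or null if nothing matches. With the g flag the array holds every match; without it, the array describes only the first match and its captured groups: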

javascript
const str = "The rain in Spain falls mainly in the plain";
const matches = str.match(/ain/g);

console.log(matches); // Output: ["ain", "ain", "ain", "ain"]

search()

Returns the position of the first match:

javascript
const str = "The cat sat on the mat";
const position = str.search(/sat/);

console.log(position); // Output: 8

replace()

Returns a new string with matches replaced. Only the first match is replaced unless the regex has the g flag:

javascript
const str = "The cat sat on the mat";
const newStr = str.replace(/cat/g, "dog");

console.log(newStr); // Output: "The dog sat on the mat"
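
The effect of the g flag on replace() is easier to see when the string contains several matches. The sketch below also shows that the replacement can be a function that receives each match; the variable names are illustrative only.

```javascript
const sentence = "cat, cat, and another cat";

// Without g, only the first match is replaced.
const firstOnly = sentence.replace(/cat/, "dog");
console.log(firstOnly); // Output: "dog, cat, and another cat"

// With g, every match is replaced.
const everyMatch = sentence.replace(/cat/g, "dog");
console.log(everyMatch); // Output: "dog, dog, and another dog"

// A function can compute the replacement from each match.
const shouted = sentence.replace(/cat/g, (word) => word.toUpperCase());
console.log(shouted); // Output: "CAT, CAT, and another CAT"
```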

split()

Splits the string at matches:

javascript
const str = "apple,banana,orange";
const fruits = str.split(/,/);

console.log(fruits); // Output: ["apple", "banana", "orange"]
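
A regex separator pays off when the delimiters are inconsistent. As a small sketch (the input string is made up for illustration), the pattern below accepts a comma or a semicolon with optional whitespace on either side:

```javascript
const messy = "apple , banana;orange ;  kiwi";

// \s* absorbs any whitespace around the comma or semicolon.
const cleanedFruits = messy.split(/\s*[,;]\s*/);

console.log(cleanedFruits); // Output: ["apple", "banana", "orange", "kiwi"]
```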

Regular Expression Flags

Flags modify how the pattern matching works:

  • g (global) - Find all matches rather than stopping after the first match
  • i (case-insensitive) - Ignore case when matching
  • m (multi-line) - Make ^ and $ match at the start and end of each line, not only at the start and end of the whole string
  • s (dot-all) - Allow . to match newline characters
  • u (unicode) - Treat the pattern as a sequence of unicode code points
  • y (sticky) - Matches only at the position stored in the regex's lastIndex property (0 initially, then the end of the previous match)

Example with flags:

javascript
const str = "The Cat sat on THE mat";

// Case-insensitive search
const regex1 = /cat/i;
console.log(regex1.test(str)); // Output: true

// Find all occurrences of 'the' (case-insensitive)
const regex2 = /the/ig;
console.log(str.match(regex2)); // Output: ["The", "THE"]
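
The m, s, and y flags are harder to picture than g and i, so here is a small sketch of each on a two-line string. The variable names are illustrative, and the s flag assumes an ES2018-capable engine.

```javascript
const lines = "first line\nsecond line";

// m: ^ matches at the start of every line instead of only the whole string.
const firstWordOnly = lines.match(/^\w+/g);
const firstWordPerLine = lines.match(/^\w+/gm);
console.log(firstWordOnly);    // Output: ["first"]
console.log(firstWordPerLine); // Output: ["first", "second"]

// s: the dot also matches the newline between the two lines.
const dotDefault = /line.second/.test(lines);
const dotAll = /line.second/s.test(lines);
console.log(dotDefault); // Output: false
console.log(dotAll);     // Output: true

// y: the match must start exactly at lastIndex.
const sticky = /line/y;
sticky.lastIndex = 6; // "line" starts at index 6
const stickyAt6 = sticky.test(lines);
sticky.lastIndex = 0; // index 0 holds "first", not "line"
const stickyAt0 = sticky.test(lines);
console.log(stickyAt6); // Output: true
console.log(stickyAt0); // Output: false
```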

Special Characters in Regular Expressions

Regular expressions include special characters that have unique meanings:

Character Classes

  • . - Matches any character except newlines
  • \d - Matches any digit (0-9)
  • \D - Matches any non-digit
  • \w - Matches any word character (alphanumeric + underscore)
  • \W - Matches any non-word character
  • \s - Matches any whitespace character (space, tab, newline)
  • \S - Matches any non-whitespace character

Example:

javascript
const str = "Year: 2023, Month: 05";

// Match all digits
const digits = str.match(/\d/g);
console.log(digits); // Output: ["2", "0", "2", "3", "0", "5"]

// Match all non-digits
const nonDigits = str.match(/\D/g);
console.log(nonDigits); // Output: ["Y", "e", "a", "r", ":", " ", ",", " ", "M", "o", "n", "t", "h", ":", " "]
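
The word and whitespace classes work the same way. A brief sketch on a made-up log entry (the `+` used here means "one or more" and is covered under Quantifiers):

```javascript
const entry = "user_42 logged\tin";

// \w+ grabs runs of letters, digits, and underscores.
const words = entry.match(/\w+/g);
console.log(words); // Output: ["user_42", "logged", "in"]

// \s matches the space and the tab.
const whitespaceCount = entry.match(/\s/g).length;
console.log(whitespaceCount); // Output: 2

// Without the s flag, the dot does not match a newline.
const dotMatchesNewline = /./.test("\n");
console.log(dotMatchesNewline); // Output: false
```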

Character Sets

  • [abc] - Matches any character in the brackets
  • [^abc] - Matches any character not in the brackets
  • [a-z] - Matches any character between a and z

Example:

javascript
const str = "The quick brown fox jumps over the lazy dog";

// Match any vowel
const vowels = str.match(/[aeiou]/g);
console.log(vowels);
// Output: ["e", "u", "i", "o", "o", "u", "o", "e", "e", "a", "o"]

// Match any character except lowercase vowels and whitespace
const nonVowels = str.match(/[^aeiou\s]/g);
console.log(nonVowels);
// Output: ["T", "h", "q", "c", "k", "b", "r", "w", "n", "f", "x", "j", "m", "p", "s", "v", "r", "t", "h", "l", "z", "y", "d", "g"]
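
Ranges and negation can be combined inside one set. The following sketch, using an invented sample string, picks out uppercase letters, digits, and whatever is left over:

```javascript
const label = "Room 3B, floor 12";

const capitals = label.match(/[A-Z]/g);
console.log(capitals); // Output: ["R", "B"]

const numerals = label.match(/[0-9]/g);
console.log(numerals); // Output: ["3", "1", "2"]

// Anything that is not a letter, a digit, or whitespace.
const punctuation = label.match(/[^a-zA-Z0-9\s]/g);
console.log(punctuation); // Output: [","]
```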

Quantifiers

  • * - Matches 0 or more occurrences
  • + - Matches 1 or more occurrences
  • ? - Matches 0 or 1 occurrence
  • {n} - Matches exactly n occurrences
  • {n,} - Matches at least n occurrences
  • {n,m} - Matches between n and m occurrences

Example:

javascript
const str = "hello, helloooo world!";

// Match 'hell' followed by 1 or more 'o's
const pattern = /hello+/g;
console.log(str.match(pattern));
// Output: ["hello", "helloooo"]

// Match 'he' followed by exactly 2 'l's
const exactPattern = /hel{2}/g;
console.log(str.match(exactPattern));
// Output: ["hell", "hell"]
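
The remaining quantifiers (?, {n,}, and {n,m}) can be sketched on a made-up sample string. Note one assumption: the patterns use \b, the word-boundary assertion, which this tutorial has not introduced; it keeps a number like 123456 from being matched in pieces.

```javascript
const sample = "color colour 7 42 2023 123456";

// ? makes the "u" optional.
const spellings = sample.match(/colou?r/g);
console.log(spellings); // Output: ["color", "colour"]

// {2,4}: whole numbers with 2 to 4 digits.
const shortNumbers = sample.match(/\b\d{2,4}\b/g);
console.log(shortNumbers); // Output: ["42", "2023"]

// {3,}: whole numbers with at least 3 digits.
const longNumbers = sample.match(/\b\d{3,}\b/g);
console.log(longNumbers); // Output: ["2023", "123456"]
```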

Anchors

  • ^ - Matches the start of a string
  • $ - Matches the end of a string

Example:

javascript
// Match strings that start with 'Hello'
const startsWithHello = /^Hello/;
console.log(startsWithHello.test("Hello world")); // Output: true
console.log(startsWithHello.test("Say Hello")); // Output: false

// Match strings that end with an exclamation mark
const endsWithExclamation = /!$/;
console.log(endsWithExclamation.test("Hello!")); // Output: true
console.log(endsWithExclamation.test("Hello")); // Output: false
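
Using both anchors together forces the pattern to cover the whole string, which is what most validation needs. A minimal sketch, assuming a five-digit code as the thing being validated:

```javascript
const containsFiveDigits = /\d{5}/; // matches anywhere in the string
const exactlyFiveDigits = /^\d{5}$/; // must match the entire string

console.log(containsFiveDigits.test("abc12345xyz")); // Output: true
console.log(exactlyFiveDigits.test("abc12345xyz"));  // Output: false
console.log(exactlyFiveDigits.test("12345"));        // Output: true
console.log(exactlyFiveDigits.test("123456"));       // Output: false
```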

Grouping and Capturing

Parentheses () can be used to group parts of a pattern together and capture matched portions:

javascript
const str = "Today is 2023-05-21";
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

const match = str.match(datePattern);
console.log(match);
/* Output:
[
"2023-05-21", // Full match
"2023", // First captured group (year)
"05", // Second captured group (month)
"21", // Third captured group (day)
index: 9,
input: "Today is 2023-05-21",
groups: undefined
]
*/

// Accessing captured groups
console.log("Year:", match[1]); // Output: Year: 2023
console.log("Month:", match[2]); // Output: Month: 05
console.log("Day:", match[3]); // Output: Day: 21
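
The groups property shown as undefined in the output above is filled in when the pattern uses named capture groups, written (?<name>...). This is a sketch assuming an ES2018-capable engine; the group names are chosen here for illustration.

```javascript
const namedDatePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const namedMatch = "Today is 2023-05-21".match(namedDatePattern);

// Named groups are available on match.groups as well as by index.
const { year, month, day } = namedMatch.groups;

console.log(year);          // Output: 2023
console.log(month);         // Output: 05
console.log(day);           // Output: 21
console.log(namedMatch[1]); // Output: 2023
```

Names make the code that consumes a match less fragile, since adding another group to the pattern no longer shifts the index of the existing ones.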

Real-World Examples

Let's look at some practical applications of regular expressions:

Example 1: Email Validation

javascript
function isValidEmail(email) {
  const emailRegex = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
  return emailRegex.test(email);
}

console.log(isValidEmail("user@example.com")); // Output: true
console.log(isValidEmail("invalid-email")); // Output: false
console.log(isValidEmail("user@example")); // Output: false

Example 2: Password Strength Checker

javascript
function checkPasswordStrength(password) {
  // Check if password has at least 8 characters, one uppercase, one lowercase, one number, and one special character
  const strongPasswordRegex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*])[a-zA-Z\d!@#$%^&*]{8,}$/;

  if (strongPasswordRegex.test(password)) {
    return "Strong password";
  } else {
    return "Weak password - should be at least 8 characters with uppercase, lowercase, number, and special character";
  }
}

console.log(checkPasswordStrength("weakpass"));
// Output: "Weak password - should be at least 8 characters with uppercase, lowercase, number, and special character"

console.log(checkPasswordStrength("StrongP@ss1"));
// Output: "Strong password"
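
A single lookahead-based pattern can only answer yes or no. If you want to tell the user which rule failed, one alternative is to test each rule with its own small regex. This is a sketch, not a replacement for the function above: `listPasswordProblems` is a hypothetical name, and unlike the original pattern it does not restrict which characters are allowed.

```javascript
// Hypothetical helper: return a description of every rule the password breaks.
function listPasswordProblems(password) {
  const rules = [
    { pattern: /.{8,}/, message: "at least 8 characters" },
    { pattern: /[a-z]/, message: "a lowercase letter" },
    { pattern: /[A-Z]/, message: "an uppercase letter" },
    { pattern: /\d/, message: "a number" },
    { pattern: /[!@#$%^&*]/, message: "a special character (!@#$%^&*)" },
  ];

  return rules
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.message);
}

console.log(listPasswordProblems("weakpass"));
// Output: ["an uppercase letter", "a number", "a special character (!@#$%^&*)"]

console.log(listPasswordProblems("StrongP@ss1"));
// Output: []
```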

Example 3: URL Extraction

javascript
function extractURLs(text) {
  const urlRegex = /https?:\/\/[^\s]+/g;
  return text.match(urlRegex) || [];
}

const paragraph = "Visit our website at https://example.com or http://another-site.org for more information.";
console.log(extractURLs(paragraph));
// Output: ["https://example.com", "http://another-site.org"]
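
Because [^\s]+ matches everything up to the next whitespace, a URL that is directly followed by a comma or a period picks up that punctuation. One possible refinement, sketched here with a hypothetical `extractCleanURLs` function, trims trailing sentence punctuation from each match:

```javascript
// Hypothetical variant of extractURLs that drops trailing punctuation.
function extractCleanURLs(text) {
  const urlRegex = /https?:\/\/[^\s]+/g;
  const matches = text.match(urlRegex) || [];
  // Strip punctuation that most likely belongs to the sentence, not the URL.
  return matches.map((url) => url.replace(/[.,;:!?)]+$/, ""));
}

const note = "Docs live at https://example.com/docs, mirrored at http://another-site.org.";
console.log(extractCleanURLs(note));
// Output: ["https://example.com/docs", "http://another-site.org"]
```

This is a heuristic: a URL that legitimately ends in one of these characters would be shortened as well.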

Example 4: Formatting Phone Numbers

javascript
function formatPhoneNumber(number) {
  // Remove all non-digits first
  const cleaned = number.replace(/\D/g, '');

  // Format as (XXX) XXX-XXXX
  const match = cleaned.match(/^(\d{3})(\d{3})(\d{4})$/);

  if (match) {
    return `(${match[1]}) ${match[2]}-${match[3]}`;
  }

  return "Invalid phone number";
}

console.log(formatPhoneNumber("1234567890")); // Output: "(123) 456-7890"
console.log(formatPhoneNumber("123-456-7890")); // Output: "(123) 456-7890"
console.log(formatPhoneNumber("(123) 456-7890")); // Output: "(123) 456-7890"

Common Mistakes and Pitfalls

  1. Greedy vs. Lazy Matching: By default, quantifiers are "greedy" - they match as much as possible. Adding a ? after a quantifier makes it "lazy" - matching as little as possible.
javascript
const str = "<div>Content</div>";

// Greedy (matches from first '<' to last '>')
const greedyRegex = /<.*>/;
console.log(str.match(greedyRegex)[0]); // Output: "<div>Content</div>"

// Lazy (matches minimal content)
const lazyRegex = /<.*?>/;
console.log(str.match(lazyRegex)[0]); // Output: "<div>"
  2. Escaping Special Characters: Special characters need to be escaped with a backslash if you want to match them literally:
javascript
// To match a literal dot
const regex = /example\.com/;

// To match a literal backslash
const filePath = /C:\\Program Files\\/;
  3. Performance Issues: Complex patterns or patterns with excessive backtracking can lead to performance problems:
javascript
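// Nested quantifiers can cause catastrophic backtracking: this pattern becomes
// very slow on a long run of 'a's followed by a character that prevents a match
const badRegex = /^(a+)+$/;

// This matches the same strings without the nested quantifier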
const betterRegex = /^a+$/;
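
The greedy versus lazy difference from the first pitfall becomes more visible when the string contains several tags and the g flag is used. The sketch below also shows a negated character set as a common alternative to a lazy quantifier; the sample string is made up for illustration.

```javascript
const html = "<b>one</b> and <b>two</b>";

// Greedy: .* runs to the last </b>, so everything ends up in one match.
const greedyMatches = html.match(/<b>.*<\/b>/g);
console.log(greedyMatches); // Output: ["<b>one</b> and <b>two</b>"]

// Lazy: .*? stops at the first </b> it can reach.
const lazyMatches = html.match(/<b>.*?<\/b>/g);
console.log(lazyMatches); // Output: ["<b>one</b>", "<b>two</b>"]

// Negated set: "anything except >" cannot run past the end of a tag.
const tags = html.match(/<[^>]*>/g);
console.log(tags); // Output: ["<b>", "</b>", "<b>", "</b>"]
```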

Summary

Regular expressions provide a powerful way to work with strings in JavaScript. In this tutorial, we've covered:

  1. Creating regular expressions using literals and constructors
  2. Basic pattern matching methods like test() and exec()
  3. String methods that work with regular expressions: match(), search(), replace(), and split()
  4. Using flags to modify matching behavior
  5. Special characters and character classes
  6. Quantifiers and anchors
  7. Grouping and capturing
  8. Real-world examples showing practical applications

While regular expressions might seem complex at first, they become easier to understand with practice. Start with simple patterns and gradually build your understanding.

Exercises

  1. Write a regular expression that validates a username (3-16 characters, alphanumeric and underscores only).
  2. Create a function that extracts all hashtags from a social media post.
  3. Write a regular expression to validate a date in the format YYYY-MM-DD.
  4. Create a function that censors offensive words in a text (replace them with asterisks).
  5. Write a regular expression that checks if a string is a valid hexadecimal color code (e.g., #FFF or #F3A2B7).

Additional Resources

Regular expressions can be challenging, but they're incredibly powerful once mastered. Keep practicing to improve your pattern-matching skills!


