What is the difference between greedy and lazy matching?

Greedy quantifiers (*, +, ?) match as many characters as possible, while their lazy versions (*?, +?, ??) match as few as possible. For example, given '<b>hello</b> <b>world</b>', the pattern <b>.*</b> (greedy) matches the entire string including both tags, while <b>.*?</b> (lazy) matches only '<b>hello</b>'. Use lazy matching when you want the match to stop at the first point where the rest of the pattern can succeed.
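
The difference is easiest to see by running both forms against the same input. The sketch below uses Python's re module; the sample string and the <b> tag name are illustrative assumptions, not part of any particular document.

```python
import re

# Hypothetical sample text containing two tagged words.
text = "<b>hello</b> <b>world</b>"

# Greedy: .* runs to the end of the string, then backtracks to the LAST </b>.
greedy = re.search(r"<b>.*</b>", text).group(0)

# Lazy: .*? expands one character at a time and stops at the FIRST </b>.
lazy = re.search(r"<b>.*?</b>", text).group(0)

# With findall, the lazy form yields each tagged word separately.
each_word = re.findall(r"<b>(.*?)</b>", text)

print(greedy)     # <b>hello</b> <b>world</b>
print(lazy)       # <b>hello</b>
print(each_word)  # ['hello', 'world']
```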

How do I test regex patterns?

Online tools like regex101.com and regexr.com let you write, test, and debug patterns in real time with explanations of each component. Most programming languages also have built-in regex support you can use for quick tests: JavaScript's String.prototype.match() and RegExp.prototype.test(), Python's re module, and Java's Pattern class. Always test with both matching and non-matching inputs to verify that the pattern accepts what it should and rejects what it should not.
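
One way to test with both matching and non-matching inputs is a small table-driven check. This is a minimal sketch in Python; the pattern, the sample inputs, and the check helper are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern under test: three digits, a hyphen, four digits.
PATTERN = re.compile(r"\d{3}-\d{4}")

should_match = ["555-1234", "000-0000"]
should_not_match = ["5551234", "555-123", "abc-defg", "555-12345"]

def check(pattern, positives, negatives):
    """Return the inputs whose result differs from the expectation."""
    failures = []
    for s in positives:
        # fullmatch requires the whole string to match the pattern.
        if pattern.fullmatch(s) is None:
            failures.append(s)
    for s in negatives:
        if pattern.fullmatch(s) is not None:
            failures.append(s)
    return failures

failures = check(PATTERN, should_match, should_not_match)
print(failures)  # []
```

An empty list means every input behaved as expected. Any string that appears in the list is a case where the pattern is too strict or too loose.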

Why should I avoid using regex for HTML parsing?

HTML has nested structures that a regular expression, which describes a regular language, cannot reliably parse. Regex cannot match arbitrary nesting depth, and patterns quickly become brittle when faced with optional attributes, varying quote styles, self-closing tags, comments, and the many other variations of valid HTML. Use a proper HTML parser (like BeautifulSoup for Python, Cheerio for Node.js, or DOMParser in browsers) for HTML manipulation. Regex is acceptable for simple, well-defined patterns in HTML but not for general parsing.
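
The sketch below contrasts a naive regex with a real parser on the same input. To stay self-contained it uses Python's standard-library html.parser module rather than BeautifulSoup; the LinkCollector class and the sample markup are hypothetical.

```python
import re
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values from <a> start tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

# Sample markup: one commented-out link and one single-quoted attribute.
html = (
    '<div><a class="x" href="/one">1</a>'
    '<!-- <a href="/hidden">no</a> -->'
    "<a href='/two'>2</a></div>"
)

# Naive regex: picks up the commented-out link and misses the single-quoted one.
regex_links = re.findall(r'href="([^"]*)"', html)

# Parser: skips the comment and handles both quote styles.
collector = LinkCollector()
collector.feed(html)
parser_links = collector.links

print(regex_links)   # ['/one', '/hidden']
print(parser_links)  # ['/one', '/two']
```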

Regex Cheat Sheet: Essential Patterns Every Developer Needs

Regex Basics: Characters and Quantifiers

Regular expressions (regex) are patterns used to match character combinations in strings. The fundamentals: a dot (.) matches any single character except newline by default. \d matches any digit (0-9), \w matches any word character (letters, digits, underscore), and \s matches any whitespace character. Quantifiers control how many times the preceding element may repeat: * means zero or more, + means one or more, ? means zero or one, {n} means exactly n times, and {n,m} means between n and m times. For example, \d{3}-\d{4} matches patterns like 555-1234 (three digits, a hyphen, four digits). The caret (^) anchors to the start of a string and the dollar sign ($) anchors to the end; in multiline mode they anchor to the start and end of each line instead.
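
A few of these basics, tried out in Python's re module. The sample strings are made up for illustration; other regex engines behave the same way for these constructs but differ in API.

```python
import re

# \d{3}-\d{4}: three digits, a hyphen, four digits.
numbers = re.findall(r"\d{3}-\d{4}", "Call 555-1234 or 555-9876.")

# \w+: runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", "foo_bar 42!")

# \s+: split on runs of whitespace (spaces, tabs, newlines).
parts = re.split(r"\s+", "a  b\tc")

# ?: the preceding element is optional (zero or one).
spellings = re.findall(r"colou?r", "color colour")

# ^ and $ require the pattern to span the whole string.
anchored_ok = re.search(r"^\d{3}-\d{4}$", "555-1234") is not None
anchored_bad = re.search(r"^\d{3}-\d{4}$", "x555-1234") is not None

# By default the dot does not match a newline.
dot_newline = re.search(r"a.c", "a\nc")

print(numbers)                    # ['555-1234', '555-9876']
print(words)                      # ['foo_bar', '42']
print(parts)                      # ['a', 'b', 'c']
print(spellings)                  # ['color', 'colour']
print(anchored_ok, anchored_bad)  # True False
print(dot_newline)                # None
```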

Character Classes and Groups

Square brackets define character classes: [aeiou] matches any lowercase vowel, [0-9] matches any digit, [A-Za-z] matches any ASCII letter. A caret inside brackets negates the class: [^0-9] matches any non-digit character. Parentheses create capture groups: (\d{3})-(\d{4}) captures the three-digit prefix and the four-digit number separately. The pipe symbol (|) acts as OR: cat|dog matches either word. Non-capturing groups (?:pattern) group without capturing, which is useful for applying quantifiers to complex patterns without storing the match. Backreferences like \1 refer to the first captured group, enabling patterns like (\w+)\s\1 to find repeated words. Adding word boundaries, as in \b(\w+)\s+\1\b, avoids false matches inside longer words, such as the "is is" hidden in "this is".
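
The constructs above can be sketched in Python as follows. The input strings are hypothetical; the last two lines show why a repeated-word pattern benefits from word boundaries.

```python
import re

# Capture groups: each parenthesized part is available separately.
prefix, line = re.search(r"(\d{3})-(\d{4})", "555-1234").groups()

# Character class and negated class.
vowels = re.findall(r"[aeiou]", "regex")
digits_only = re.sub(r"[^0-9]", "", "(555) 123-4567")

# Alternation: either word.
pets = re.findall(r"cat|dog", "cat, dog, cow")

# Non-capturing group: + applies to "ab" as a unit, nothing is stored.
runs = re.findall(r"(?:ab)+", "ababab ab")

# Backreference: \1 must repeat exactly what group 1 captured.
repeated = re.findall(r"\b(\w+)\s+\1\b", "this is is a test test")

# Without \b, the "is" inside "this" pairs with the following "is".
false_hit = re.search(r"(\w+)\s\1", "this is a test").group(0)

print(prefix, line)  # 555 1234
print(vowels)        # ['e', 'e']
print(digits_only)   # 5551234567
print(pets)          # ['cat', 'dog']
print(runs)          # ['ababab', 'ab']
print(repeated)      # ['is', 'test']
print(false_hit)     # is is
```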

Common Real-World Patterns

Email validation (basic): ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

URL matching (HTTP and HTTPS): https?://[\w.-]+(?:/[\w._~:/?#@!$&'()*+,;=-]*)?

Phone numbers (US): ^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$ handles formats like (555) 123-4567, 555-123-4567, and 5551234567.

IPv4 address: ^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$

Date (YYYY-MM-DD): ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$

These patterns handle common formats but may need adjustment for edge cases. The date pattern, for example, checks the shape of the date but still accepts impossible dates such as 2024-02-30.
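
As a rough sketch of how these patterns might be used, the Python snippet below compiles each one and tries it on a few sample inputs. The PATTERNS dictionary and the is_valid helper are hypothetical names. The phone pattern is written with escaped parentheses, \(? and \)?, which is the form needed to accept input like (555) 123-4567.

```python
import re

PATTERNS = {
    "email": re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"),
    "url": re.compile(r"https?://[\w.-]+(?:/[\w._~:/?#@!$&'()*+,;=-]*)?"),
    "us_phone": re.compile(r"^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$"),
    "ipv4": re.compile(
        r"^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}"
        r"(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$"
    ),
    "date": re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$"),
}

def is_valid(kind, text):
    """Return True if the named pattern matches somewhere in text."""
    return PATTERNS[kind].search(text) is not None

print(is_valid("email", "user@example.com"))   # True
print(is_valid("email", "user@localhost"))     # False
print(is_valid("us_phone", "(555) 123-4567"))  # True
print(is_valid("us_phone", "555-1234"))        # False
print(is_valid("ipv4", "192.168.1.1"))         # True
print(is_valid("ipv4", "256.1.1.1"))           # False
print(is_valid("date", "2024-13-01"))          # False
print(is_valid("date", "2024-02-30"))          # True (shape only, not a real date)

# The URL pattern is unanchored, so it can pull a URL out of running text.
found_url = PATTERNS["url"].search("see https://example.com/path?q=1 now").group(0)
print(found_url)  # https://example.com/path?q=1
```

When a value must be a real date or a routable address, a follow-up check with a dedicated parser (for example the datetime or ipaddress modules) is usually more reliable than tightening the regex.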

Lookaheads, Lookbehinds, and Advanced Features

Lookaheads assert that a pattern follows (positive: (?=pattern)) or does not follow (negative: (?!pattern)) the current position without consuming characters. For example, \d+(?= dollars) matches digits followed by ' dollars' but does not include ' dollars' in the match. Lookbehinds work the same way in reverse: (?<=\$)\d+ matches digits preceded by a dollar sign. A common use is password validation: (?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[@$!%*?&]).{8,} requires at least one uppercase letter, one lowercase letter, one digit, one special character from the set @$!%*?&, and a minimum of 8 characters. Anchor it with ^ and $ when the whole string must satisfy the rule. Lazy quantifiers (*? and +?) match as few characters as possible, unlike their greedy counterparts.
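
A short Python sketch of the lookaround examples above. The sample strings are invented, and the password pattern is shown here with ^ and $ anchors added so that it validates the whole string.

```python
import re

# Positive lookahead: digits followed by " dollars", which is not consumed.
amounts = re.findall(r"\d+(?= dollars)", "20 dollars, 30 euros")

# Negative lookahead: "foo" only when NOT followed by "bar".
not_bar = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# Positive lookbehind: digits preceded by a dollar sign.
prices = re.findall(r"(?<=\$)\d+", "cost: $45 or 45 units")

# Password rule: each lookahead checks one requirement from the start.
PASSWORD = re.compile(
    r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[@$!%*?&]).{8,}$"
)

def is_strong(password):
    """Return True if the password satisfies every lookahead and the length."""
    return PASSWORD.search(password) is not None

print(amounts)                    # ['20']
print(not_bar)                    # [7]
print(prices)                     # ['45']
print(is_strong("Passw0rd!"))     # True
print(is_strong("password"))      # False (no uppercase, digit, or special)
print(is_strong("Sh0rt!a"))       # False (only 7 characters)
print(is_strong("NoSpecial123"))  # False (no special character)
```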
