What is a regular expression?
A regular expression (regex) is a pattern that describes a set of strings. You write a pattern once, and a regex engine tests it against text to find matches, validate formats, or extract parts.
Regex support is built into most mainstream languages: JavaScript's /pattern/flags literals, Python's re.match(), Java's Pattern.compile(). The core syntax is largely the same everywhere, with minor differences between engines.
Anatomy of a pattern
The engine scans the input from left to right. At each starting position it tries to match each part of the pattern in sequence; when every part matches, it reports a match. The cheat sheet below covers the syntax you will use most often.
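The left-to-right scan can be observed directly. The following is a minimal sketch in Python, not a description of any engine's internals; the sample text and the pattern are made up for illustration.

```python
import re

# Hypothetical sample: "cat" followed by one or more digits.
text = "concat cat7 cat42"
pattern = re.compile(r"cat\d+")

# search() tries start positions from left to right and stops at the first success.
# The "cat" inside "concat" fails because no digit follows it.
first = pattern.search(text)
first_hit = (first.group(), first.start())

# finditer() resumes scanning after each match to collect the remaining ones.
all_hits = [(m.group(), m.start()) for m in pattern.finditer(text)]

print(first_hit)
print(all_hits)
```

The first reported match is the leftmost one, even though a longer match ("cat42") exists further right.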
Syntax reference
Character classes
| Syntax | Meaning |
|---|---|
| . | Any character except newline |
| \d | Digit: [0-9] |
| \D | Non-digit: [^0-9] |
| \w | Word character: [a-zA-Z0-9_] |
| \W | Non-word character |
| \s | Whitespace: space, tab, newline |
| \S | Non-whitespace |
| [abc] | Any one of: a, b, or c |
| [^abc] | Any character except a, b, or c |
| [a-z] | Any lowercase letter a through z |
| [a-zA-Z0-9] | Any letter or digit |
Anchors
| Syntax | Meaning |
|---|---|
| ^ | Start of string (or line with m flag) |
| $ | End of string (or line with m flag) |
| \b | Word boundary |
| \B | Non-word boundary |
| \A | Start of string (Python/Ruby, not JS) |
| \Z | End of string (Python/Ruby, not JS) |
Quantifiers
| Syntax | Meaning |
|---|---|
| * | Zero or more (greedy) |
| + | One or more (greedy) |
| ? | Zero or one (greedy) |
| {3} | Exactly 3 times |
| {3,} | Three or more times |
| {3,6} | Between 3 and 6 times |
| *? | Zero or more (lazy: as few as possible) |
| +? | One or more (lazy) |
| ?? | Zero or one (lazy) |
Groups and capturing
| Syntax | Meaning |
|---|---|
| (abc) | Capture group: captures "abc" |
| (?:abc) | Non-capturing group: matches but does not capture |
| (?<name>abc) | Named capture group |
| \1 | Backreference to capture group 1 |
| \k<name> | Backreference to named group |

Alternation: a|b matches a or b.
Lookahead and lookbehind
| Syntax | Meaning |
|---|---|
| (?=abc) | Positive lookahead: followed by "abc" |
| (?!abc) | Negative lookahead: not followed by "abc" |
| (?<=abc) | Positive lookbehind: preceded by "abc" |
| (?<!abc) | Negative lookbehind: not preceded by "abc" |
Escaping
| Syntax | Meaning |
|---|---|
| \. | Literal dot (escapes the special meaning of .) |
| \* | Literal asterisk |
| \( | Literal opening parenthesis |
| \\ | Literal backslash |
| \n | Newline character |
| \t | Tab character |
Flags
Flags modify how the engine interprets the pattern. In JavaScript, append them after the closing slash: /pattern/gi.
| Flag | Name | Effect |
|---|---|---|
| g | Global | Find all matches, not just the first |
| i | Case-insensitive | Treat uppercase and lowercase as equal |
| m | Multiline | Make ^ and $ match start/end of each line |
| s | Dotall | Make . match newlines too |
| u | Unicode | Enable full Unicode matching (recommended) |
| y | Sticky | Match only at the current position (JS) |
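The table uses JavaScript's flag letters. As a rough sketch of how i, m, and s behave, here is the same idea in Python, where the flags are named re.IGNORECASE, re.MULTILINE, and re.DOTALL. Python has no g flag; re.findall() and re.finditer() return all matches instead. The log text is invented for the example.

```python
import re

log = "Error: disk full\nerror: retry\nWARN: ok"

# i + m: case-insensitive "error" at the start of any line.
per_line = re.findall(r"^error", log, flags=re.IGNORECASE | re.MULTILINE)

# i only: ^ matches just once, at the start of the whole string.
string_only = re.findall(r"^error", log, flags=re.IGNORECASE)

# s: with DOTALL the dot can cross the line break between "full" and "error".
without_s = re.search(r"full.error", log)
with_s = re.search(r"full.error", log, flags=re.DOTALL)

print(per_line, string_only, without_s, with_s.group())
```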
Common patterns
Ready-to-use patterns for the most common validation tasks. Copy and test them in the Regex Tester before using in production.
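One way to test a pattern is to run it against a few inputs that should pass and a few that should fail. The sketch below does this in Python for the date, hex color, and IPv4 patterns from this section. The helper is_ipv4 is a hypothetical addition, not part of the pattern list: the IPv4 regex checks only the shape, so the range check on each octet is done in code.

```python
import re

# Patterns copied from this section.
DATE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")
HEX_COLOR = re.compile(r"^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")
IPV4 = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")


def is_ipv4(value: str) -> bool:
    """Hypothetical helper: regex for the shape, then a 0-255 check per octet."""
    # fullmatch() also rejects a trailing newline, which $ alone tolerates in Python.
    if not IPV4.fullmatch(value):
        return False
    return all(int(octet) <= 255 for octet in value.split("."))


# The date pattern validates the format, not the calendar.
date_ok = DATE.fullmatch("2024-02-30") is not None   # passes: format is valid
date_bad = DATE.fullmatch("2024-13-01") is not None  # fails: month 13

color_ok = HEX_COLOR.fullmatch("#1a2B3c") is not None
color_bad = HEX_COLOR.fullmatch("#12345") is not None  # 5 digits: neither 3 nor 6

shape_only = IPV4.fullmatch("999.1.1.1") is not None   # the regex alone accepts this
print(date_ok, date_bad, color_ok, color_bad, shape_only, is_ipv4("999.1.1.1"))
```

The two-step check in is_ipv4 keeps the regex readable; a pattern that range-checks octets on its own is possible but considerably longer.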
Email
^[\w.-]+@[\w.-]+\.[a-zA-Z]{2,}$
URL (http/https)
https?:\/\/[\w.-]+(?:\.[\w.]+)+[\w.,@?^=%&:/~+#-]*
IPv4 address
^(?:\d{1,3}\.){3}\d{1,3}$
Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Time (HH:MM)
^([01]\d|2[0-3]):[0-5]\d$
Hex color
^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
UUID
^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$
Slug (URL-safe)
^[a-z0-9]+(?:-[a-z0-9]+)*$
Integer
^-?\d+$
Decimal number
^-?\d+(\.\d+)?$
Frequently asked questions
What is the difference between greedy and lazy quantifiers?
Greedy quantifiers (*, +, ?) match as many characters as possible. Lazy quantifiers (*?, +?, ??) match as few characters as possible. For example, on "<b>bold</b> and <i>italic</i>", the greedy <.*> matches the entire string, while the lazy <.*?> matches just "<b>".
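The example above can be reproduced in a few lines. This is a sketch in Python using the same input string; the negated character class at the end is a common alternative to the lazy form and is added here only for comparison.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs to the end of the string, then backtracks to the last ">".
greedy = re.search(r"<.*>", html).group()

# Lazy: .*? stops at the first ">" that lets the match succeed.
lazy = re.search(r"<.*?>", html).group()

# All tags, found with the lazy form and with a negated character class.
lazy_tags = re.findall(r"<.*?>", html)
class_tags = re.findall(r"<[^>]*>", html)

print(greedy)
print(lazy)
print(lazy_tags)
```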
What is the difference between a capture group and a non-capturing group?
Both (abc) and (?:abc) match the same text, but (abc) stores the match so you can reference it later (in a replacement string as $1, or via match[1] in code). (?:abc) just groups for structure or alternation without the memory overhead. Use non-capturing groups when you do not need the value.
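A short sketch of the difference, written in Python. Note one assumption about syntax: Python's replacement strings use \1 where JavaScript uses $1. The sample date and sentence are invented.

```python
import re

date = "2024-03-15"

# Capturing groups are numbered from 1 in the order their "(" appears.
m = re.match(r"(\d{4})-(\d{2})-(\d{2})", date)
year, month, day = m.group(1), m.group(2), m.group(3)

# The non-capturing group still has to match, but it is not stored.
m2 = re.match(r"(?:\d{4})-(\d{2})", date)
stored = m2.groups()

# Captured values can be reused in a replacement string...
swapped = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)

# ...or inside the pattern itself, as a backreference (here: a doubled word).
doubled = re.search(r"\b(\w+) \1\b", "it was was late").group(1)

print(year, month, day, stored, swapped, doubled)
```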
Why does my dot (.) not match newlines?
By default, . matches any character except a newline. Enable the s (dotall) flag to make . match newlines too. In JavaScript: /pattern/s. Alternatively, use [\s\S] as a workaround in environments that do not support the s flag.
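The same behavior can be checked in Python, where the s flag is spelled re.DOTALL. This is a minimal sketch with a made-up two-line string.

```python
import re

text = "start\nend"

# Default: the dot refuses to match the newline, so there is no match.
default = re.search(r"start.end", text)

# Dotall: the dot matches the newline as well.
dotall = re.search(r"start.end", text, flags=re.DOTALL)

# Workaround without the flag: [\s\S] matches any character at all.
workaround = re.search(r"start[\s\S]end", text)

print(default, dotall.group() == text, workaround.group() == text)
```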
What is a word boundary (\b)?
\b matches the position between a word character (\w) and a non-word character, or between a word character and the start or end of the string. For example, \bcat\b matches "cat" in "the cat sat" but not in "concatenate". It is a zero-width assertion: it matches a position, not a character.
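A quick sketch in Python of the cases described above; the third input, with punctuation around the word, is an added illustration.

```python
import re

word = re.compile(r"\bcat\b")

# Surrounded by spaces: boundaries on both sides.
in_sentence = word.search("the cat sat")

# Inside a longer word: letters on both sides, so no boundary and no match.
in_word = word.search("concatenate")

# Punctuation and string edges also count as boundaries; "bobcat" does not match.
mixed = word.findall("cat, bobcat, cat!")

print(in_sentence.span(), in_word, mixed)
```

The match span covers only the three letters, which shows that \b itself consumes no characters.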
Is regex the same in JavaScript, Python, and other languages?
The core syntax is shared, but there are differences. Python uses (?P<name>) for named groups; JavaScript uses (?<name>). JavaScript does not support atomic groups or possessive quantifiers. Python's re.VERBOSE mode allows comments in patterns. Always test your pattern in the language you will actually use.
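The two Python-specific features mentioned above can be combined in one pattern. The sketch below rewrites a 24-hour time pattern with Python's named-group syntax and re.VERBOSE; the group names hour and minute are chosen for illustration.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows # comments.
# Named groups use Python's (?P<name>...) spelling, not JavaScript's (?<name>...).
TIME = re.compile(
    r"""
    ^(?P<hour>[01]\d|2[0-3])   # hours 00-23
    :                          # literal separator
    (?P<minute>[0-5]\d)$       # minutes 00-59
    """,
    re.VERBOSE,
)

ok = TIME.match("09:45")
parts = ok.groupdict()
bad = TIME.match("24:00")

print(parts, bad)
```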