Regex
Regex (regular expressions) is a way to search text with patterns. Use it to validate input, find and replace text, and so on.
Format
Start and end the expression with a forward slash (/). Everything between the slashes is the regex:
/cat/g
Expression flags
The g in /cat/g is an expression flag. Expression flags follow the closing slash and change how the regex search behaves. The expression flags are listed below:
- g: global
- i: case insensitive
- m: multiline
- s: single line
- u: unicode
- y: sticky
Generally, you will use the g flag the most.
By default, a regex pattern returns only the first result that it finds. To return all results, use the g expression flag.
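A minimal sketch of the difference, assuming JavaScript (its regex literals use the same /pattern/flags form as above). The sample string is made up for illustration:

```javascript
const text = "cat, bobcat, catalog";

// Without the g flag, match() stops at the first result.
const first = text.match(/cat/);

// With the g flag, match() returns every result.
const all = text.match(/cat/g);

console.log(first[0], first.index); // cat 0
console.log(all); // [ 'cat', 'cat', 'cat' ]
```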
Character sets
A character set matches any single character listed inside the square brackets. [bcf]at matches the following:
- bat
- cat
- fat
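As a quick check, this JavaScript sketch filters a hypothetical word list with the same character set:

```javascript
const words = ["bat", "cat", "fat", "hat", "mat"];

// Keep only the words whose first letter is b, c, or f.
const inSet = words.filter((word) => /[bcf]at/.test(word));

console.log(inSet); // [ 'bat', 'cat', 'fat' ]
```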
Ranges
Ranges match any single character, digit, or symbol within the range. [a-z]at matches any lowercase letter followed by at, such as bat or hat.
| Range type | Example |
|---|---|
| Partial | [a-f] or [g-p] |
| Capitalized | [A-Z] |
| Digit | [0-9] |
| Symbol | [#$%&@] |
| Mixed | [a-zA-Z0-9] |
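The following JavaScript sketch runs three of the range types against one made-up sample string to show which tokens each range accepts:

```javascript
const sample = "Cat hat 9at #at";

// Lowercase range: only "hat" starts with a lowercase letter.
const lower = sample.match(/[a-z]at/g);

// Mixed range: letters of either case and digits all qualify.
const mixed = sample.match(/[a-zA-Z0-9]at/g);

// Symbol set: only the listed symbols qualify.
const symbol = sample.match(/[#$%&@]at/g);

console.log(lower);  // [ 'hat' ]
console.log(mixed);  // [ 'Cat', 'hat', '9at' ]
console.log(symbol); // [ '#at' ]
```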
Repeating characters
Specify the number of repeating characters with the {val} syntax:
| Expression | Match |
|---|---|
| a{5} | aaaaa |
| [a-z]{4} | any four-letter lowercase word |
| [a-z]{6,} | any lowercase word with 6 or more letters |
| [a-z]{8,11} | any lowercase word with between 8 and 11 letters (inclusive) |
| [0-9]{11} | 11-digit number |
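A short JavaScript sketch of the {val} syntax. The sentence and the number are invented sample data:

```javascript
// Exactly five a characters.
const fiveAs = /a{5}/.test("aaaaa");

// Runs of six or more lowercase letters.
const longWords = "a regular expression is compact".match(/[a-z]{6,}/g);

// Exactly eleven digits in a row.
const elevenDigits = "call 12345678901 now".match(/[0-9]{11}/)[0];

console.log(fiveAs);       // true
console.log(longWords);    // [ 'regular', 'expression', 'compact' ]
console.log(elevenDigits); // 12345678901
```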
Metacharacters
Write compact regex with metacharacters:
| Expression | Match |
|---|---|
| \d | any digit, equal to [0-9] |
| \w | any word character: a letter, digit, or underscore |
| \W | any character that is not a word character |
| \s | any whitespace |
| \S | anything other than a whitespace character |
| \t | tab character |
Combine these with repeating characters:
- \w{5}: five-letter word or number
- \d{11}: 11-digit number
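To see the metacharacters in action, here is a small JavaScript sketch. The sample line is hypothetical and contains a tab between 66 and ships:

```javascript
const line = "Order 66\tships 2024";

const digits = line.match(/\d/g);     // every single digit
const words = line.match(/\w{5}/g);   // runs of five word characters
const whitespace = line.match(/\s/g); // spaces and the tab
const hasTab = /\t/.test(line);       // is there a tab character?

console.log(digits.join(""));   // 662024
console.log(words);             // [ 'Order', 'ships' ]
console.log(whitespace.length); // 3
console.log(hasTab);            // true
```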
Special characters
| Character | Description | Example |
|---|---|---|
| + | One or more of the preceding character. | c+at matches cat or ccccat |
| ? | Zero or one of the preceding character. | c?at matches cat or at |
| * | Zero or more of the preceding character. | c*at matches at or cat or cccat |
| \ | Escape character. Treats the next special character as a literal. | d\* matches d* |
| [^] | Negation. Do not match the characters after the ^ within the brackets. | b[^a]ld matches bold but not bald |
| . | Match any digit, letter, or symbol except newline. | .{8} matches any eight-character token. |
For example:
- .+ matches one or more characters, with no upper limit.
- [a-z]+ matches all lowercase words.
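The special characters can be tried out with a JavaScript sketch like this one. All input strings are illustrative:

```javascript
// + needs at least one c, so "at" alone fails.
const plus = /c+at/.test("at");

// ? makes the c optional, so "at" alone passes.
const optional = /c?at/.test("at");

// * takes as many c characters as it finds, including none.
const star = "cccat".match(/c*at/)[0];

// \ escapes the *, so the pattern looks for a literal "d*".
const escaped = /d\*/.test("d*");

// [^a] accepts any character except a.
const negated = ["bold", "bald"].filter((word) => /b[^a]ld/.test(word));

// . matches any character, so .{8} takes the first eight.
const dot = "password123".match(/.{8}/)[0];

console.log(plus, optional, escaped); // false true true
console.log(star);    // cccat
console.log(negated); // [ 'bold' ]
console.log(dot);     // password
```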
Groups
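Groups apply pattern matching to a section of the expression. Enclose the section in parentheses, and any special character that follows applies to the whole group. For example, book(ing)? matches book and booking.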
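A brief JavaScript sketch of the optional group, run against a made-up sentence:

```javascript
const sentence = "book a booking for the bookstore";

// The ? applies to the whole (ing) group, not only to the g.
const found = sentence.match(/book(ing)?/g);

console.log(found); // [ 'book', 'booking', 'book' ]
```

The last result comes from bookstore: the group is optional, so book matches on its own.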
Alternate characters
Use the pipe (|) character to match one or the other option. For example:
bat|bit
You can match the same values by enclosing the options in parentheses:
b(a|i)t
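Both forms can be compared side by side in JavaScript. This sketch assumes a sample string that also contains a non-matching word, bot:

```javascript
const text = "bat bit bot";

// Alternation between two whole words.
const plain = text.match(/bat|bit/g);

// Alternation limited to the middle letter.
const grouped = text.match(/b(a|i)t/g);

console.log(plain);   // [ 'bat', 'bit' ]
console.log(grouped); // [ 'bat', 'bit' ]
```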
Starting and ending characters
| Character | Description | Example |
|---|---|---|
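| ^ | Matches pattern at the start of the string. Place this at the beginning of the pattern. | ^cat matches the cat in cat nap but not in tomcat |
| $ | Matches pattern at the end of the string. Place this at the end of the pattern. | cat$ matches the cat in tomcat but not in cat nap |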
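Using both characters together is a common way to validate a whole string. The JavaScript sketch below assumes a hypothetical rule that a code must be exactly five digits:

```javascript
// ^ and $ together force the pattern to cover the entire string.
const fiveDigits = /^\d{5}$/;

const exact = fiveDigits.test("12345");    // five digits, nothing else
const tooLong = fiveDigits.test("123456"); // a sixth digit breaks the match
const padded = fiveDigits.test("a12345");  // extra leading character

// Without the anchors, the pattern matches anywhere inside the string.
const unanchored = /\d{5}/.test("a12345");

console.log(exact, tooLong, padded, unanchored); // true false false true
```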
Quantifiers
Quantifiers define how many times the preceding token must occur. The following list summarizes them along with related syntax:
- +: match one or more of the preceding token.
- ?: optional. Match zero or one of the preceding token.
- *: match zero or more of the preceding token.
- .: wildcard. Match anything except a newline.
- \.: search for a period.
- {min,max}: match between min and max of the preceding token. So /\w{4,5}/g matches any 4 or 5 consecutive word characters.
- [bc]at: matches bat or cat.
- [a-zA-Z]at: any word that ends in at and starts with an uppercase or lowercase letter. Works with numbers too ([0-9]).
- (): groups
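Two items from this list are easy to mix up in practice: the bare period and the escaped period. The JavaScript sketch below contrasts them and also runs the /\w{4,5}/g example on an invented sentence:

```javascript
const filename = "notes.txt";

// A bare period matches every character in the string.
const anyChar = filename.match(/./g).length;

// An escaped period matches only the literal period.
const periods = filename.match(/\./g);

// Runs of four or five consecutive word characters.
const fourOrFive = "The file notes.txt has a long name".match(/\w{4,5}/g);

console.log(anyChar);    // 9
console.log(periods);    // [ '.' ]
console.log(fourOrFive); // [ 'file', 'notes', 'long', 'name' ]
```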
Lookaheads
Read this Sitepoint article.