Regular expressions are special strings that represent a search pattern.
Also known as "regex" or "regexp", they help programmers match, search, and replace text.
Regular expressions can appear cryptic because a few characters have special meaning.
The goal is to combine the symbols and text into a pattern that matches what we want, and only what we want.
This guide will cover the characters, a few shortcuts, and the common uses for writing regular expressions.
Regular expressions are used in programming languages to match parts of strings. We create patterns to help us do that matching.
If we want to find the word "the" in the string "The dog chased the cat", we could use the following regular expression: /the/.
Notice that quote marks are not required within the regular expression.
JavaScript has multiple ways to use regexes. One way to test a regex is using the .test() method.
The .test() method takes the regex, applies it to a string (which is placed inside the parentheses),
and returns true if the pattern finds something and false if it does not.
Example
let testStr = "freeCodeCamp";
let testRegex = /Code/;
testRegex.test(testStr); // Returns true
In the .test() example above [2][1], we searched for the word "Code" using the regular expression /Code/.
That regex searched for a literal match of the string "Code".
Here's another example searching for a literal match of the string "Kevin"
let testStr = "Hello, my name is Kevin.";
let testRegex = /Kevin/;
testRegex.test(testStr); // Returns true
Any other forms of "Kevin" will not match.
For example, the regex /Kevin/ will not match "kevin" or "KEVIN".
let wrongRegex = /kevin/;
wrongRegex.test(testStr); // Returns false
Using regexes like /coding/, we can look for the pattern "coding" in another string.
This is a powerful way to search single strings, but it is limited to only one pattern.
We can search for multiple patterns using the alternation or OR operator: " | ".
This operator matches patterns either before or after it.
For example, if we wanted to match "yes" or "no", the regex we want is /yes|no/.
We can also search for more than just two patterns.
We can do this by adding more patterns with more OR operators separating them, like /yes|no|maybe/.
Example
let petString = "James has a pet cat.";
let petRegex = /dog|cat|bird|fish/;
let result = petRegex.test(petString);
console.log(result); // Returns true
So far, we have only been checking whether a pattern exists within a string.
We can also extract the actual matches we found with the .match() method.
To use the .match() method, apply the method on a string and pass in the regex inside the parentheses.
Example
"Hello, World!".match(/Hello/); // Returns ["Hello"]
let ourStr = "Regular expressions";
let ourRegex = /expressions/;
ourStr.match(ourRegex); // Returns ["expressions"]
let extractStr = "Extract the word 'coding' from this string.";
let codingRegex = /coding/;
let result = extractStr.match(codingRegex);
console.log(result); // Returns ["coding"]
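One detail worth keeping in mind: .match() returns null, not an empty array, when nothing matches, so reading .length on the result can throw. A minimal sketch of a guard follows; the helper name countMatches is hypothetical and not part of JavaScript.

```javascript
// Hypothetical helper: count how many times a global regex matches.
function countMatches(text, regex) {
  const matches = text.match(regex); // array of matches, or null when nothing matches
  return matches === null ? 0 : matches.length;
}

const found = countMatches("Extract the word 'coding' from this string.", /coding/g);
const missing = countMatches("No such word here.", /coding/g);
console.log(found);   // 1
console.log(missing); // 0
```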
Above we've looked at regexes to do literal matches of strings. But sometimes, we might want to also match case differences.
Case (or sometimes letter case) is the difference between UPPERCASE letters and lowercase letters.
Examples of UPPERCASE are "A", "B", and "C".
Examples of lowercase are "a", "b", and "c".
We can match both cases using what is called a flag.
There are other flags but here we'll focus on the flag that ignores case - the i flag.
We can use it by appending it to the regex.
An example of using this flag is /ignorecase/i.
This regex can match the strings "ignorecase", "igNoreCase", and "IgnoreCase".
Example
let myString = "freeCodeCamp";
let fccRegex = /fReecOdEcAmp/i;
let result = fccRegex.test(myString);
console.log(result); // Returns true
So far, we have only been able to extract or search a pattern once.
Example
let testStr = "Repeat, Repeat, Repeat";
let ourRegex = /Repeat/;
testStr.match(ourRegex); // Returns ["Repeat"]
To search or extract a pattern more than once, we can use the g flag.
Example
let repeatRegex = /Repeat/g;
testStr.match(repeatRegex); // Returns ["Repeat", "Repeat", "Repeat"]
Note
We can have multiple flags on our regex, like /search/gi.
Example
let twinkleStar = "Twinkle, twinkle, little star";
let starRegex = /TwInKlE/gi; // i ignores UPPERCASE/lowercase, g finds every repeat.
let result = twinkleStar.match(starRegex);
console.log(result); // Returns ["Twinkle", "twinkle"]
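A caution when combining the g flag with .test(): a global regex remembers where its last match ended (in its lastIndex property), so calling .test() repeatedly on the same regex object can alternate between true and false. The sketch below illustrates this; the variable names are only for illustration.

```javascript
const globalRegex = /Repeat/g;
const word = "Repeat";

// First call matches at index 0 and leaves lastIndex at 6.
const firstCall = globalRegex.test(word);
// Second call starts searching at index 6, finds nothing, and resets lastIndex to 0.
const secondCall = globalRegex.test(word);
// Third call starts from index 0 again.
const thirdCall = globalRegex.test(word);

console.log(firstCall, secondCall, thirdCall); // true false true
```

For a simple yes/no check, leaving off the g flag avoids this behavior.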
Sometimes we won't (or don't need to) know the exact characters in our patterns.
Thinking of all words that match, say, a misspelling would take a long time.
Luckily, we can save time using the wildcard character: " . "
The wildcard character " . " will match any one character except line terminators such as a new line. The wildcard is also called dot and period.
We can use the wildcard character just like any other character in the regex.
For example, if we wanted to match "hug", "huh", "hut", and "hum", we can use the regex /hu./ to match all four words.
Example
let humStr = "I'll hum a song";
let hugStr = "Bear hug";
let huRegex = /hu./;
humStr.match(huRegex); // Returns ["hum"]
hugStr.match(huRegex); // Returns ["hug"]
let exampleStr = "Let's have fun with regular expressions!";
let unRegex = /.un/;
let result = unRegex.test(exampleStr);
console.log(result); // Returns true
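By default the wildcard does not match a new line character. The sketch below shows that behavior, along with the s (dotAll) flag that lets the dot match new lines as well; the strings are made up for illustration.

```javascript
const sameLine = "hum";
const brokenLine = "hu\nm"; // a new line sits where the third letter would be

const dotRegex = /hu./;
const dotAllRegex = /hu./s; // s flag: "." also matches line terminators

console.log(dotRegex.test(sameLine));      // true
console.log(dotRegex.test(brokenLine));    // false
console.log(dotAllRegex.test(brokenLine)); // true
```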
We know how to match literal patterns (/literal/) and the wildcard character (/./).
Those are the extremes of regular expressions: one finds exact matches and the other matches almost any character.
There are options that are a balance between the two extremes.
We can search for a literal pattern with some flexibility with character classes.
Character classes allow us to define a group of characters we wish to match by placing them inside square brackets " [ ] ".
For example, we want to match "dig", "dog", and "dug" but not "dag". We can create the regex /d[iou]g/ to do this.
The [iou] is the character class that will only match the characters "i", "o", or "u".
Example
let digStr = "dig";
let dogStr = "dog";
let dugStr = "dug";
let dagStr = "dag";
let dgRegex = /d[iou]g/;
digStr.match(dgRegex); // Returns ["dig"]
dogStr.match(dgRegex); // Returns ["dog"]
dugStr.match(dgRegex); // Returns ["dug"]
dagStr.match(dgRegex); // Returns null
let quoteSample = "Beware of bugs in the above code; I have only proved it correct, not tried it.";
let vowelRegex = /[aeiou]/gi; // same as let vowelRegex = /a|e|i|o|u/gi;
let result = quoteSample.match(vowelRegex);
console.log(result); // Returns e,a,e,o,u,i,e,a,o,e,o,e,I,a,e,o,o,e,i,o,e,o,i,e,i
We can use character sets to specify a group of characters to match, but that's a lot of typing
when we need to match a large range of characters (for example, every letter in the alphabet).
Fortunately, there is a built-in feature that makes this short and simple.
Inside a character set, we can define a range of characters to match using a hyphen " - " character.
For example, to match lowercase letters a through e, we would use [a-e].
let catStr = "cat";
let batStr = "bat";
let matStr = "mat";
let Regex = /[a-e]at/;
catStr.match(Regex); // Returns ["cat"]
batStr.match(Regex); // Returns ["bat"]
matStr.match(Regex); // Returns null
let quoteSample = "The quick brown fox jumps over the lazy dog.";
let alphabetRegex = /[a-z]/ig;
let result = quoteSample.match(alphabetRegex);
console.log(result); // Returns T,h,e,q,u,i,c,k,b,r,o,w,n,f,o,x,j,u,m,p,s,o,v,e,r,t,h,e,l,a,z,y,d,o,g
Using the hyphen " - " to match a range of characters is not limited to letters.
It also works to match a range of numbers.
For example, /[0-5]/ matches any single digit from 0 to 5, including the 0 and 5.
Also, it is possible to combine a range of letters and numbers in a single character set.
Example
let jennyStr = "Jenny8675309";
let myRegex = /[a-z0-9]/ig; // matches all letters and numbers in jennyStr
jennyStr.match(myRegex); // Returns J,e,n,n,y,8,6,7,5,3,0,9
Sometimes, we need to match a character (or group of characters) that appears one or more times in a row.
This means it occurs at least once, and may be repeated.
We can use the " + " character to check if that is the case.
Remember, the character or pattern has to be present consecutively.
That is, the character has to repeat one after the other.
For example, /a+/g would find one match in "abc" and return ["a"].
Because of the " + ", it would also find a single match in "aabc" and return ["aa"].
If it were instead checking the string "abab", it would find two matches and return ["a", "a"]
because the a characters are not in a row - there is a b between them.
Finally, since there is no "a" in the string "bcd", it wouldn't find a match.
Example
let difficultSpelling = "Mississippi";
let myRegex = /s+/g;
let result = difficultSpelling.match(myRegex);
console.log(result); // Returns ["ss", "ss"]
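The four /a+/g cases described earlier can be checked directly. This is a small sketch; the variable names are only for illustration.

```javascript
const aPlus = /a+/g;

const oneA = "abc".match(aPlus);    // one match: a single "a"
const doubleA = "aabc".match(aPlus); // one match: "aa", because the a's are in a row
const splitA = "abab".match(aPlus);  // two matches: the b separates the a's
const noA = "bcd".match(aPlus);      // no "a" at all, so the result is null

console.log(oneA);    // ["a"]
console.log(doubleA); // ["aa"]
console.log(splitA);  // ["a", "a"]
console.log(noA);     // null
```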
There's also an option that matches characters that occur zero or more times.
The character to do this is the asterisk or star " * ", placed after the character that may or may not exist.
Example
let soccerWord = "gooooooooal!";
let gPhrase = "gut feeling";
let oPhrase = "over the moon";
let goRegex = /go*/;
soccerWord.match(goRegex); // Returns ["goooooooo"]
gPhrase.match(goRegex); // Returns ["g"]
oPhrase.match(goRegex); // Returns null
let chewieQuote = "Aaaaaaaaaaaaaaaarrrgh!";
let chewieRegex = /Aa*/;
let result = chewieQuote.match(chewieRegex);
console.log(result); // Returns "Aaaaaaaaaaaaaaaa"
In regular expressions, a greedy match finds the longest possible part of a string that fits
the regex pattern and returns it as a match.
The alternative is called a lazy match, which finds the smallest possible part of the string
that satisfies the regex pattern.
We can apply the regex /t[a-z]*i/ to the string "titanic".
This regex is basically a pattern that starts with t, ends with i, and has some letters in between.
Regular expressions are by default greedy, so the match would return ["titani"].
It finds the largest sub-string possible to fit the pattern.
However, we can use the " ? " character to change it to lazy matching.
"titanic" matched against the adjusted regex of /t[a-z]*?i/ returns ["ti"].
Example
let text = "<h1>Winter is coming</h1>";
let lazyRegex = /<.*?>/;
let lazyResult = text.match(lazyRegex);
console.log(lazyResult); // Returns ["<h1>"]
// Whereas the greedy version matches as much as possible:
let greedyRegex = /<.*>/;
let greedyResult = text.match(greedyRegex);
console.log(greedyResult); // Returns ["<h1>Winter is coming</h1>"]
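The "titanic" example described earlier can be run the same way. This sketch compares the greedy and lazy versions of the pattern side by side; the variable names are only for illustration.

```javascript
const ship = "titanic";

const greedy = ship.match(/t[a-z]*i/);  // takes as many letters as possible before the last i
const lazy = ship.match(/t[a-z]*?i/);   // stops at the first i it can reach

console.log(greedy[0]); // "titani"
console.log(lazy[0]);   // "ti"
```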
We can create a set of characters that we do not want to match.
These types of character sets are called negated character sets.
To create a negated character set, we place a caret character " ^ " after the opening bracket
and before the characters we do not want to match.
For example, /[^aeiou]/gi matches all characters that are not a vowel.
Note that characters like ., !, [, @, / and white space are matched;
the negated vowel character set only excludes the vowel characters.
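As a short sketch of a negated character set in use, the following collects every character that is not a vowel, including punctuation and the space. The sample string is made up for illustration.

```javascript
const greeting = "Hello, World!";
const nonVowelRegex = /[^aeiou]/gi; // anything except a, e, i, o, u (either case)
const nonVowels = greeting.match(nonVowelRegex);

console.log(nonVowels); // ["H", "l", "l", ",", " ", "W", "r", "l", "d", "!"]
```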
Outside of a character set, the caret " ^ " is used to search for patterns at the beginning of strings.
Example
let firstString = "Ricky is first and can be found.";
let firstRegex = /^Ricky/;
firstRegex.test(firstString); // Returns true
let notFirst = "You can't find Ricky now.";
firstRegex.test(notFirst); // Returns false
We can search the end of strings using the dollar sign character " $ " at the end of the regex.
Example
let theEnding = "This is a never ending story";
let storyRegex = /story$/;
storyRegex.test(theEnding); // Returns true
let noEnding = "Sometimes a story will have to end";
storyRegex.test(noEnding); // Returns false
Using character classes, we are able to search for all letters of the alphabet with [a-z].
This kind of character class is common enough that there is a shortcut for it,
although it includes a few extra characters as well.
The closest character class in JavaScript to match the alphabet is \w, with a lowercase w.
This shortcut is equal to [A-Za-z0-9_].
This character class matches upper and lowercase letters plus numbers.
Note, this character class also includes the underscore character " _ ".
Example
let longHandRegex = /[A-Za-z0-9_]+/;
let shortHandRegex = /\w+/;
let numbers = "42";
let varNames = "important_var";
longHandRegex.test(numbers); // Returns true
shortHandRegex.test(numbers); // Returns true
longHandRegex.test(varNames); // Returns true
shortHandRegex.test(varNames); // Returns true
These shortcut character classes are also known as shorthand character classes.
let quoteSample = "The five boxing wizards jump quickly.";
let alphabetRegexV2 = /\w+?/ig; // Lazy, so each match is a single word character
let result = quoteSample.match(alphabetRegexV2).length;
console.log(result); // Returns 31 (characters)
We can search for the opposite of the \w with \W.
Note, the opposite pattern uses a capital letter. This shortcut is the same as [^A-Za-z0-9_].
Example
let shortHandRegex = /\W/;
let numbers = "42%";
let sentence = "Coding!";
numbers.match(shortHandRegex); // Returns ["%"]
sentence.match(shortHandRegex); // Returns ["!"]
The shortcut to look for digit characters is \d, with a lowercase d.
This is equal to the character class [0-9], which looks for a single character of any number between (and including) zero and nine.
Example
let numString = "Your sandwich will be $5.00";
let numRegex = /\d/g;
let result = numString.match(numRegex).length;
console.log(result); // Returns 3 (digits)
The shortcut to look for non-digit characters is \D.
This is equal to the character class [^0-9], which looks for a single character that is not a number between zero and nine.
Example
let numString = "Your sandwich will be $5.00";
let noNumRegex = /\D/g;
let result = numString.match(noNumRegex).length;
console.log(result); // Returns 24 (non-digits)
Example
let username1 = "JackOfAllTrades";
let username2 = "Oceans11";
let username3 = "007";
let userCheck = /^[a-z]{2,}\d*$/i;
let result1 = userCheck.test(username1);
let result2 = userCheck.test(username2);
let result3 = userCheck.test(username3);
console.log(result1); // Returns true
console.log(result2); // Returns true
console.log(result3); // Returns false
Explanation of let userCheck = /^[a-z]{2,}\d*$/i;
First, ^[a-z]: the username must start with letters of the alphabet.
Second, {2,}: there have to be at least two of those letters. See [2][25].
Third, \d*$: the only numbers in the username have to be at the end. There can be zero or more.
Fourth, /i: both lowercase and UPPERCASE letters are permitted.
Note: /^[A-Za-z]/ would work too; it is the same as /^[a-z]/i.
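A caution about letter ranges: a single range written as [A-z] is wider than it looks, because several punctuation characters (including the underscore) sit between "Z" and "a" in character order. The sketch below compares it with the two safe spellings; the test strings are made up for illustration.

```javascript
const wideRange = /^[A-z]+$/;       // also accepts [ \ ] ^ _ and the backtick
const letterRanges = /^[A-Za-z]+$/; // letters only
const ignoreCase = /^[a-z]+$/i;     // letters only, via the i flag

console.log(wideRange.test("snake_case"));         // true
console.log(letterRanges.test("snake_case"));      // false
console.log(ignoreCase.test("snake_case"));        // false
console.log(letterRanges.test("JackOfAllTrades")); // true
```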
We can match the whitespace or spaces between letters. We can search for whitespace using \s, which is a lowercase s.
This pattern matches not only the space character, but also carriage return, tab, form feed, and new line characters.
We can think of it as similar to the character class [ \r\t\f\n\v].
Example
let whiteSpace = "Whitespace. Whitespace everywhere!";
let spaceRegex = /\s/g;
whiteSpace.match(spaceRegex); // Returns [" ", " "]
let sample = "Whitespace is important in separating words";
let countWhiteSpace = /\s/g;
let result = sample.match(countWhiteSpace).length;
console.log(result); //Returns 5
We can also search for everything except whitespace.
Search for non-whitespace using \S, which is an UPPERCASE S.
This pattern will not match whitespace, carriage return, tab, form feed, and new line characters.
We can think of it being similar to the character class [^ \r\t\f\n\v].
Example
let whiteSpace = "Whitespace. Whitespace everywhere!";
let nonSpaceRegex = /\S/g;
let result = whiteSpace.match(nonSpaceRegex).length;
console.log(result); // Returns 32
We use the plus sign " + " to look for one or more characters and the asterisk " * " to look for zero or more characters.
These are convenient but sometimes we want to match a certain range of patterns.
We can specify the lower and upper number of patterns with quantity specifiers.
Quantity specifiers are used with curly brackets " { " and " } ".
We put two numbers between the curly brackets - for the lower and upper number of patterns.
For example, to match the letter a appearing between 3 and 5 times, followed by h, our regex would be /a{3,5}h/.
let A4 = "aaaah";
let A2 = "aah";
let multipleA = /a{3,5}h/;
multipleA.test(A4); // Returns true
multipleA.test(A2); // Returns false
let ohStr = "Ohhh no";
let ohRegex = /Oh{3,6}\sno/;
let result = ohRegex.test(ohStr); // Returns true
The \s after Oh{3,6} matches the white space before "no".
This regex expects a capital O; to accept a lowercase o as well, add the ignore-case flag:
/oh{3,6}\sno/i
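One caveat about the {3,5} examples above: without anchors, the upper limit only restricts how much the pattern itself consumes, so a longer run of the letter can still contain a match. A minimal sketch, assuming we want the whole string to obey the limit (the anchored variant is an illustration, not part of the original examples):

```javascript
const sevenAs = "a".repeat(7) + "h"; // seven a's followed by h

const loose = /a{3,5}h/;      // can match the last five a's plus h
const anchored = /^a{3,5}h$/; // the whole string must be 3 to 5 a's plus h

console.log(loose.test(sevenAs));    // true
console.log(anchored.test(sevenAs)); // false
console.log(anchored.test("aaaah")); // true
```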
We can specify the lower and upper number of patterns with quantity specifiers using curly brackets.
Sometimes we only want to specify the lower number of patterns with no upper limit.
To only specify the lower number of patterns, keep the first number followed by a comma.
For example, to match only the string "hah" with the letter a appearing at least 3 times, our regex would be /ha{3,}h/.
let A4 = "haaaah";
let A2 = "haah";
let A100 = "h" + "a".repeat(100) + "h";
let multipleA = /ha{3,}h/;
multipleA.test(A4); // Returns true
multipleA.test(A2); // Returns false
multipleA.test(A100); // Returns true
To specify a certain number of patterns, just have that one number between the curly brackets.
For example, to match only the word "hah" with the letter a 3 times, our regex would be /ha{3}h/.
let A4 = "haaaah";
let A3 = "haaah";
let A100 = "h" + "a".repeat(100) + "h";
let multipleHA = /ha{3}h/;
multipleHA.test(A4); // Returns false
multipleHA.test(A3); // Returns true
multipleHA.test(A100); // Returns false
Sometimes the patterns we want to search for have parts that may or may not exist.
However, it may be important to check for them nonetheless.
We can specify the possible existence of an element with a question mark " ? ".
This checks for zero or one of the preceding element.
We can think of this symbol as saying the previous element is optional.
Example
let american = "color";
let british = "colour";
let rainbowRegex = /colou?r/;
rainbowRegex.test(american); // Returns true
rainbowRegex.test(british); // Returns true
Lookaheads are patterns that tell JavaScript to look ahead in our string to check for patterns further along.
This can be useful when we want to search for multiple patterns over the same string.
There are two kinds of lookaheads: positive lookahead and negative lookahead.
A positive lookahead will look to make sure the element in the search pattern is there, but won't actually match it.
A positive lookahead is used as " (?=...) " where the " ... " is the required part that is not matched.
On the other hand, a negative lookahead will look to make sure the element in the search pattern is not there.
A negative lookahead is used as " (?!...) " where the " ... " is the pattern that we do not want to be there.
The rest of the pattern is returned if the negative lookahead part is not present.
Lookaheads are a bit confusing but some examples will help.
let quit = "qu";
let noquit = "qt";
let quRegex = /q(?=u)/;
let qRegex = /q(?!u)/;
quit.match(quRegex); // Returns ["q"]
noquit.match(qRegex); // Returns ["q"]
A more practical use of lookaheads is to check two or more patterns in one string.
Here is a simple password checker that looks for 3 to 6 word characters and at least one number:
let password = "abc123";
let checkPass = /(?=\w{3,6})(?=\D*\d)/;
checkPass.test(password); // Returns true
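Because the checker above is not anchored, it only confirms that 3 to 6 word characters appear somewhere; it does not reject longer passwords. The following is a sketch of a stricter variant, assuming the intent is to limit the whole password to 3 to 6 characters. The name strictCheck and the sample passwords are hypothetical.

```javascript
const looseCheck = /(?=\w{3,6})(?=\D*\d)/;    // the checker from the text
const strictCheck = /^(?=\w{3,6}$)(?=\D*\d)/; // ^ and $ make the length limit apply to the whole string

console.log(looseCheck.test("abc123"));     // true
console.log(strictCheck.test("abc123"));    // true
console.log(looseCheck.test("abcdefgh1"));  // true, even though it has nine characters
console.log(strictCheck.test("abcdefgh1")); // false, too long
console.log(strictCheck.test("abcdef"));    // false, no number
```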
Some patterns we search for will occur multiple times in a string.
It is wasteful to manually repeat that regex.
There is a better way to specify when we have multiple repeat substrings in our string.
We can search for repeat substrings using capture groups. Parentheses, " ( " and " ) ", are used to find repeat substrings.
We put the regex of the pattern that will repeat in between the parentheses.
To specify where that repeat string will appear, we use a backslash " \ " and then a number.
This number starts at 1 and increases with each additional capture group we use.
An example would be \1 to match the first group.
The example below matches any word that occurs twice separated by a space:
let repeatStr = "regex regex";
let repeatRegex = /(\w+)\s\1/;
repeatRegex.test(repeatStr); // Returns true
repeatStr.match(repeatRegex); // Returns ["regex regex", "regex"]
Using the .match() method on a string will return an array with the string it matches, along with its capture group.
Searching is useful. However, we can make searching even more powerful when it also changes (or replaces) the text we match.
We can search and replace text in a string using " .replace() " on a string.
The first input for .replace() is the regex pattern we want to search for.
The second input is either the string that replaces the match or a function that computes the replacement.
Example
let wrongText = "The sky is silver.";
let silverRegex = /silver/;
wrongText.replace(silverRegex, "blue"); // Returns "The sky is blue."
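The second parameter can also be a function, which receives the matched text and returns its replacement. The sketch below doubles every number in a string; the sample sentence and variable names are made up for illustration. It also shows that, without the g flag, only the first match is replaced.

```javascript
const order = "3 apples and 4 pears";

// The callback receives each matched run of digits and returns the new text.
const doubled = order.replace(/\d+/g, (digits) => String(Number(digits) * 2));
// Without the g flag, only the first match is replaced.
const firstOnly = order.replace(/\d+/, "many");

console.log(doubled);   // "6 apples and 8 pears"
console.log(firstOnly); // "many apples and 4 pears"
```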
We can also access capture groups in the replacement string with dollar signs " $ ".
"Code Camp".replace(/(\w+)\s(\w+)/, '$2 $1'); // Returns "Camp Code"
Sometimes whitespace characters around strings are not wanted but are there.
Typical processing of strings is to remove the whitespace at the start and end of it.
Example
let hello = " Hello, World! ";
let wsRegex = /^\s+|\s+$/g;
let result = hello.replace(wsRegex, ''); // Returns "Hello, World!"
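For this particular task, JavaScript strings also have a built-in .trim() method. The short sketch below compares it with the regex approach on a made-up padded string; the regex version remains useful as practice and as a starting point for trimming other characters.

```javascript
const padded = "   Hello, World!  ";

const viaRegex = padded.replace(/^\s+|\s+$/g, ""); // strip leading and trailing whitespace
const viaTrim = padded.trim();                      // built-in equivalent

console.log(viaRegex);             // "Hello, World!"
console.log(viaRegex === viaTrim); // true
```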
A summary of all mentioned Regular Expressions.
[2][1] .test() // Returns true or false
[2][4] .match() // Returns the actual match(es)
[2][5] /ignorecase/i // Ignore UPPER- and lowerCaSe
[2][6] /Repeat/g // g flag finds every match, not just the first
[2][7] /hu./ // . dot is the Wildcard
[2][8] /[aeiou]/ // [ ] match any of the lowercase letters aeiou
[2][9] /[a-e]at/ // match any of the lowercase letters a through e
[2][10] /[a-z0-9]/ // [a-z0-9] matches any of the lowercase a through z and 0 through 9
[2][11] /s+/ // + matches one or more s characters in a row
[2][12] /go*/ // * matches g followed by zero or more o characters
[2][13] /<.*?>/ // ? lazy match, which finds the smallest possible part
[2][14] /[^aeiou]/ // /[^ matches all characters that are not a vowel
[2][15] /^Ricky/ // /^ search for patterns at the beginning of strings
[2][16] /story$/ // $ search for patterns at the end of strings
[2][17] /\w/ // \w is equal to [A-Za-z0-9_]
[2][18] /\W/ // \W is the opposite of \w, equal to [^A-Za-z0-9_]
[2][19] /\d/ // \d is equal to the character class [0-9]
[2][20] /\D/ // \D is the opposite of \d, equal to [^0-9]
[2][22] /\s/ // \s match whitespace, carriage return, tab, form feed, and new line characters
[2][23] /\S/ // \S does NOT match whitespace, carriage return, tab, form feed, and new line characters
[2][24] /a{3,5}h/ // {3,5} matches the letter a 3 to 5 times, followed by h
[2][25] /a{3,}h/ // {3,} matches the letter a 3 or more times, followed by h
[2][26] /a{3}h/ // {3} matches the letter a exactly 3 times, followed by h
[2][27] /colou?r/ // ? match both color and colour
[2][28] /q(?=u)/ // (?=...) match any q that is followed by a u
[2][28] /q(?!u)/ // (?!...) match any q that is not followed by a u
[2][29] /(\w+)\s\1/ // \1 matches any word that occurs twice separated by a space
[2][30] .replace() // search and replace text in a string
[2][31] /^\s+|\s+$/ // remove the whitespace at the start and end
[2][21] /^[a-z]{2,}\d*$/i // ^[a-z] only use alphabet letter characters
// {2,} be at least two characters long
// \d*$ numbers have to be at the end. There can be zero or more.
// /i Both lowercase and UPPERCASE are permitted