Regular Expressions in JavaScript

Basic Syntax

First of all, a caret (^) may be used to indicate the beginning of the string, while a dollar sign ($) is used to mark the end:

JavaScript   // Matches "Isn’t JavaScript great?"  
^JavaScript  // Matches "JavaScript rules!",  
            //  not "What is JavaScript?"  
JavaScript$  // Matches "I love JavaScript",  
            //  not "JavaScript is great!"  
^JavaScript$ // Matches "JavaScript", and nothing else
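One way to try these patterns out in JavaScript is with a regular expression literal and its test() method, which returns true when the pattern is found in the string. This is only a minimal sketch; the variable names are made up for illustration.

```javascript
// Illustrative names only. test() returns true if the pattern matches.
const anywhere = /JavaScript/;
const atStart = /^JavaScript/;
const atEnd = /JavaScript$/;
const whole = /^JavaScript$/;

const results = {
  anywhere: anywhere.test("Isn't JavaScript great?"), // found in the middle
  atStart: atStart.test("What is JavaScript?"),       // not at the beginning
  atEnd: atEnd.test("I love JavaScript"),             // found at the end
  whole: whole.test("JavaScript rules!"),             // extra text after the word
};

console.log(results);
```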

Sometimes you will want ^, $, or another special character to stand for itself in the search string rather than carry the special meaning it has in regular expression syntax. To remove the special meaning of a character, prefix it with a backslash:

\$\$\$      // Matches "Show me the $$$!"
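The effect of the backslash can be checked with a short sketch like the one below. The variable names are illustrative assumptions. Note that when a pattern is built from a string with the RegExp constructor, each backslash has to be doubled so that it survives JavaScript's own string escaping.

```javascript
// Escaped: each \$ matches a literal dollar sign.
const dollars = /\$\$\$/;
const literalMatch = dollars.test("Show me the $$$!");

// Unescaped: $ keeps its "end of string" meaning, so this pattern
// matches even a string that contains no dollar signs at all.
const unescapedMatch = /$$$/.test("No money here");

// The same escaped pattern built from a string needs doubled backslashes.
const fromString = new RegExp("\\$\\$\\$");
const stringMatch = fromString.test("Show me the $$$!");

console.log(literalMatch, unescapedMatch, stringMatch);
```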

Square brackets may be used to define a set of characters that may match. For example, the following regular expression will match any digit from 1 to 5 inclusive.

[12345]     // Matches "1" and "3", but not "a" or "6"

Ranges of numbers and letters may also be specified.

[1-5]       // Same as the previous example  
[a-z]       // Matches any lowercase letter  
[0-9a-zA-Z] // Matches any letter or digit

By putting a ^ immediately following the opening square bracket, you can invert the set of characters, meaning the set will match any character not listed:

[^a-zA-Z]   // Matches anything except a letter
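A set in square brackets always stands for exactly one character. As a rough sketch (the names are illustrative), combining a set with the ^ and $ anchors requires the whole string to be a single character from that set, while an inverted set can be used to detect any character outside it:

```javascript
// Anchored, so the entire string must be one digit from 1 to 5.
const oneToFive = /^[1-5]$/;
const three = oneToFive.test("3");
const six = oneToFive.test("6");
const twelve = oneToFive.test("12"); // two characters, so no match

// Inverted set: true if the string contains anything that is not a letter.
const nonLetter = /[^a-zA-Z]/;
const lettersOnly = nonLetter.test("abc");
const withDigit = nonLetter.test("abc1");

console.log(three, six, twelve, lettersOnly, withDigit);
```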

The characters ?, +, and * also have special meanings. Specifically, ? means "the preceding character is optional", + means "one or more of the preceding character", and * means "zero or more of the preceding character". They may also follow a set in square brackets, in which case they apply to the set as a whole.

bana?na     // Matches "banana" and "banna",  
           // but not "banaana".  
bana+na     // Matches "banana" and "banaana",  
           // but not "banna".  
bana*na     // Matches "banna", "banana", and "banaaana",  
           // but not "bnana".  
^[a-zA-Z]+$ // Matches any string of one or more  
           // letters and nothing else.
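The banana patterns can be checked directly with test(). The sketch below uses illustrative variable names and a letters-only pattern written with the range A-Z:

```javascript
const optionalA = /bana?na/;   // zero or one "a"
const oneOrMoreA = /bana+na/;  // at least one "a"
const zeroOrMoreA = /bana*na/; // any number of "a", including none
const lettersOnly = /^[a-zA-Z]+$/;

const checks = [
  optionalA.test("banna"),        // true
  optionalA.test("banaana"),      // false
  oneOrMoreA.test("banna"),       // false
  zeroOrMoreA.test("banaaana"),   // true
  lettersOnly.test("Hello"),      // true
  lettersOnly.test("Hello_you"),  // false, underscore is not a letter
  lettersOnly.test(""),           // false, + needs at least one letter
];

console.log(checks);
```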

Parentheses may be used to group strings together to apply ?, +, or * to them as a whole.

ba(na)+na   // Matches "banana" and "banananana",  
           // but not "bana" or "banaana".

Parentheses also let you define several strings that may match, using the pipe (|) character to separate them.

^(ba|na)+$  // Matches "banana", "nababa", "baba",  
           // "nana", "ba", "na", and others.
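Grouping and alternation can be tried the same way. In this hedged sketch (names are illustrative), the first pattern repeats the group "na" and the second accepts any string made up entirely of "ba" and "na" pieces:

```javascript
// (na)+ repeats the two-letter group, not just the final "a".
const repeatedGroup = /ba(na)+na/;
const longBanana = repeatedGroup.test("banananana");
const tooShort = repeatedGroup.test("bana");

// Alternation inside the group: each repetition may be "ba" or "na".
const syllables = /^(ba|na)+$/;
const mixed = syllables.test("nababa");
const trailingJunk = syllables.test("banana!");

console.log(longBanana, tooShort, mixed, trailingJunk);
```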

Here are a few special codes that can be used for matching characters in regular expressions:

\n      // A newline character  
.       // Any character except a newline  
\r      // A carriage return character  
\t      // A tab character  
\b      // A word boundary (the start or end of a word)  
\B      // Any position that is not a word boundary  
\d      // Any digit (same as [0-9])  
\D      // Anything but a digit (same as [^0-9])  
\s      // Single whitespace (space, tab, newline, etc.)  
\S      // Single nonwhitespace  
\w      // A "word character" (same as [A-Za-z0-9_])  
\W      // A "nonword character" (same as [^A-Za-z0-9_])
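A few of these codes in action, as a small sketch with made-up sample strings and variable names. It uses test() along with the standard string method split(), which accepts a regular expression as its separator:

```javascript
// \d finds a digit anywhere in the string.
const hasDigit = /\d/.test("Route 66");

// \b keeps "Java" from matching inside a longer word.
const wholeWord = /\bJava\b/;
const javaAlone = wholeWord.test("I like Java");
const javaInside = wholeWord.test("I like JavaScript");

// \s+ treats any run of spaces, tabs, or newlines as one separator.
const words = "one  two\tthree\nfour".split(/\s+/);

// The dot matches any single character except a newline.
const dotLetter = /^.$/.test("x");
const dotNewline = /^.$/.test("\n");

console.log(hasDigit, javaAlone, javaInside, words, dotLetter, dotNewline);
```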

There are more special codes and syntax tricks for regular expressions, all of which should be covered in any complete reference (such as those mentioned above). For now, we have more than enough for our purposes.
