Regular expressions- a language to find, replace and extract content from text
for example, you may want to extract phone numbers from a document
escape character- signifies that the next character in the pattern should be interpreted as a special command. in regex, this is the backslash (\), so \d means we want to find any digit
\D finds any character except digits
\w finds any word character: digits 0-9, letters a-z and A-Z, and the underscore
. is the wildcard of regex: it matches any single character (except a newline by default)
? mark makes characters optional: the character preceding it is not necessary for a match
\s looks for whitespace. \S looks for anything but whitespace
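A quick sketch of the shorthand classes above, using Python's re module. The sample strings are made up for illustration.

```python
import re

text = "Call 555-0199 at 9am"  # made-up sample text

digits = re.findall(r"\d", text)                   # every single digit
non_digits = re.findall(r"\D", "a1b2")             # everything that is not a digit
word_chars = re.findall(r"\w", "a_1 !")            # letters, digits and underscore
spaces = re.findall(r"\s", text)                   # each whitespace character
any_char = re.findall(r"c.t", "cat cut c t ct")    # . stands in for any one character
optional = re.findall(r"colou?r", "color colour")  # the u before ? is optional

print(digits, non_digits, word_chars, len(spaces), any_char, optional)
```

Note that "ct" is not found by c.t, since the dot has to match exactly one character.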
a\dc looks for any three character pattern starting with a and ending with c where the middle character is a digit
a\Dc looks for the same pattern where the middle character is anything but a digit
\W finds any non-alphanumeric character
a\Sb finds any three character pattern starting with a and ending with b as long as the second character isn't whitespace
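The three character patterns can be checked the same way. This is only a sketch with made-up sample strings; re.fullmatch is used so the whole string has to fit the pattern.

```python
import re

samples = ["a1c", "axc", "a c", "a_c"]  # made-up test strings

# a\dc: the middle character must be a digit
digit_middle = [s for s in samples if re.fullmatch(r"a\dc", s)]

# a\Dc: the middle character must be anything but a digit
non_digit_middle = [s for s in samples if re.fullmatch(r"a\Dc", s)]

# a\Sb: the middle character must be anything but whitespace
non_space_middle = [s for s in ["a1b", "a b", "a-b"] if re.fullmatch(r"a\Sb", s)]

# \W: characters that are not letters, digits or underscore
symbols = re.findall(r"\W", "hi, there!")

print(digit_middle, non_digit_middle, non_space_middle, symbols)
```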
hard brackets specify that one character in the specific position can be any character listed inside the brackets
caret at the start inside of the brackets negates the list: the character in that position can be anything except the characters listed, e.g. [^abc]
the pipe command means "or" |
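A small sketch of brackets, negated brackets and the pipe in Python. The words being searched are invented examples.

```python
import re

# [ae]: the third character may be a or e
one_of = re.findall(r"gr[ae]y", "gray grey griy")

# [^aeiou]: the middle character may be anything except a vowel
not_vowel = re.findall(r"b[^aeiou]t", "bat bit b2t bxt")

# cat|dog: match either alternative
either = re.findall(r"cat|dog", "cat, dog, bird")

print(one_of, not_vowel, either)
```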
^ caret specifies the beginning of the string being examined
$ specifies the end of the string being examined: ab$ matches ab only when it sits at the end
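The two anchors can be tried out like this (a sketch, with made-up strings). re.search scans the whole string, so the anchors are what pin the match to the start or end.

```python
import re

starts_with_ab = bool(re.search(r"^ab", "abc"))   # ab is at the start
ab_not_at_start = bool(re.search(r"^ab", "cab"))  # ab exists, but not at the start
ends_with_ab = bool(re.search(r"ab$", "cab"))     # ab is at the end
ab_not_at_end = bool(re.search(r"ab$", "abc"))    # ab exists, but not at the end

print(starts_with_ab, ab_not_at_start, ends_with_ab, ab_not_at_end)
```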
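ab* means the pattern begins with a and may be followed by any number of b's, including none
ab+ means the pattern begins with a and must be followed by at least one b
a{3}- the pattern is exactly three a's in a row
a{2,4}- the pattern has between two and four a's (write it without a space inside the braces)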
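To see how the repetition operators differ, here is a short Python sketch on a made-up string.

```python
import re

text = "a ab abbb"  # made-up sample

star = re.findall(r"ab*", text)  # a followed by zero or more b's
plus = re.findall(r"ab+", text)  # a followed by one or more b's

# a{3}: exactly three a's
exactly_three = re.fullmatch(r"a{3}", "aaa") is not None

# a{2,4}: between two and four a's
two_to_four = [s for s in ["a", "aa", "aaaa", "aaaaa"] if re.fullmatch(r"a{2,4}", s)]

print(star, plus, exactly_three, two_to_four)
```

The lone "a" shows the difference: ab* accepts it, ab+ does not.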
(Mr?s) the ? makes the r optional, so the pattern matches both Mrs and Ms; the parentheses then capture whatever substring is matched by the pattern
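A sketch of the capture group in Python. The sentence and the second group (\w+) for the surname are additions for illustration, not part of the note above.

```python
import re

sentence = "Mrs. Smith and Ms. Jones"  # made-up sample

# With groups in the pattern, findall returns the captured parts as tuples
titles = re.findall(r"(Mr?s)\. (\w+)", sentence)

# The s is still required, so plain "Mr" does not match
matches_mr = re.fullmatch(r"Mr?s", "Mr") is not None

print(titles, matches_mr)
```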
0-9: RegEx looks for the exact digit or set of digits you specify
a-z or A-Z: RegEx looks for the exact set of characters you specify, lowercase or uppercase
\d looks for any one digit
\D any one character except for digits
\w- any alphanumeric character
\W any non-alphanumeric character
? the character preceding a question mark is optional. . is any character
\s any whitespace
\S anything but a whitespace
[] is any one character listed within the brackets
^ any pattern must begin from the start of the string being examined
* zero or more repetitions, where + means one or more repetitions
{m,n}- the pattern preceding the curly brackets should repeat "m" to "n" times
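Putting the pieces together for the phone number example from the start of these notes. This is a rough sketch: the pattern assumes a US-style 3-3-4 digit layout with optional parentheses and a space, dot or dash as separator, and the sample text is invented. Real phone formats vary a lot more than this.

```python
import re

# Assumed format: optional "(", 3 digits, optional ")", optional separator,
# 3 digits, optional separator, 4 digits
phone_pattern = r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}"

document = "Call 555-123-4567 or (555) 987 6543, not 12345."  # made-up text

phone_numbers = re.findall(phone_pattern, document)
print(phone_numbers)
```

The short number 12345 is skipped because the pattern needs ten digits in total.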