Regular Expression Wildcards
Regular expressions are patterns used by UNIX commands such as grep, sed, and awk to match text, typically the lines of a file.
Regular expression wildcards, also called metacharacters, are used to define the search parameters.
They include the following:
- ^ Beginning of Line (must be the first character in your pattern)
- $ End of Line (must be the last character in your pattern)
- . Any one character
- [aprw] Matches one character from the listed set. This notation is called a character class.
- [a-c4-8] The listed set may include ranges of characters. Ranges are indicated using a dash between two characters. This set includes all the characters from a to c, plus the digits from 4 to 8.
- [^a-r] You can reverse the meaning of the set by inserting a caret as the first character. This matches any one character except a through r.
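The metacharacters listed above can be tried out in a short script. The following is a minimal sketch in Python, whose re module uses the same notation for these particular metacharacters; the sample lines and the helper name matching are invented for illustration and are not part of any UNIX command.

```python
import re

# Sample lines are made up for illustration.
lines = ["apple pie", "crab cake", "zebra 7", "warp"]


def matching(pattern, candidates):
    """Return the candidates in which the pattern is found somewhere."""
    return [text for text in candidates if re.search(pattern, text)]


starts_with_a = matching(r"^a", lines)       # ^ anchors at the beginning of the line
ends_with_e = matching(r"e$", lines)         # $ anchors at the end of the line
any_one_char = matching(r"^w.rp$", lines)    # . stands for any one character
set_then_r = matching(r"[aprw]r", lines)     # one of a, p, r, w followed by r
range_at_end = matching(r"[a-c4-8]$", lines) # last character is a-c or 4-8
negated_start = matching(r"^[^a-r]", lines)  # first character is not a through r

print(starts_with_a)   # ['apple pie']
print(ends_with_e)     # ['apple pie', 'crab cake']
print(any_one_char)    # ['warp']
print(set_then_r)      # ['warp']
print(range_at_end)    # ['zebra 7']
print(negated_start)   # ['zebra 7', 'warp']
```

Note that the caret has two roles here: outside the brackets it anchors the match to the start of the line, and as the first character inside the brackets it negates the set.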
Matching Any Character
You could also match those pesky hyphens with a dot (.):
\d\d\d.\d\d\d.\d\d\d\d
The dot or period essentially acts as a wildcard and will match any character (except, in certain situations, a line ending). In the example above, the regular expression matches the hyphen, but it could also match a percent sign (%):
707%827%7019
Or a vertical bar (|):
707|827|7019
Or any other character.
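To see this behavior for yourself, here is a small sketch in Python that tests the pattern against the sample numbers above. The variable names are illustrative, and the last candidate (ten digits with no separators) is an added case that shows the dot must match exactly one character.

```python
import re

# The pattern from the text: three digits, any character, three digits,
# any character, four digits.
PHONE = re.compile(r"\d\d\d.\d\d\d.\d\d\d\d")

candidates = ["707-827-7019", "707%827%7019", "707|827|7019", "7078277019"]

# fullmatch requires the whole string to fit the pattern.
results = {text: bool(PHONE.fullmatch(text)) for text in candidates}

for text, matched in results.items():
    print(text, "matches" if matched else "does not match")
```

The first three candidates match because each dot accepts whatever single character separates the digit groups. The last one does not, since there is no character left over for either dot to consume.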
The dot character (officially, the full stop) will not normally match a newline character, such as a line feed (U+000A).
However, there are ways to make the dot match a newline, which I will show you later. The setting that enables this is often called the dotall option.
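As a brief preview, here is how the option looks in one particular implementation. In Python's re module it is the re.DOTALL flag; other tools spell it differently, so treat this as a sketch for Python only. The test string is invented for illustration.

```python
import re

# Two letters separated by a line feed (U+000A).
text = "a\nb"

# By default the dot refuses to match the line feed, so there is no match.
without_flag = re.search(r"a.b", text)

# With re.DOTALL the dot matches any character, including the line feed.
with_flag = re.search(r"a.b", text, re.DOTALL)

print(without_flag)           # None
print(with_flag is not None)  # True
```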