Metacharacter | Meaning |
\ | Quote the following metacharacter so that it matches itself. In a character class the special characters are "-]\^$"; if a '-' is the first or last character in a class it is ordinary. |
/ | Default regex delimiter |
^ | Match beginning of string (not line). In a character class a starting '^' negates the set of characters. Both [...] and [^...] must match a character, or the match fails. |
$ | Match end of string (or just before a newline at the end of the string); with the "/m" modifier, match at the end of each line. Also used to denote a scalar variable (with the aid of {...}). |
. | Match any character (except newline, unless "/s" modifier) |
? | Match 0 or 1 of previous |
* | Match 0 or more of previous |
+ | Match 1 or more of previous |
| | Alternation /(^a|b)c/; # matches 'ac' at start of string or 'bc' anywhere. |
[ ] | Character class (or set) |
( ) | Grouping (treats contents as a single unit). Also allows the extraction of the parts of a string that matched:
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;                      # first () match
    $min   = $2;                      # second () match
    $sec   = $3;                      # third () match
In list context, a match "/regex/" with groupings returns the list of matched values "($1, $2, ...)". Thus an equivalent way is:
    ($hours, $min, $sec) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
Matching variables are numbered by the position of their opening parenthesis, counting from the left. In
    /(ab(cd|ef)((gi)j))/;
"$1" is the outermost group, "$2" is "(cd|ef)", "$3" is "((gi)j)" and "$4" is "(gi)".
Associated with the matching variables "$1", "$2", ... are the backreferences "\1", "\2", ... that can be used only inside a regex:
    /(\w+)\s\1/;  # finds repeated sequences like 'the the' in a string
There are also the two arrays "@-" and "@+", which hold the start and end positions of each match. "$-[0]" is the position of the start of the entire match and "$+[0]" is the position of its end; "$-[n]" and "$+[n]" are the start and end positions respectively of match "$n", if it exists. Note: if $v is a variable used as an integer with value 5, then "${$v}" is the same as "$5" (a symbolic reference, so it is not allowed under "use strict 'refs'").
    "$`" is the part of the string before the match; substr($x, 0, $-[0]).
    "$&" is the part of the string which matched; substr($x, $-[0], $+[0]-$-[0]).
    "$'" is the part of the string after the match; substr($x, $+[0]).
If there are no groupings, a list-context match with the "/g" modifier returns the list of all matches of the whole regex; without "/g" it returns the list (1) on success. |
{ } | Contains a variable name or a quantifier range |
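
The capture variables, backreferences, and position arrays described in the "( )" row can be tried out with a short script. This is a minimal sketch rather than anything taken from the source: the sample strings and the variable names ($text, @groups, $repeated, $before, and so on) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my $time = "12:34:56";

# List-context match: the three groupings return ($1, $2, $3).
my ($hours, $min, $sec) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
print "$hours $min $sec\n";            # prints "12 34 56"

# Nested groups are numbered by their opening parenthesis, left to right.
my @groups = ("abcdgij" =~ /(ab(cd|ef)((gi)j))/);
print join(",", @groups), "\n";        # prints "abcdgij,cd,gij,gi"

# The backreference \1 must repeat whatever group 1 captured.
my $text = "Paris in the the spring";
my ($repeated) = ($text =~ /(\w+)\s\1/);

# Copy the offsets immediately: @- and @+ describe the last successful match.
my ($start, $end) = ($-[0], $+[0]);

# These substr calls give the same strings as $`, $& and $'.
my $before  = substr($text, 0, $start);
my $matched = substr($text, $start, $end - $start);
my $after   = substr($text, $end);

print "repeated word: $repeated\n";    # prints "repeated word: the"
print "[$before][$matched][$after]\n"; # prints "[Paris in ][the the][ spring]"
```

The offsets are saved into $start and $end right after the match because any later successful match overwrites "@-" and "@+".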