vi Regular Expressions and Substitution

2009 Mar 01


The general form of the
substitute command is:
   :[addr1[,addr2]]s/old/new/[flags]

Omitting the search pattern
   :s//repl/
uses the last search or substitution
regular expression.

An empty new part
   :s/old//
replaces the matched text with
nothing (deleting it).

Any non-alphanumeric, non-whitespace
character can be used as the delimiter instead of /.

Flag  Meaning
c     confirm each substitution
g     global: change all occurrences on a line (instead of only the first)
p     print the line after the change is made
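
The address range and the g flag can be sketched outside the editor. The Python below is only a rough emulation, not how vi works internally: the helper name substitute is hypothetical, and Python's re syntax differs from vi's in places.

```python
import re

def substitute(lines, addr1, addr2, old, new, flags=""):
    """Roughly emulate :addr1,addr2s/old/new/[g] on a list of lines.

    Addresses are 1-based and inclusive, as in vi. Hypothetical helper.
    """
    # In re.sub, count=0 means "replace every match"; count=1 only the first.
    count = 0 if "g" in flags else 1
    out = list(lines)
    for i in range(addr1 - 1, addr2):
        out[i] = re.sub(old, new, out[i], count=count)
    return out

buffer = ["a-a-a", "a-a-a", "a-a-a"]
first_only = substitute(buffer, 1, 2, "a", "b")        # like :1,2s/a/b/
all_matches = substitute(buffer, 2, 3, "a", "b", "g")  # like :2,3s/a/b/g
print(first_only)    # ['b-a-a', 'b-a-a', 'a-a-a']
print(all_matches)   # ['a-a-a', 'b-b-b', 'b-b-b']
```

Without g only the first match on each addressed line changes; lines outside the range are left alone.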

Meta  Meaning in Replacement String
&     Entire string matched by search pattern
~     Replacement text specified in last substitute command
\u    Changes next character in replacement to upper case.
\l    Changes next character in replacement to lower case.
\U    Changes all following characters in replacement to upper case.
\L    Changes all following characters in replacement to lower case.
      \e or \E ends the effect of \U or \L.
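
The effect of &, \u and \U can be approximated in Python for comparison. This is a sketch under the assumption that a replacement function is an acceptable stand-in: Python's re has no \u or \U in replacement strings, and it spells & as \g<0>. The helper upper_first is hypothetical.

```python
import re

line = "hello world"

# vi:  :s/o/[&]/       & is the whole match; Python spells it \g<0>
whole = re.sub(r"o", r"[\g<0>]", line, count=1)

# vi:  :s/world/\u&/   upper-case only the next character
def upper_first(m):
    text = m.group(0)
    return text[:1].upper() + text[1:]

first = re.sub(r"world", upper_first, line, count=1)

# vi:  :s/hello/\U&/   upper-case everything that follows
every = re.sub(r"hello", lambda m: m.group(0).upper(), line, count=1)

print(whole)   # hell[o] world
print(first)   # hello World
print(every)   # HELLO world
```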

Regular Expressions
regex    Meaning
.        Matches any single character except newline (spaces and tabs are
         treated as characters).
*        Matches 0 or more (as many as there are) of the character that
         immediately precedes it.
^        When used at the start of a regex, means what follows must be at the
         beginning of the line.
         When not at the start of a regex, ^ stands for itself.
$        When used at the end of a regex, means what precedes must be at the
         end of the line.
         When not at the end of a regex, $ stands for itself.
\        Treats the following special character as ordinary; \\ means just \
[ ]      Matches any one of the characters enclosed between the brackets.
         A range of characters can be specified by separating the first and
         last character with a hyphen (e.g. a-f). More than one range can be
         specified, and single characters and ranges can be mixed.

         Most metacharacters lose their special meaning inside brackets, so
         they do not need to be escaped.
         The three metacharacters that need to be escaped are: - \ ]
         If - is the first character it does not need to be escaped.

         A caret ^ has special meaning only when it is the first character
         inside the brackets, in which case the expression matches any one
         character NOT listed.
\(  \)   Saves the text matched by the pattern enclosed between \( and \)
         into a special holding buffer.
         Up to nine patterns can be saved in this way on a single line.
         The saved text can then be referred to as \n (n = 1 to 9) within a
         search or substitute string:
            :s/\(abcd\)\1/soup/
         changes abcdabcd to soup.
\<       Matches the beginning of a word.
\>       Matches the end of a word.
         The beginning or end of a word is determined by a punctuation mark
         or space.
~        Matches whatever regex was used in the last search.
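
The anchors, bracket expressions, saved patterns and word boundaries above can be tried out in Python. Treat this as an approximation: Python writes groups as plain ( ) rather than \( \), uses \b for both \< and \>, and counts digits and underscores as word characters, so edge cases differ from vi.

```python
import re

# ^ and $ anchor the match to the start and end of the line.
anchored = re.sub(r"^cat$", "dog", "cat")
unanchored = re.sub(r"^cat$", "dog", "a cat")   # no match, line unchanged

# [0-9a-f] mixes two ranges; a leading ^ inside the brackets negates the set.
hex_only = re.sub(r"[^0-9a-f]", "", "3g-a7z")

# vi's \(abcd\)\1 is written (abcd)\1 in Python.
soup = re.sub(r"(abcd)\1", "soup", "abcdabcd")

# vi's \<cat\> is written \bcat\b in Python.
whole_word = re.sub(r"\bcat\b", "dog", "cat concat cats")

print(anchored)     # dog
print(unanchored)   # a cat
print(hex_only)     # 3a7
print(soup)         # soup
print(whole_word)   # dog concat cats
```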

POSIX Bracket Expressions
Class       Meaning
[:alnum:]   Alphanumeric characters
[:alpha:]   Alphabetic characters
[:blank:]   Space and Tab characters
[:cntrl:]   Control characters
[:digit:]   Numeric characters
[:graph:]   Printable and visible (non-blank) characters
[:lower:]   Lowercase characters
[:print:]   Printable characters (includes the space character)
[:punct:]   Punctuation characters
[:space:]   Whitespace characters
[:upper:]   Uppercase characters
[:xdigit:]  Hexadecimal digits
These classes are used inside a bracket expression, e.g. [[:digit:]] matches any one digit.
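
Python's re module does not accept POSIX class names, so one way to experiment with them is to expand each name into an explicit character list first. The sketch below assumes ASCII only and ignores locale; the table POSIX_CLASSES and the helper expand_posix are hypothetical and cover just a few of the classes.

```python
import re

# ASCII-only approximations of some POSIX classes (assumption: no locale).
POSIX_CLASSES = {
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "alpha": "A-Za-z",
    "alnum": "A-Za-z0-9",
    "xdigit": "0-9A-Fa-f",
    "blank": r" \t",
}

def expand_posix(pattern):
    """Replace each [:name:] with its character list; the outer [ ] stay."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)

pattern = expand_posix("[[:upper:]][[:lower:][:digit:]]*")
matches = re.findall(pattern, "Abc12 x Yz")
print(pattern)   # [A-Z][a-z0-9]*
print(matches)   # ['Abc12', 'Yz']
```

A class name not in the table raises KeyError, which keeps unsupported names from passing silently.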

2005-2009