
REGULAR LANGUAGES

Lexical analysis, also called scanning, is the phase of compilation that works on the text of the program being compiled, character by character. The higher-level parts of the compiler call the lexical analyzer with the command "get the next word from the input", and it is the scanner's job to sort through the input characters and find this word.
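The "get the next word" interface can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the scanner of any particular compiler: the names next_word and all_words are hypothetical, blanks are assumed to separate words, and any character that is not a letter, digit or underscore is treated as a one-character word.

```python
def next_word(text, pos):
    """Return (word, new_pos) for the next word at or after pos.

    Returns (None, pos) once the input is exhausted.
    """
    n = len(text)
    # Assumption of this sketch: blanks only separate words, so skip them.
    while pos < n and text[pos].isspace():
        pos += 1
    if pos == n:
        return None, pos
    start = pos
    if text[pos].isalpha() or text[pos] == "_":
        # A letter or underscore starts a name-like word.
        while pos < n and (text[pos].isalnum() or text[pos] == "_"):
            pos += 1
    elif text[pos].isdigit():
        # A digit starts a run of digits.
        while pos < n and text[pos].isdigit():
            pos += 1
    else:
        # Anything else is taken as a single-character word.
        pos += 1
    return text[start:pos], pos


def all_words(text):
    """Call next_word repeatedly, as the rest of a compiler might."""
    words, pos = [], 0
    while True:
        word, pos = next_word(text, pos)
        if word is None:
            return words
        words.append(word)
```

Calling all_words("if (x1 < 42) y = x1;") splits the input into the words if, (, x1, <, 42, ), y, =, x1 and ;. Multi-character operators such as <= would need extra handling that this sketch leaves out.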

The types of "words" commonly found in a program are:

Some languages (such as C) are case sensitive, in that they differentiate between, e.g., if and IF; the former is a keyword, while the latter can be used as a variable name.

Also, most languages insist that identifiers cannot be any of the keywords or contain operator symbols (some versions of Fortran do not, which makes lexical analysis quite difficult).
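The distinction between keywords and identifiers, and the effect of case sensitivity on it, can be illustrated as follows. The function classify and the small KEYWORDS set are assumptions made for this sketch; the set is a handful of C keywords, not a complete list, and the identifier rule (letters, digits and underscores, not starting with a digit) is the usual C-style convention.

```python
# A few C keywords, for illustration only; not a complete list.
KEYWORDS = {"if", "else", "while", "int", "return"}


def classify(word, case_sensitive=True):
    """Label a word as 'keyword', 'identifier' or 'other'."""
    # In a case-insensitive language, IF and if are the same word.
    key = word if case_sensitive else word.lower()
    if key in KEYWORDS:
        return "keyword"
    # Identifiers may not contain operator symbols: only letters,
    # digits and underscores are allowed, and no leading digit.
    starts_ok = word[:1].isalpha() or word[:1] == "_"
    if starts_ok and all(c.isalnum() or c == "_" for c in word):
        return "identifier"
    return "other"
```

With the default case-sensitive setting, classify("if") gives "keyword" and classify("IF") gives "identifier"; with case_sensitive=False both are keywords. A word such as a+b is rejected as an identifier because it contains an operator symbol.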

In addition to the basic grouping process, lexical analysis usually performs the following tasks:

To specify the lexical analysis process, we need some method of describing which patterns of characters correspond to which words.
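As a rough illustration of what such a description might look like, the sketch below pairs each kind of word with a pattern written in the regular-expression notation of Python's standard re module. The table PATTERNS, the kind names and the function match_word are hypothetical choices made for this example; they are not taken from these notes.

```python
import re

# Each entry pairs a kind of word with the pattern of characters
# that makes it up. The kinds and patterns are illustrative only.
PATTERNS = [
    ("number", r"[0-9]+"),
    ("name", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("symbol", r"[^\sA-Za-z0-9_]"),
]


def match_word(text, pos=0):
    """Return (kind, word) for the first pattern matching at pos, else None."""
    for kind, pattern in PATTERNS:
        m = re.compile(pattern).match(text, pos)
        if m:
            return kind, m.group()
    return None
```

Here match_word("count1 = 7") reports a name, count1, while match_word("42abc") reports the number 42. A blank matches none of the patterns, so match_word(" x") returns None.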



James Power 2002-11-29