Derivations and Parse Trees

Next: Leftmost and Rightmost Derivations Up: CONTEXT-FREE LANGUAGES Previous: BNF Notation Contents

Derivations and Parse Trees

Parsing is the process where we take a particular sequence of words and check to see if they correspond to the language defined by some grammar.

Basically what we have to do is to show how we can get:

from	the start symbol of the grammar
to	the sequence of words that supposedly belong to the language
by	using the production rules

We can do this by constructing the appropriate parse tree for the sequence of words, which is simply a graphical way of listing the production rules that we have used.

Considering the following simple grammar for arithmetic expressions:

$\begin{displaymath}\begin{array}{rcl} E & \rightarrow & E\; OP\; E\; \;\mid\; (\... ...ightarrow & + \mid\; -\; \mid\; *\; \mid\; \div \\ \end{array}\end{displaymath}$

Here there are just two non-terminals, and (where is the start symbol); the terminal symbols are: +, -, *, $\div$ , (, ), num.

Suppose the lexical analysis phase has given us the sequence of tokens which correspond to: (num + num) $\div$ num. We can parse this as follows:

$\begin{displaymath}\begin{array}{rcl\vert r} \hline \multicolumn{3}{c}{Rule\; Ap... ...f num} & {\bf ( num + num )}\; \div\; {\bf num} \\ \end{array}\end{displaymath}$

We say that derives (num + num) $\div$ num, and write:

$\rightarrow^*$ (num + num) $\div$ num

We can represent the above derivation graphically by means of a parse tree. The root of the tree is the start symbol , and its leaves are the terminal symbols in the sentence which has been derived. Each internal node is labelled with a non-terminal, and the children are the symbols obtained by applying one of the production rules.

=1.00mm
$\begin{picture}(95.00,100.00) \put(70.00,100.00){\makebox(0,0)[cc]{E}} \put(70.0... ...55.00,37){\vector(0,-1){13.5}} \put(40.00,37){\vector(0,-1){13.5}} \end{picture}$

There are two ways of constructing a parse tree whose root is the start symbol and whose leaves are the sentence:

Top-Down: As we did earlier, we start with the start symbol and apply the production rules (replacing l.h.s. with r.h.s.) until we derive the sentence
Bottom-Up: We start with the sentence, and apply the production rules in reverse (ie. replacing r.h.s. of the rule with the l.h.s.) until we derive the start symbol.

Both of these methods will be discussed in detail later.

Subsections

Leftmost and Rightmost Derivations

Next: Leftmost and Rightmost Derivations Up: CONTEXT-FREE LANGUAGES Previous: BNF Notation Contents

James Power 2002-11-29