Constructing LR(1) Item sets

Next: Parsers Up: Parsers Previous: Parsers Contents

Constructing LR(1) Item sets

In order to construct the sets of items for an parser, we just need to make three modifications:

The Initial Item
The initial item is now of the form $[S'\, \rightarrow\, {}_{\triangle} S,\; \bot]$ . That is we want to recognise an (or something corresponding to it) and reduce by this rule, on the understanding that the next symbol is the end of the input (ie. that there is no more input).
The move operation
As might be expected, ${\sf move}([A\, \rightarrow\, \alpha {}_{\triangle} B \gamma,\; a], B)$ = $[A\, \rightarrow\, \alpha B{}_{\triangle} \gamma,\; a]$ ; ie. move does not change the lookahead symbol
The closure Operation
Previously we had constructed the closure of the item $[A\, \rightarrow\, \alpha {}_{\triangle} B\gamma]$ by adding in all non-kernel items of the form $[B\, \rightarrow\, {}_{\triangle} \beta]$ . This will be the same for the item $[A\, \rightarrow\, \alpha {}_{\triangle} B \gamma , a]$ , but we need to compute a new lookahead set.

Since we are only interested in finding $\beta$ and reducing it to as a kind of a "sub-goal" of finding $B \gamma$ , we can state that after reducing to , the next input symbol should match whatever is derived from $\gamma$ ; ie. it should be in $FIRST(\gamma)$ . If $\gamma$ can derive $\varepsilon$ , then the next input symbol should match what had previously been expected to follow , ie. the lookahead symbol `a'.

Thus: ${\sf closure}([A\, \rightarrow\, \alpha {}_{\triangle} B \gamma , a])$ contains all items of the form $[B\, \rightarrow\, {}_{\triangle} \beta,\; b]$ , where `b' is a terminal in $FIRST(\gamma a)$

The algorithm for constructing a handle-recognising DFA is exactly the same as before, except that we use the above modifications.

One result of these changes is to increase the number of possible items, and also the number of likely states in the handle-recognising DFA; thus parsing tables for a grammar tend to be significantly larger than the or tables for the same grammar.

However, the parsing is improved, because now we can modify the reduction rule to get:

The LR(1) reduction rule:

Reduce by the rule $A\, \rightarrow\, \alpha_1 \ldots \alpha_n$ if and only if the item $[A\, \rightarrow\, \alpha_1 \ldots \alpha_n {}_{\triangle},\; a]$ is in the current state, and `a' is the next input symbol.

Since the number of `a' symbols may be smaller than , there can be less reductions in an parse table, and thus less potential conflicts.

Next: Parsers Up: Parsers Previous: Parsers Contents

James Power 2002-11-29