next up previous contents
Next: Parsers Up: Parsers Previous: Parsers   Contents

Constructing LR(1) Item sets

In order to construct the sets of items for an $LR(1)$ parser, we just need to make three modifications:

  1. The Initial Item
    The initial item is now of the form $[S'\, \rightarrow\,
{}_{\triangle} S,\; \bot]$. That is we want to recognise an $S$ (or something corresponding to it) and reduce by this rule, on the understanding that the next symbol is the end of the input (ie. that there is no more input).

  2. The move operation
    As might be expected, ${\sf move}([A\, \rightarrow\, \alpha
{}_{\triangle} B \gamma,\; a], B)$ = $[A\, \rightarrow\, \alpha
B{}_{\triangle} \gamma,\; a]$; ie. move does not change the lookahead symbol

  3. The closure Operation
    Previously we had constructed the closure of the $LR(0)$ item $[A\, \rightarrow\, \alpha {}_{\triangle} B\gamma]$ by adding in all non-kernel items of the form $[B\, \rightarrow\,
{}_{\triangle} \beta]$. This will be the same for the $LR(1)$ item $[A\, \rightarrow\, \alpha {}_{\triangle} B \gamma , a]$, but we need to compute a new lookahead set.

    Since we are only interested in finding $\beta$ and reducing it to $B$ as a kind of a "sub-goal" of finding $B \gamma$ , we can state that after reducing to $B$, the next input symbol should match whatever is derived from $\gamma$; ie. it should be in $FIRST(\gamma)$. If $\gamma$ can derive $\varepsilon$, then the next input symbol should match what had previously been expected to follow $A$, ie. the lookahead symbol `a'.

    Thus: ${\sf closure}([A\, \rightarrow\, \alpha {}_{\triangle} B
\gamma , a])$ contains all items of the form $[B\, \rightarrow\,
{}_{\triangle} \beta,\; b]$, where `b' is a terminal in $FIRST(\gamma a)$

The algorithm for constructing a handle-recognising DFA is exactly the same as before, except that we use the above modifications.

One result of these changes is to increase the number of possible items, and also the number of likely states in the handle-recognising DFA; thus $LR(1)$ parsing tables for a grammar tend to be significantly larger than the $LR(0)$ or $SLR(1)$ tables for the same grammar.

However, the parsing is improved, because now we can modify the $SLR(1)$ reduction rule to get:

The LR(1) reduction rule:

Since the number of `a' symbols may be smaller than $FOLLOW(A)$, there can be less reductions in an $LR(1)$ parse table, and thus less potential conflicts.


next up previous contents
Next: Parsers Up: Parsers Previous: Parsers   Contents
James Power 2002-11-29