Tomita's Parser

Next: CYK Method Up: BOTTOM-UP PARSING Previous: Parsers Contents

Tomita's Parser

Tomita's Algorithm amounts to a simple generalisation of the standard LR algorithm. The LR parse-table is constructed as usual, and we proceed with the parsing algorithm in the normal way.

In order to keep a record of the parse-state, we maintain a stack consisting of symbol/state pairs. This is maintained as follows:

Each time we make a shift action we push the symbol and the new state onto the stack.
A reduce action for some rule whose right-hand-side has length involves popping symbol/state pairs off the stack, and then pushing the symbol and state corresponding to the rule's left-hand-side.

Thus at any stage during the parse, the current state will be on top of the stack. When we meet a conflict arising from an inadequate state, we simply duplicate the stack, and split the parse into a different process for each stack. The input string is accepted if any of the parallel parsings terminate successfully.

In general the effectiveness of this algorithm is proportional to the ``strength'' of the initial parse table; e.g. a Tomita Parser constructed from an LALR(1) table can expect to be more efficient than one constructed from an LR(0) table.

Next: CYK Method Up: BOTTOM-UP PARSING Previous: Parsers Contents

James Power 2002-11-29