We have shown how to convert a regular expression to an NFA; the proof that
these represent the same language is straightforward (by induction). However,
we have asserted that there is a correspondence in both directions; this
is formalised in Kleene's Theorem:
KLEENE'S THEOREM, PART 1: To each regular expression there corresponds a NFA
Strategy: The corresponding NFA is the one constructed by Thompson's Algorithm;
proof that it is equivalent is by induction over regular expressions.
Proof: Boring!
KLEENE'S THEOREM, PART 2: To each NFA there corresponds a
regular expression
Strategy: Proof by induction over the states of
the NFA (sort of).
The language represented by a NFA can be
partitioned into the union of a number of smaller languages, each
defined as follows:
Let the states of the NFA be numbered from 1 to .
Let and be states, and let be a number such that .
Then:
Equally, if we can show that has a corresponding regular
expression for all , and , we will have proved the theorem.
The proof of this will proceed by induction over .
Proof: (Kleene's Theorem, part 2)
Suppose the machine consumes some string ; we must find a regular expression corresponding to . Now suppose also that it passes through state on its way (if it doesn't, then is in , which has a corresponding regular expression by the induction hypothesis). Since the machine can ``loop'' on arbitrarily many times, we can split up into three substrings:
By the induction hypothesis each of these have corresponding regular expressions, say A, B and C respectively, and thus the regular expression for is A(B*)C.
Since our choice of was arbitrary, we have proved the theorem.