We have shown how to convert a regular expression to an NFA; the proof that
these represent the same language is straightforward (by induction). However,
we have asserted that there is a correspondence in both directions; this
is formalised in Kleene's Theorem:
KLEENE'S THEOREM, PART 1: To each regular expression there corresponds a NFA
Strategy: The corresponding NFA is the one constructed by Thompson's Algorithm;
proof that it is equivalent is by induction over regular expressions.
Proof: Boring!
KLEENE'S THEOREM, PART 2: To each NFA there corresponds a
regular expression
Strategy: Proof by induction over the states of
the NFA (sort of).
The language represented by a NFA can be
partitioned into the union of a number of smaller languages, each
defined as follows:
Let the states of the NFA be numbered from 1 to.
Letand
be states, and let
be a number such that
.
Then:
Equally, if we can show that has a corresponding regular
expression for all
,
and
, we will have proved the theorem.
The proof of this will proceed by induction over
.
Proof: (Kleene's Theorem, part 2)
Suppose the machine consumes some string
; we must find a
regular expression corresponding to
. Now suppose also that it
passes through state
on its way (if it doesn't, then
is in
, which has a corresponding regular expression by the induction
hypothesis). Since the machine can ``loop'' on
arbitrarily many times,
we can split
up into three substrings:
By the induction hypothesis each of these have corresponding regular
expressions, say A, B and C respectively, and thus
the regular expression for is A(B*)C.
Since our choice of was arbitrary, we have proved the theorem.