Alphabet

The 'C' language uses a subset of the characters available in the ASCII set, and different parts of a program allow different subsets of this alphabet. A distinction must be made between the set of characters used to write source programs and the set used in the target environment. In particular, the target environment's set must include the NUL character, i.e. the one with all bits set to 0. The source alphabet does not include NUL; it exists only as a data value.

In 'C', both upper and lower case letters are in the source alphabet, and the two cases are treated as completely distinct. Thus 'a' is different from 'A'. The underscore character, '_', is considered to be a letter, so there are 53 letters in all (26 upper case, 26 lower case, and the underscore).

The 10 digits '0' through '9' are part of the source alphabet.

The source alphabet also contains the following 28 printable characters:

	! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ { | } ~

Lastly, the source alphabet contains the following white space characters, i.e. the space and four non-printable control characters:

	SP  (space)
	HT  (horizontal tab)
	FF  (form feed)
	VT  (vertical tab)
	LF  (line feed or new line)

The remaining ASCII codes, as well as the byte values 128 to 255, can be used by programs as data, i.e. in the target environment or alphabet. If one of them appears in a source file where the compiler expects part of a statement, the compiler will report an error.