Tokens are the indivisible, atomic lexical units from which syntactic structures are built; they are the minimal meaningful elements of the language. Syntactic structures are grammatically correct sequences of one or more tokens, that is, sequences that follow the language's allowable patterns.
The lexical part of the compiler scans the source stream, collecting at each step the longest sequence of characters that could possibly form a token, and passes the resulting sequence of tokens to the syntactic part of the compiler for checking and translation to machine language.
A Token is one of the following:
keyword identifier constant string_literal operator punctuator
White space (including comments) may not appear within the sequence of characters constituting any multiple-character token (e.g., a keyword, identifier, constant, or operator); doing so would break the token into two or more tokens. Conversely, white space between tokens separates them.
Anywhere in the source program, the pair of characters consisting of a backslash, '\', immediately followed by a new-line character is deleted, splicing the line with the next; this indicates logical continuation of the current input line.