[lex.pptoken]

A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.

In this document, glyphs are used to identify elements of the basic character set ([lex.charset]).

The categories of preprocessing token are: header names, placeholder tokens produced by preprocessing import and module directives (import-keyword, module-keyword, and export-keyword), identifiers, preprocessing numbers, character literals (including user-defined character literals), string literals (including user-defined string literals), preprocessing operators and punctuators, and single non-whitespace characters that do not lexically match the other preprocessing token categories.

If a U+0027 apostrophe or a U+0022 quotation mark character matches the last category, the program is ill-formed.

If any character not in the basic character set matches the last category, the program is ill-formed.

Preprocessing tokens can be separated by whitespace; this consists of comments ([lex.comment]), or whitespace characters (U+0020 space, U+0009 character tabulation, new-line, U+000b line tabulation, and U+000c form feed), or both.
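
[Example: The following is a minimal sketch of a comment acting as the only separator between two preprocessing tokens. The names `STR`, `answer`, and `check_comment_separator` are hypothetical and chosen only for illustration. The second check assumes the usual stringizing behavior described in [cpp.stringize], together with the replacement of each comment by one space in translation phase 3.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper macro: yields the spelling of its argument as a string literal.
#define STR(x) #x

// The comment is the only thing separating the tokens `int` and `answer`.
int/* comment acting as whitespace */answer = 42;

// The comment between `a` and `b` counts as whitespace, so two identifiers
// are formed and the stringized spelling has one space between them.
static_assert(std::string_view(STR(a/* comment */b)) == "a b");

// Runtime check that the declaration above really declared `answer`.
void check_comment_separator() {
    assert(answer == 42);
}
```

Without the comment, `intanswer` would be lexed as a single identifier. end example]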

As described in [cpp], in certain circumstances during translation phase 4, whitespace (or the absence thereof) serves as more than preprocessing token separation.
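
[Example: Two such circumstances are sketched below: whitespace between a macro name and a following `(` in a `#define` directive decides whether the macro is function-like or object-like, and whitespace inside a stringized argument is normalized. The macro names `STR`, `XSTR`, `FUNC_LIKE`, and `OBJ_LIKE`, and the function `check_func_like`, are hypothetical. The expected spellings assume the stringizing rules of [cpp.stringize].

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helpers: STR stringizes its argument as written,
// XSTR macro-expands its argument first and then stringizes the result.
#define STR(x) #x
#define XSTR(x) STR(x)

// No whitespace before '(': a function-like macro with parameter `a`.
#define FUNC_LIKE(a) a + 1
// Whitespace before '(': an object-like macro whose replacement list
// is the token sequence `(a) a + 1`.
#define OBJ_LIKE (a) a + 1

static_assert(std::string_view(XSTR(FUNC_LIKE(2))) == "2 + 1");
static_assert(std::string_view(XSTR(OBJ_LIKE)) == "(a) a + 1");

// Stringizing drops leading and trailing whitespace and turns each run of
// whitespace between preprocessing tokens into a single space.
static_assert(std::string_view(STR(   x    +    y   )) == "x + y");

// Runtime check: the function-like macro substitutes its argument.
void check_func_like() {
    assert(FUNC_LIKE(2) == 3);
}
```

The only difference between the two `#define` lines is one space, yet they define macros of different kinds. end example]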

Whitespace can appear within a preprocessing token only as part of a header name or between the quotation characters in a character literal or string literal.
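
[Example: As a minimal sketch of this rule, whitespace between quotation marks belongs to the literal, whereas whitespace placed inside what would otherwise be an operator splits it into two tokens. The names `STR`, `unary_plus_twice`, and `check_unary_plus_twice` are hypothetical. The stringized spellings assume the rules of [cpp.stringize].

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper macro: yields the spelling of its argument as a string literal.
#define STR(x) #x

// Both spaces sit between the quotation marks, so they are part of the
// string literal: 'a', ' ', ' ', 'b', and the terminating null character.
static_assert(sizeof("a  b") == 5);

// Whitespace inside a literal survives stringizing unchanged, while the run
// of whitespace between the two literals collapses to a single space.
static_assert(std::string_view(STR("a  b"    "c")) == "\"a  b\" \"c\"");

// `+ +` is two `+` tokens, not the single token `++`.
static_assert(std::string_view(STR(+ +)) == "+ +");
static_assert(std::string_view(STR(++)) == "++");

// Two unary-plus operators applied to `v`; nothing is incremented.
int unary_plus_twice(int v) { return + +v; }

void check_unary_plus_twice() {
    assert(unary_plus_twice(7) == 7);
}
```

end example]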