
5 Lexical conventions [lex]

5.4 Preprocessing tokens [lex.pptoken]

preprocessing-token:
    header-name
    import-keyword
    module-keyword
    export-keyword
    identifier
    pp-number
    character-literal
    user-defined-character-literal
    string-literal
    user-defined-string-literal
    preprocessing-op-or-punc
    each non-white-space character that cannot be one of the above

Each preprocessing token that is converted to a token shall have the lexical form of a keyword, an identifier, a literal, or an operator or punctuator.

A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.

The categories of preprocessing token are: header names, placeholder tokens produced by preprocessing import and module directives (import-keyword, module-keyword, and export-keyword), identifiers, preprocessing numbers, character literals (including user-defined character literals), string literals (including user-defined string literals), preprocessing operators and punctuators, and single non-white-space characters that do not lexically match the other preprocessing token categories.

If a ' or a " character matches the last category, the behavior is undefined.

Preprocessing tokens can be separated by white space; this consists of comments, or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both.

As described in [cpp], in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation.
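One well-known case where white space matters in phase 4 is the difference between a function-like and an object-like macro definition. The sketch below is illustrative only; the macro names FUNC and OBJ and the helper functions are hypothetical and do not come from the text.

```cpp
#include <cassert>

// No white space between the macro name and '(':
// FUNC is a function-like macro with one parameter, x.
#define FUNC(x) ((x) * 2)

// White space between the macro name and '(':
// OBJ is an object-like macro whose replacement list is "(x) * 2".
#define OBJ (x) * 2

// Hypothetical helper: invokes the function-like macro with argument 21.
int func_result() { return FUNC(21); }

// Hypothetical helper: OBJ expands to "(x) * 2", which refers to the local x.
int obj_result() {
    int x = 5;
    return OBJ;
}
```

Under these definitions, func_result() evaluates ((21) * 2) and obj_result() evaluates (x) * 2 with the local x equal to 5.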

White space can appear within a preprocessing token only as part of a header name or between the quotation characters in a character literal or string literal.

If the input stream has been parsed into preprocessing tokens up to a given character:

[ Example:

    #define R "x"
    const char* s = R"y";    // ill-formed raw string, not "x" "y"

end example ]
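For contrast with the ill-formed fragment in the example, the following sketch shows two well-formed variants. It assumes a macro R defined as in the example; the variable names raw and joined and the checking function are hypothetical.

```cpp
#include <cassert>
#include <cstring>

#define R "x"

// R immediately followed by a double quote starts a raw string literal.
// The whole of R"(y)" is one preprocessing token, so the macro R is not expanded.
const char* raw = R"(y)";

// With white space in between, R is an identifier token. It is macro-expanded
// to "x", and the adjacent string literals "x" "y" are then concatenated.
const char* joined = R "y";

// Hypothetical check of the two results.
void check_raw_string_lexing() {
    assert(std::strcmp(raw, "y") == 0);
    assert(std::strcmp(joined, "xy") == 0);
}
```

The only difference between the two declarations is the white space after R, which decides whether the characters form a raw string literal or an identifier followed by an ordinary string literal.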

The import-keyword is produced by processing an import directive ([cpp.import]), the module-keyword is produced by preprocessing a module directive ([cpp.module]), and the export-keyword is produced by preprocessing either of the previous two directives.

[ Note: None has any observable spelling. end note ]

[ Example:

The program fragment 0xe+foo is parsed as a preprocessing number token (one that is not a valid integer-literal or floating-point-literal token), even though a parse as three preprocessing tokens 0xe, +, and foo might produce a valid expression (for example, if foo were a macro defined as 1).

Similarly, the program fragment 1E1 is parsed as a preprocessing number (one that is a valid floating-point-literal token), whether or not E is a macro name.

end example ]
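The behavior in this example follows from the shape of a preprocessing number, which can absorb a sign when it directly follows e, E, p, or P. The following is a simplified sketch of a greedy pp-number scanner, not a conforming lexer: the function name pp_number_length and its helpers are hypothetical, and identifier characters are assumed to be limited to ASCII letters, digits, and underscore.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplifying assumption: only ASCII digits and ASCII letters/underscore are recognized.
bool is_digit(char c) { return c >= '0' && c <= '9'; }
bool is_nondigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

// Returns the length of the longest prefix of s that has the form of a
// preprocessing number, or 0 if s does not start with one.
std::size_t pp_number_length(const std::string& s) {
    std::size_t i = 0;
    if (!s.empty() && is_digit(s[0])) {
        i = 1;                                   // digit
    } else if (s.size() >= 2 && s[0] == '.' && is_digit(s[1])) {
        i = 2;                                   // . digit
    } else {
        return 0;
    }
    while (i < s.size()) {
        const char c = s[i];
        const bool has_next = i + 1 < s.size();
        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') && has_next &&
            (s[i + 1] == '+' || s[i + 1] == '-')) {
            i += 2;                              // exponent letter followed by a sign
        } else if (c == '\'' && has_next &&
                   (is_digit(s[i + 1]) || is_nondigit(s[i + 1]))) {
            i += 2;                              // ' followed by a digit or nondigit
        } else if (is_digit(c) || is_nondigit(c) || c == '.') {
            i += 1;                              // digit, letter, underscore, or period
        } else {
            break;                               // anything else ends the token
        }
    }
    return i;
}
```

With this sketch, pp_number_length("0xe+foo") covers all seven characters because the + directly follows e, whereas pp_number_length("0xf+foo") stops after 0xf.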

[ Example:

The program fragment x+++++y is parsed as x++ ++ + y, which, if x and y have integral types, violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.

end example ]
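As a small illustration of the alternative parse mentioned in this example, white space can be used to force the intended token boundaries. The function name and the initial values below are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: white space separates the tokens as x ++ + ++ y.
int sum_with_spaces() {
    int x = 1;
    int y = 2;
    // x++ yields the old value of x; ++y yields the incremented value of y.
    return x++ + ++y;
}

// Without the white space, the fragment x+++++y would be split greedily as
// x ++ ++ + y, and the second ++ would be applied to the non-lvalue result
// of x++, so that line is left commented out here:
// int bad = x+++++y;
```

Here sum_with_spaces() adds the old value of x (1) to the incremented value of y (3).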