| 2. |
If the first translation character is U+feff byte order mark, it is deleted. Each sequence comprising a backslash character (\) immediately followed by zero or more whitespace characters other than new-line followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. [Note 2: Line splicing can form a universal-character-name ([lex.charset]). — _end note_] A source file that is not empty and that (after splicing) does not end in a new-line character shall be processed as if an additional new-line character were appended to the file. |
| 3. |
The source file is decomposed into preprocessing tokens ([lex.pptoken]) and sequences of whitespace characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment.[9](#footnote-9 "A partial preprocessing token would arise from a source file ending in the first portion of a multi-character token that requires a terminating sequence of characters, such as a header-name that is missing the closing " or >. A partial comment would arise from a source file ending with an unclosed /* comment.") Each comment ([lex.comment]) is replaced by one U+0020 space character. New-line characters are retained. Whether each nonempty sequence of whitespace characters other than new-line is retained or replaced by one U+0020 space character is unspecified. As characters from the source file are consumed to form the next preprocessing token (i.e., not being consumed as part of a comment or other forms of whitespace), except when matching a c-char-sequence, s-char-sequence, r-char-sequence, h-char-sequence, or q-char-sequence, universal-character-names are recognized ([lex.universal.char]) and replaced by the designated element of the translation character set ([lex.charset]). The process of dividing a source file's characters into preprocessing tokens is context-dependent. [Example 1: See the handling of < within a #include preprocessing directive ([lex.header], [cpp.include]). — _end example_] |
| 4. |
The source file is analyzed as a preprocessing-file ([cpp.pre]). Preprocessing directives ([cpp]) are executed, macro invocations are expanded ([cpp.replace]), and _Pragma unary operator expressions are executed ([cpp.pragma.op]). A #include preprocessing directive ([cpp.include]) causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted. Whitespace characters separating preprocessing tokens are no longer significant. |
| 5. |
For a sequence of two or more adjacent string-literal preprocessing tokens, a common encoding-prefix is determined as specified in [lex.string]. Each such string-literal preprocessing token is then considered to have that common encoding-prefix. |
| 6. |
Adjacent string-literal preprocessing tokens are concatenated ([lex.string]). |
| 7. |
Each preprocessing token is converted into a token ([lex.token]). The resulting tokens constitute a translation unit and are syntactically and semantically analyzed as a translation-unit ([basic.link]) and translated. [Note 3: The process of analyzing and translating the tokens can occasionally result in one token being replaced by a sequence of other tokens ([temp.names]). — _end note_] It is implementation-defined whether the sources for module units and header units on which the current translation unit has an interface dependency ([module.unit], [module.import]) are required to be available. [Note 4: Source files, translation units and translated translation units need not necessarily be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation. The description is conceptual only, and does not specify any particular implementation. — _end note_] [Note 5: Previously translated translation units can be preserved individually or in libraries. The separate translation units of a program communicate ([basic.link]) by (for example) calls to functions whose names have external or module linkage, manipulation of variables whose names have external or module linkage, or manipulation of data files. — _end note_] While the tokens constituting translation units are being analyzed and translated, required instantiations are performed. [Note 6: This can include instantiations which have been explicitly requested ([temp.explicit]). — _end note_] The contexts from which instantiations may be performed are determined by their respective points of instantiation ([temp.point]). [Note 7: Other requirements in this document can further constrain the context from which an instantiation can be performed. For example, a constexpr function template specialization might have a point of instantiation at the end of a translation unit, but its use in certain constant expressions could require that it be instantiated at an earlier point ([temp.inst]). — _end note_] Each instantiation results in new program constructs. The program is ill-formed if any instantiation fails. During the analysis and translation of tokens, certain expressions are evaluated ([expr.const]). Constructs appearing at a program point P are analyzed in a context where each side effect of evaluating an expression E as a full-expression is complete if and only if (1.7.1) E is the expression corresponding to a consteval-block-declaration ([dcl.pre]), and (1.7.2) either that consteval-block-declaration or the template definition from which it is instantiated is reachable from ([module.reach]) (1.7.2.1) P, or (1.7.2.2) the point immediately following the class-specifier of the outermost class for which P is in a complete-class context ([class.mem.general]). [Example 2: class S { class Incomplete; class Inner { void fn() { Incomplete i; } }; consteval { define_aggregate(^^Incomplete, {}); } }; Constructs at are analyzed in a context where the side effect of the call to define_aggregate is evaluated because (1.7.3) E is the expression corresponding to a consteval block, and (1.7.4) is in a complete-class context of S and the consteval block is reachable from . — _end example_] |
| 8. |
Translated translation units are combined, and all external entity references are resolved. Library components are linked to satisfy external references to entities not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment. |
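
As an informal illustration of phases 2 through 6 (not part of the normative text), the following sketch shows line splicing, comment replacement, macro expansion, and string-literal concatenation cooperating in one small translation unit. The names `SPLICED_SUM`, `spliced_sum`, `STR`, `STR_IMPL`, `VERSION`, and `version_tag` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstring>

// Phase 2: the backslash directly before the new-line is deleted together
// with the new-line, so the macro body is the single logical line `40 + 2`.
#define SPLICED_SUM 40 + \
                    2

// Phase 3: the comment is replaced by one space character, so `int` and
// `spliced_sum` remain two separate preprocessing tokens.
int/* replaced by a space */spliced_sum() { return SPLICED_SUM; }

// Phase 4: macro invocations are expanded; STR(VERSION) becomes "7".
// (Hypothetical helper macros; the two-level form expands VERSION first.)
#define STR_IMPL(x) #x
#define STR(x) STR_IMPL(x)
#define VERSION 7

// Phases 5 and 6: the adjacent string-literals "v" and "7" are concatenated,
// which is only possible because phase 4 has already produced "7".
const char* version_tag() { return "v" STR(VERSION); }

// Checked in phase 7, once the tokens are analyzed and translated.
static_assert(sizeof("v" STR(VERSION)) == 3,
              "two characters plus the terminating null");
```

Under these assumptions, `spliced_sum()` returns 42 and `version_tag()` yields the string `v7`. The ordering is the point of the sketch: concatenation in phase 6 sees the result of macro expansion in phase 4, never the unexpanded macro invocation.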
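
The instantiation rules in phase 7 can be sketched informally as follows. This is a minimal example under stated assumptions, not normative text; the function template `twice` is hypothetical and exists only for illustration.

```cpp
#include <cassert>

// A function template: no specialization exists until one is required.
template <typename T>
constexpr T twice(T value) { return value + value; }

// Using twice<int> in a constant expression requires it to be instantiated
// at this point, in the spirit of Note 7, rather than only at the end of
// the translation unit.
static_assert(twice(21) == 42, "twice<int> must be instantiated here");

// An explicitly requested instantiation ([temp.explicit]), as in Note 6.
template double twice<double>(double);

// A failing instantiation would make the program ill-formed. For example,
// with a type that has no operator+, the following would not compile:
//   struct NoPlus {};
//   NoPlus bad = twice(NoPlus{});
```

Here `twice<int>` is instantiated implicitly because the `static_assert` needs its value, while `twice<double>` is instantiated because the explicit instantiation requests it.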