| 2. |
Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. Except for splices reverted in a raw string literal, if a splice results in a character sequence that matches the syntax of a universal-character-name, the behavior is undefined. A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file. |
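A minimal sketch of line splicing, assuming a C++17 compiler. The names `SPLICED_SUM` and `spliced_text` are hypothetical and chosen only for illustration.

```cpp
#include <string_view>

// The backslash immediately before the new-line is deleted together with
// the new-line, so the macro definition below is one logical source line.
#define SPLICED_SUM(a, b) \
    ((a) + (b))

// Splicing also applies inside an ordinary string literal. The literal
// resumes at the first column of the next physical line, so no extra
// whitespace is introduced.
constexpr std::string_view spliced_text = "phase \
two";

static_assert(SPLICED_SUM(2, 3) == 5);
static_assert(spliced_text == "phase two");
```

Any whitespace between the backslash and the new-line would prevent the splice, because the backslash must be immediately followed by the new-line character.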
| 3. |
The source file is decomposed into preprocessing tokens ([lex.pptoken]) and sequences of whitespace characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment.[10](#footnote-10 "A partial preprocessing token would arise from a source file ending in the first portion of a multi-character token that requires a terminating sequence of characters, such as a header-name that is missing the closing " or >. A partial comment would arise from a source file ending with an unclosed /* comment.") Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of whitespace characters other than new-line is retained or replaced by one space character is unspecified. The process of dividing a source file's characters into preprocessing tokens is context-dependent. [Example 1: See the handling of < within a #include preprocessing directive. — _end example_] |
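The replacement of each comment by one space character can be observed through the stringizing operator. This is an illustrative sketch assuming a C++17 compiler; `STRINGIZE`, `with_comment` and `with_spaces` are hypothetical names.

```cpp
#include <string_view>

#define STRINGIZE(x) #x

// The comment is replaced by one space character, so `a` and `b` remain
// two separate preprocessing tokens rather than merging into `ab`.
constexpr std::string_view with_comment = STRINGIZE(a/* gone */b);

// Stringizing reduces any whitespace between the argument's preprocessing
// tokens to a single space, so a long run of blanks gives the same result.
constexpr std::string_view with_spaces = STRINGIZE(a      b);

static_assert(with_comment == "a b");
static_assert(with_spaces == "a b");
```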
| 4. |
Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal-character-name is produced by token concatenation, the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted. |
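A short sketch of what directive execution and macro expansion leave behind for later phases, assuming a C++17 compiler. The macro and variable names (`CONCAT`, `SQUARE`, `phase4`, `compiled_as_cpp`) are hypothetical.

```cpp
// Token concatenation joins `phase` and `4` into the single identifier
// `phase4` during macro expansion.
#define CONCAT(a, b) a##b
#define SQUARE(x) ((x) * (x))

constexpr int CONCAT(phase, 4) = SQUARE(2);  // declares `phase4`

// Conditional directives are evaluated here. Only the selected branch
// survives, and the directives themselves are then deleted.
#ifdef __cplusplus
constexpr bool compiled_as_cpp = true;
#else
constexpr bool compiled_as_cpp = false;
#endif

static_assert(phase4 == 4);
static_assert(compiled_as_cpp);
```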
| 5. |
Each basic source character set member in a character-literal or a string-literal, as well as each escape sequence and universal-character-name in a character-literal or a non-raw string literal, is converted to the corresponding member of the execution character set ([lex.ccon], [lex.string]); if there is no corresponding member, it is converted to an implementation-defined member other than the null (wide) character.11 |
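The difference between non-raw and raw string literals in this phase can be sketched as follows, assuming a C++17 compiler. The array names `escaped` and `raw` are hypothetical.

```cpp
// In a non-raw literal the escape sequence \n is converted to a single
// execution character: one element plus the terminating null character.
constexpr char escaped[] = "\n";

// In a raw string literal the same two source characters are kept as a
// backslash followed by `n`: two elements plus the terminating null.
constexpr char raw[] = R"(\n)";

static_assert(sizeof(escaped) == 2);
static_assert(sizeof(raw) == 3);
static_assert(raw[0] == '\\' && raw[1] == 'n');

// Octal and hexadecimal escape sequences name the same value here.
static_assert('\101' == '\x41');
```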
| 6. |
Adjacent string literal tokens are concatenated. |
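A minimal sketch of concatenation, assuming a C++17 compiler. The names `joined` and `two_chars` are hypothetical. The second case illustrates a consequence of the ordering described here: escape sequences have already been converted when the literals are joined, so a hexadecimal escape cannot absorb characters from the following literal.

```cpp
#include <string_view>

// Two adjacent string literal tokens become one string literal.
constexpr std::string_view joined = "trans" "lation";
static_assert(joined == "translation");

// "\x12" was converted to a single character before concatenation, so the
// result holds the characters '\x12' and '3' plus the terminating null.
// It is not the single escape sequence \x123.
constexpr char two_chars[] = "\x12" "3";
static_assert(sizeof(two_chars) == 3);
static_assert(two_chars[0] == 0x12 && two_chars[1] == '3');
```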
| 7. |
White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token ([lex.token]). The resulting tokens are syntactically and semantically analyzed and translated as a translation unit. [Note 1: The process of analyzing and translating the tokens can occasionally result in one token being replaced by a sequence of other tokens ([temp.names]). — _end note_] It is implementation-defined whether the sources for module units and header units on which the current translation unit has an interface dependency ([module.unit], [module.import]) are required to be available. [Note 2: Source files, translation units and translated translation units need not necessarily be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation. The description is conceptual only, and does not specify any particular implementation. — _end note_] |
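Note 1 can be illustrated with the `>>` token that closes nested template argument lists. This is a sketch assuming a C++17 compiler; `Nested` and `Value` are hypothetical names.

```cpp
#include <type_traits>
#include <vector>

// The single token `>>` is treated as two consecutive `>` tokens here,
// each closing one template argument list.
using Nested = std::vector<std::vector<int>>;
static_assert(std::is_same_v<Nested::value_type, std::vector<int>>);

template <int N>
struct Value {
    static constexpr int value = N;
};

// Inside parentheses `>>` keeps its meaning as the right-shift operator.
static_assert(Value<(8 >> 2)>::value == 2);
```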
| 8. |
Translated translation units and instantiation units are combined as follows: [Note 3: Some or all of these can be supplied from a library. — _end note_] Each translated translation unit is examined to produce a list of required instantiations. [Note 4: This can include instantiations which have been explicitly requested ([temp.explicit]). — _end note_] The definitions of the required templates are located. It is implementation-defined whether the source of the translation units containing these definitions is required to be available. [Note 5: An implementation can choose to encode sufficient information into the translated translation unit so as to ensure the source is not required here. — _end note_] All the required instantiations are performed to produce instantiation units. [Note 6: These are similar to translated translation units, but contain no references to uninstantiated templates and no template definitions. — _end note_] The program is ill-formed if any instantiation fails. |
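The two kinds of required instantiation mentioned above, explicitly requested and implicitly required, can be sketched as follows, assuming a C++17 compiler. The function template `twice` is hypothetical.

```cpp
template <typename T>
constexpr T twice(T v) {
    return v + v;
}

// Explicit instantiation definition ([temp.explicit]): the specialization
// twice<int> is required even if nothing in this translation unit calls it.
template int twice<int>(int);

// Implicit instantiation: using twice with a double argument places
// twice<double> on the list of required instantiations.
static_assert(twice(2.5) == 5.0);
static_assert(twice<int>(21) == 42);
```

A use such as `twice<void*>(nullptr)` would make the program ill-formed, because instantiating the body requires adding two pointers, which fails.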
| 9. |
All external entity references are resolved. Library components are linked to satisfy external references to entities not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment. |