Allow integer suffixes starting with e. by nnethercote · Pull Request #111628 · rust-lang/rust (original) (raw)

Integers with arbitrary suffixes are allowed as inputs to proc macros. A number of real-world crates use this capability in interesting ways, as seen in #103872. For example:

The hex cases may be surprising.

A proc macro will immediately stringify such tokens and reparse them itself, and so won't care that the token types vary. All suffixes must be consumed by the proc macro, of course; the only suffixes allowed after macro expansion are the numeric ones like u8, i32, and f64.

Currently there is an annoying inconsistency in how integer literal suffixes are handled, which is that no suffix starting with e is allowed, because that it interpreted as a float literal with an exponent. For example:

In each case, a sequence of digits followed by an 'e' or 'E' followed by a letter results in an "expected at least one digit in exponent" error. This is an annoying inconsistency in general, and a problem in practice. It's likely that some users haven't realized this inconsistency because they've gotten lucky and never used a token with an 'e' that causes problems. Other users have noticed; it's causing problems when embedding DSLs into proc macros, as seen in #111615, where the CSS colours case is causing problems for two different UI frameworks (Slint and Makepad).

We can do better. This commit changes the lexer so that, when it hits a possible exponent, it looks ahead and only produces an exponent if a valid one is present. Otherwise, it produces a non-exponent form, which may be a single token (e.g. 1eV) or multiple tokens (e.g. 1e+a).

Consequences of this:

This is a backwards compatible language change: all existing valid programs will be treated in the same way, and some previously invalid programs will become valid. The tokens chapter of the language reference (https://doc.rust-lang.org/reference/tokens.html) will need changing to account for this. In particular, the "Reserved forms similar to number literals" section will need updating, and grammar rules involving the SUFFIX_NO_E nonterminal will need adjusting.

Fixes #111615.

r? @ghost