[RFC] Floating-point literals in LLVM IR

I’d like to propose several changes to the various kinds of float literals that are available in LLVM IR. Specifically, I’d like to do the following:

Add support for hexadecimal literals in the style of printf’s %a modifier (i.e., something like 0x1.0p-23)
Add support for NaN and infinity literals (as proposed in llvm/llvm-project pull request #102790, "[LLParser] Support identifiers like `nan` and `pinf` for special FP values", by mshockwave). This would allow the specification of arbitrary payloads for NaNs.
Remove the existing hexadecimal formats, which use 0x_, where _ is a code that depends on the type you’re trying to represent (e.g., K for x86_fp80, M for ppc_fp128).
Enforce the rule that literals that aren’t exact are invalid.

Motivation

The obvious motivation for these changes is that hexadecimal floating literals, along with nan and inf, are simply more readable than the current hexadecimal format, especially if you haven’t memorized the hex bitcast-to-int representation of common doubles like 1.0. However, that actually isn’t my main reason for doing this.
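To make the readability point concrete, here is a small Python sketch (Python only because it has `%a`-style conversions built in; nothing here is LLVM code) that prints 1.0 both as its bitcast-to-int pattern and as a hexadecimal floating literal. The helper name `double_bits_hex` is made up for illustration.

```python
import struct

def double_bits_hex(x: float) -> str:
    """Return the IEEE 754 binary64 bit pattern of x as a 16-digit hex integer."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return f"0x{bits:016X}"

# 1.0 spelled as a bitcast-to-int pattern versus as a %a-style literal.
one_as_bits = double_bits_hex(1.0)
one_as_hexfloat = (1.0).hex()

# The example literal from the proposal parses to exactly 2**-23.
tiny = float.fromhex("0x1.0p-23")

print(one_as_bits)       # 0x3FF0000000000000
print(one_as_hexfloat)   # 0x1.0000000000000p+0
print(tiny == 2.0 ** -23)
```

Python always prints all 13 fraction digits; a printer for IR could of course trim trailing zeros.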

At present, when LLLexer is lexing the input, a floating literal is converted into an APFloat token, which needs to choose which format is being used. When there is a decimal literal, the double format is used (as the lexer doesn’t have the information of the type being parsed, only the parser does); the weird hexadecimal formats use the codes to select between the different formats. The parser then converts the double-based APFloat to the appropriate type, and only here checks for exactness of conversion (so double 0.1 is legal but float 0.1 is not). It is this parse-to-double-then-convert-to-T that I am most concerned with changing, especially because I am working on adding support for decimal floating point types in LLVM IR (as part of RFC: Decimal floating-point support (ISO/IEC TS 18661-2 and C23)).

The simple fix is to lex a floating literal as a string and have the parser itself convert the string to the desired floating-point type. The parser can then catch illegal (overflowing, underflowing, or inexact) conversions in the process, whether or not the value is exactly representable as a double.

On string representations

For the standard IEEE 754 binary types, and bfloat, all finite values have unique representations, and these representations are concise in the hexadecimal floating-point literal. There are of course multiple NaN values, but the proposal for NaN support would include the ability to specify the NaN payload, so every representable value in these types has a distinct string that can represent it.
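As an illustration of why a payload is enough to give every NaN its own spelling, the following Python sketch splits a binary64 NaN bit pattern into sign, quiet bit, and remaining payload. Treating the payload as the 51 bits below the quiet bit is my assumption for this sketch; the proposal text does not pin down whether the quiet bit is part of the written payload. The names `decode_double_nan` and `as_double` are hypothetical.

```python
import math
import struct

QUIET_BIT = 1 << 51              # most significant bit of the 52-bit fraction field
FRACTION_MASK = (1 << 52) - 1

def decode_double_nan(bits: int):
    """Split a binary64 NaN bit pattern into (sign, is_quiet, payload)."""
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & FRACTION_MASK
    if exponent != 0x7FF or fraction == 0:
        raise ValueError("not a NaN encoding")
    sign = bits >> 63
    # Assumption: the payload is everything below the quiet bit.
    return sign, bool(fraction & QUIET_BIT), fraction & (QUIET_BIT - 1)

def as_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

preferred_qnan = 0x7FF8000000000000   # positive, quiet bit set, zero payload
other_qnan = 0x7FF8000000000001       # same sign and quiet bit, payload 1

print(decode_double_nan(preferred_qnan))   # (0, True, 0)
print(decode_double_nan(other_qnan))       # (0, True, 1)
print(math.isnan(as_double(preferred_qnan)), math.isnan(as_double(other_qnan)))
```

Both bit patterns are NaNs, and the (sign, quiet, payload) triple tells them apart, so a literal carrying those three pieces of information identifies the value uniquely.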

These properties do not hold for x86_fp80, ppc_fp128, and the IEEE 754 decimal types (considering decimal literals in lieu of hexadecimal literals).

In the case of x86_fp80, the numbers have an explicit integer bit which, if set incorrectly, results in an invalid value. The behavior of such values currently falls into what Nikita, in his recent keynote, called the undecided semantics of LLVM. For pseudo-denormals (where the biased exponent is 0 and the integer bit is 1), the hardware treats the value as a noncanonical representation of another finite value; for the other invalid encodings, the effect is to raise the invalid exception, effectively as if the value were a noncanonical sNaN. Whether this is best represented as a trap value (sorry, non-value representation in C23 terms) or as a noncanonical value is up for debate, but neither of these concepts has a clear counterpart in LLVM IR at present. To my mind, these values are not important enough to justify a direct representation as a floating literal in LLVM IR, where a bitcast constant expression could instead be sufficient.
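For readers less familiar with the x87 encoding, here is a minimal Python sketch that classifies an 80-bit pattern by its exponent field and explicit integer bit. It assumes the usual layout (1 sign bit, 15 exponent bits, 1 integer bit, 63 fraction bits) and uses Intel's names for the odd encodings; `classify_x87` is a hypothetical helper, not anything in APFloat.

```python
def classify_x87(bits: int) -> str:
    """Classify an 80-bit x87 extended-precision bit pattern."""
    exponent = (bits >> 64) & 0x7FFF
    integer_bit = (bits >> 63) & 1
    fraction = bits & ((1 << 63) - 1)

    if exponent == 0:
        if integer_bit:
            return "pseudo-denormal"      # read by hardware as another finite value
        return "zero" if fraction == 0 else "denormal"
    if exponent == 0x7FFF:
        if not integer_bit:
            return "pseudo-infinity" if fraction == 0 else "pseudo-NaN"
        return "infinity" if fraction == 0 else "NaN"
    # Nonzero, non-maximal exponent: the integer bit is expected to be 1.
    return "normal" if integer_bit else "unnormal"

one = (0x3FFF << 64) | (1 << 63)          # 1.0 with the integer bit set
print(classify_x87(one))                  # normal
print(classify_x87(0x3FFF << 64))         # unnormal
print(classify_x87(1 << 63))              # pseudo-denormal
print(classify_x87(0x7FFF << 64))         # pseudo-infinity
```

Only "zero", "denormal", "normal", "infinity", and "NaN" would be spellable as literals under this proposal; the remaining classes are the ones I am suggesting be left to bitcast.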

I must confess ignorance of all the precise pitfalls of ppc_fp128; I’m only aware of the broad strokes of this type. It is a pair of two double values, with the second being smaller than the first. I’m not sure what the consequences are of invalid values such as the first value being finite and the second being infinite, but I suspect that, as with x86_fp80, these are unimportant enough to not justify direct representation. However, I am somewhat more concerned by the fact that some entirely valid values do not have a concise representation. For example, the pair {DBL_MAX, DBL_MIN} requires an awfully large number of digits to print out, and it may be prudent to retain a hexadecimal integer representation instead for this format.
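To put a number on "awfully large", this Python sketch computes the exact sum DBL_MAX + DBL_MIN with rational arithmetic and counts how many digits a literal would need to spell it exactly. It models the ppc_fp128 value simply as the sum of its two doubles, which is my simplification.

```python
import sys
from fractions import Fraction

hi = Fraction(sys.float_info.max)   # DBL_MAX, converted exactly
lo = Fraction(sys.float_info.min)   # DBL_MIN (smallest positive normal), exactly
total = hi + lo

# total is numerator / 2**k in lowest terms with k > 0, so the numerator is
# odd and its bit length is the number of significant bits from the top set
# bit down to the bottom set bit.
significant_bits = total.numerator.bit_length()

# Hex digits after the point in a "0x1.<digits>p<exp>" spelling.
hex_digits_after_point = -(-(significant_bits - 1) // 4)

# Decimal digits in the exact expansion: scale by 10**k to get an integer.
k = total.denominator.bit_length() - 1
decimal_digits = len(str(total.numerator * 5 ** k))

print(significant_bits, hex_digits_after_point, decimal_digits)
# 2046 512 1331
```

So an exact hexadecimal floating literal needs over 500 digits and an exact decimal literal over 1300, compared with 32 hex digits for the raw bit pattern.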

I’m aware that decimal floating-point types are not in LLVM IR, but I still want to talk about them here since they are one of my motivations for this RFC. These types introduce the concept of decimal cohorts, which are sets of representations with the same numerical value but different exponents (e.g., 0e0 and 0e1 are distinct members of the same cohort: they are numerically equal but carry different exponents). There is a standard convention for determining which member of the cohort to use when converting from a decimal string, so different members of the same cohort can be considered to have different string representations and that’s not a problem. But decimal types, like x86_fp80 and ppc_fp128, have outright noncanonical values (which are distinct from decimal cohorts), for finite values as well as infinities. Note that, for noncanonical values, only one binary encoding is considered “correct” (e.g., the correct encoding for infinity sets the significand value to all 0’s), and any other encoding is noncanonical and will not be produced as the result of an operation. Finally, decimal types have two different binary representations (the BID and DPD formats), and while the set of valid values is the same for both representations, which bit patterns correspond to those values is different, and even which finite values have noncanonical representations differs.
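Cohorts are easy to see with Python's `decimal` module. Note the caveat: that module follows the General Decimal Arithmetic specification rather than an IEEE 754 interchange encoding, so this sketch illustrates cohort members and how the string picks one, and says nothing about BID, DPD, or noncanonical encodings.

```python
from decimal import Decimal

a = Decimal("0e0")
b = Decimal("0e1")

# Same numerical value...
same_value = (a == b)

# ...but different cohort members: the exponent is kept from the string.
a_exponent = a.as_tuple().exponent
b_exponent = b.as_tuple().exponent

# The same holds for nonzero values such as 1.0 and 1.00.
c = Decimal("1.0")
d = Decimal("1.00")

print(same_value, a_exponent, b_exponent)                         # True 0 1
print(c == d, c.as_tuple().exponent, d.as_tuple().exponent)       # True -1 -2
```

Because the exponent written in the string selects the cohort member, each member already has its own textual spelling, which is the property the paragraph above relies on.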

All of this suggests to me that it is neither advantageous nor necessary to have floating literals that represent noncanonical floating-point values, especially as noncanonical value support in APFloat itself is spotty (see, e.g., llvm/llvm-project issue #63938, "APFloat: x87DoubleExtended pseudo-NaNs (integer_bit==0) not handled as always-signalling"). If we have support for the non-finite literals, then I don’t think there is much reason to retain the weird hexadecimal format we use currently, and we can drop those values, although the issue of non-concise representation for some ppc_fp128 values gives me some pause.

On string-to-float conversions

The LangRef currently states

The assembler requires the exact decimal value of a floating-point constant. For example, the assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating decimal in binary.

This statement is flat-out wrong; no such check exists for double types, and other floating-point types check the exactness of conversion from double to that type, so if the conversion to double is not exact but the conversion to float is, the constant is considered valid.
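The following Python sketch models the behavior as I have described it, not the actual LLParser code: no check at all for double, and for float only a check that the double-to-float step is exact. It then compares that against exactness measured from the decimal string itself. All function names are hypothetical, and `to_float32` will raise for doubles beyond float range.

```python
import struct
from fractions import Fraction

def to_float32(x: float) -> float:
    """Round a double to the nearest binary32 value and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_step_accepts(text: str, ty: str) -> bool:
    """Model: parse to double first, then check only the double-to-ty step."""
    as_double = float(text)
    if ty == "double":
        return True                      # no exactness check for double
    return to_float32(as_double) == as_double

def direct_is_exact(text: str, ty: str) -> bool:
    """Is the decimal string exactly representable in ty?"""
    as_double = float(text)
    value = as_double if ty == "double" else to_float32(as_double)
    return Fraction(value) == Fraction(text)

for text, ty in [("1.25", "float"), ("1.3", "double"),
                 ("0.1", "float"), ("1.00000000000000001", "float")]:
    print(ty, text, two_step_accepts(text, ty), direct_is_exact(text, ty))
```

In this model `double 1.3` is accepted despite being inexact, `float 0.1` is rejected, and `float 1.00000000000000001` slips through because the string first rounds to the double 1.0, which then converts to float exactly.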

In principle, string-to-float conversion can result in three exceptions: inexact, overflow (e.g., 1e999999), and underflow (e.g., 1e-99999999). I propose that any exception that occurs when converting a string to the desired floating-point type cause a parser error. Presently, there seem to be 230 failures in the LLVM test suite if I do this. There are 6 failures for non-double types, namely the tests that presently get away with an inexact string-to-double conversion because only the subsequent double-to-type conversion is checked.
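Here is a rough Python sketch of the check I have in mind, written with exact rational arithmetic rather than APFloat. It assumes round-to-nearest and the default IEEE 754 rule that underflow is only signaled when the result is both tiny and inexact; the `FORMATS` table and `conversion_exceptions` are hypothetical names, and I use smaller exponents than the examples above to keep the arithmetic cheap.

```python
from fractions import Fraction

# name: (precision p in bits, emin, emax)
FORMATS = {
    "half": (11, -14, 15),
    "float": (24, -126, 127),
    "double": (53, -1022, 1023),
}

def conversion_exceptions(text: str, fmt: str) -> set:
    """Return the exceptions raised when converting a decimal string to fmt."""
    p, emin, emax = FORMATS[fmt]
    v = abs(Fraction(text))
    if v == 0:
        return set()
    two = Fraction(2)
    # Round-to-nearest overflows once v reaches max_finite plus half an ulp.
    if v >= (2 - two ** (-p)) * two ** emax:
        return {"overflow", "inexact"}
    # e = floor(log2(v)), clamped so that subnormals share exponent emin.
    e = v.numerator.bit_length() - v.denominator.bit_length()
    if two ** e > v:
        e -= 1
    e = max(e, emin)
    quantum = two ** (e - p + 1)      # spacing of representable values near v
    flags = set()
    if (v / quantum).denominator != 1:
        flags.add("inexact")
        if v < two ** emin:           # tiny and inexact
            flags.add("underflow")
    return flags

for text, fmt in [("1.25", "double"), ("0.1", "double"), ("1e400", "double"),
                  ("1e-400", "double"), ("65504", "half"), ("65520", "half")]:
    print(fmt, text, sorted(conversion_exceptions(text, fmt)))
```

Under the proposal, a literal is accepted only when this set is empty; the relaxed variant would accept a lone "inexact" but still reject anything containing "overflow" or "underflow".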

The downside of this change is that it makes a value like double 3.14159 illegal to write, since the fully exact decimal value is rather tedious to write. I can see people desiring that inexact conversions be legal, and am willing to make these legal if desired, but I do think that underflow and overflow (e.g., 1e999999) should be a parse error for a floating-point literal in that case.

Summary of proposed syntax for floating-point literals

This is a summary of all of the possible ways to express a floating-point literal, both old and new, that I propose:

[+-]?\d+[.]\d*([eE][+-]?\d+)? i.e., the current floating-point literal syntax. Note that the decimal point is necessary, but trailing digits and the exponent field are optional.
[+-]?0x[0-9a-fA-F]+[.][0-9a-fA-F]*([pP][+-]?\d+)? the C syntax for hexadecimal floating-point literals, except that the decimal point is again necessary.
pinf, ninf Positive and negative infinity
nan The preferred qNaN value, i.e., sign is positive, quiet bit is set, rest of the payload is all 0’s.
nan(0[xX][0-9a-fA-F]+) The NaN value having a given payload, which should be specified in hexadecimal
bitcast (i64 0xabcdef09 to double) Technically not a floating-point literal, but this is how noncanonical values should end up getting spelled.
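The forms above can be sketched as a small token classifier. This is Python rather than LLLexer code, `literal_kind` is a hypothetical name, and I am reading the payload character class as plain hexadecimal digits. The bitcast form is left out since it is a constant expression rather than a single token.

```python
import re

# re.ASCII keeps \d restricted to 0-9.
DECIMAL = re.compile(r"[+-]?\d+[.]\d*([eE][+-]?\d+)?", re.ASCII)
HEX = re.compile(r"[+-]?0x[0-9a-fA-F]+[.][0-9a-fA-F]*([pP][+-]?\d+)?", re.ASCII)
NAN_PAYLOAD = re.compile(r"nan\(0[xX][0-9a-fA-F]+\)")

def literal_kind(token: str):
    """Return which proposed literal form a token matches, or None."""
    if token in ("pinf", "ninf"):
        return "infinity"
    if token == "nan":
        return "preferred qNaN"
    if NAN_PAYLOAD.fullmatch(token):
        return "NaN with payload"
    if HEX.fullmatch(token):
        return "hexadecimal"
    if DECIMAL.fullmatch(token):
        return "decimal"
    return None

for token in ["1.25", "1.", "1e5", "0x1.0p-23", "0x1p-23", "ninf", "nan(0x1)"]:
    print(token, literal_kind(token))
```

Note that `1e5` and `0x1p-23` are rejected because the point is mandatory in both the decimal and hexadecimal forms.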