RFC: Proper handling of -mnan=legacy on MIPS

December 4, 2025, 3:18am 1

Hi,

Background and Motivation

In the MIPS architecture, there are two modes for encoding floating-point NaN (Not a Number):

· legacy mode: Traditional NaN encoding, where the quiet bit position differs from IEEE 754-2008.
· 2008 mode: NaN encoding compliant with the IEEE 754-2008 standard.
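To make the difference between the two modes concrete, here is a minimal sketch that classifies the raw bits of a binary64 value under each encoding. The names `NanEncoding`, `FpClass`, and `classifyDouble` are hypothetical and exist only for this illustration; they are not LLVM APIs. It assumes the usual binary64 layout, where the most significant mantissa bit marks a quiet NaN in 2008 mode and a signaling NaN in legacy mode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not part of LLVM.
enum class NanEncoding { IEEE2008, LegacyMips };
enum class FpClass { NotNaN, QuietNaN, SignalingNaN };

constexpr std::uint64_t kExpMask    = 0x7FF0000000000000ULL; // exponent field
constexpr std::uint64_t kMantMask   = 0x000FFFFFFFFFFFFFULL; // mantissa field
constexpr std::uint64_t kTopMantBit = 0x0008000000000000ULL; // mantissa MSB

// Classify the raw bits of a binary64 value under the chosen NaN encoding.
constexpr FpClass classifyDouble(std::uint64_t bits, NanEncoding enc) {
  // A NaN has an all-ones exponent and a nonzero mantissa.
  return ((bits & kExpMask) != kExpMask || (bits & kMantMask) == 0)
             ? FpClass::NotNaN
         // 2008 mode: top mantissa bit set means quiet.
         // Legacy mode: top mantissa bit set means signaling.
         : (((bits & kTopMantBit) != 0) == (enc == NanEncoding::IEEE2008))
             ? FpClass::QuietNaN
             : FpClass::SignalingNaN;
}

// The same bit pattern flips meaning between the two modes.
inline void checkNanExamples() {
  assert(classifyDouble(0x7FF8000000000000ULL, NanEncoding::IEEE2008) ==
         FpClass::QuietNaN);
  assert(classifyDouble(0x7FF8000000000000ULL, NanEncoding::LegacyMips) ==
         FpClass::SignalingNaN);
}
```

Any code that builds or tests a NaN by assuming the 2008 convention would therefore produce the opposite kind of NaN on legacy hardware.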

LLVM/Clang currently uses the -mnan=legacy and -mnan=2008 options to specify which NaN encoding to use. However, support for -mnan=legacy is incomplete in the actual codebase, leading to bugs (e.g., #100495) and inconsistencies.

Key points from recent reviews highlight the problem:

· RKSimon: “It is not acceptable to have target specific code like this in the generic DAG combines.”

· arsenm: “LLVM as a whole does not support the legacy mips nan encoding… We should probably just turn the flag into an error; it would be a huge undertaking to actually support it correctly”

· nikic: “What you do here is going to be a very partial solution, because we have tons of code that assumes IEEE NaNs… We should probably also consider making -mnan=legacy an error and stop pretending that we support it.”

Detailed Design

· Case A (Architecture supports nan2008): For CPUs/architectures that support the IEEE 754-2008 NaN encoding (e.g., mips32r2, mips64r2 and later), the -mnan=legacy option is converted to -mnan=2008 and a warning is emitted: warning: ignoring unsupported ‘-mnan=legacy’ option and instead set to ‘-mnan=2008’ because the ‘%0’ architecture supports it.

· Case B (Architecture does NOT support nan2008): For older architectures (e.g., mips1, mips32, mips64), the -mnan=legacy option is ignored. A warning is emitted: warning: ignoring ‘-mnan=legacy’ option because the ‘%0’ architecture does not support it.

Discussion Points

Does the community agree with the proposed behavior of converting -mnan=legacy to -mnan=2008 with a warning on supported architectures?

References
· issue #100495: Original issue with incorrect NaN encoding
· pr #153777: Implementation of automatic conversion for -mnan=legacy

cc: @nikic

Thanks,

Ying

pinskia December 4, 2025, 3:31am 2

I should mention that the SH FPU also uses a swapped quiet/signaling NaN bit.

arsenm December 4, 2025, 10:18am 3

I think just erroring would be better than ignoring the option. If someone is explicitly requesting the legacy behavior, I would expect an error if it isn’t really supported.

pinskia December 4, 2025, 10:59am 4

Legacy is a misnomer here too.

Even the Octeon 3 FPU used the legacy format.

jyknight December 9, 2025, 11:32pm 5

I’m not really a MIPS expert (nor a MIPS user), but from my understanding of the situation, I think emitting a warning and then just building with -mnan=2008 is not viable.

Building for the wrong nan mode won’t just work – the choice of -mnan=legacy vs -mnan=2008 results in a flag in the ELF object file header, and the linker (and subsequently dynamic loader) enforces that it is consistent across all objects/shared-libraries. Switching the mode will thus likely cause a link failure or a failure to load.
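As a rough illustration of that consistency rule, the sketch below checks whether a set of inputs agree on the NaN mode bit. `EF_MIPS_NAN2008` (0x400 in the ELF header's `e_flags`) is the real flag; the helper names `usesNan2008` and `nanModesConsistent` are made up for this example, and real linkers and loaders perform additional checks that are not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// e_flags bit set on MIPS ELF files built with -mnan=2008.
constexpr std::uint32_t EF_MIPS_NAN2008 = 0x00000400;

// Hypothetical helper: does this e_flags value request 2008-style NaNs?
constexpr bool usesNan2008(std::uint32_t eFlags) {
  return (eFlags & EF_MIPS_NAN2008) != 0;
}

// Hypothetical helper: true when every input object or shared library
// agrees on the NaN mode. An empty input list is trivially consistent.
inline bool nanModesConsistent(const std::vector<std::uint32_t> &eFlagsOfInputs) {
  for (std::uint32_t flags : eFlagsOfInputs)
    if (usesNan2008(flags) != usesNan2008(eFlagsOfInputs.front()))
      return false;
  return true;
}
```

Under this model, an object silently rebuilt with -mnan=2008 would carry the bit while the legacy system libraries would not, so the check fails for the mixed set.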

Even if your OS distribution provided both varieties of libraries (afaik none do so?), the Linux kernel will refuse to load an ELF file built for a mode that is inconsistent with the hardware you’re running on. While the spec allows for hardware which can run in either mode, it’s my understanding that all extant hardware hard-wires that bit to either on or off, so in practice it’s an immutable characteristic of the hardware. (Except on systems lacking any hardware FPU; the kernel’s FPU emulator can support both modes)

So, the correct answer is that LLVM doesn’t currently support “-mnan=legacy” hardware, and we should refuse to build for that target. In practice, however, I suspect there is a lot of code which doesn’t really care about the incorrectly-implemented NaN semantics, and so we would break any users currently building for such targets, despite it “apparently working” today. Unless the number of such users is minuscule, that seems like maybe something we wouldn’t want to do, despite it being correct.

yingopq December 10, 2025, 3:00am 6

Thanks for your replies. I accept your suggestions: instead of converting the option, I will report an error directly.

jyknight December 10, 2025, 8:45pm 7

It’s not obviously an improvement to switch from “known floating-point codegen bugs that nobody is currently working on fixing” to “cannot target a given platform _at all_”.

So before making this into an error, I would like to hear from stakeholders of the MIPS target that are OK with it being impossible to compile targeting a MIPS legacy-nan platform in LLVM.

nikic December 11, 2025, 7:41am 8

Can we keep -mnan=legacy and document somewhere (where?) that issues relating to SNaN encoding under -mnan=legacy are explicitly Won’t Fix?

I don’t really see good alternatives here beyond either a) dropping -mnan=legacy and thus making the Mips backend largely useless or b) investing substantial engineering effort to correctly support the different NaN encoding for a niche architecture. (Unless someone has an idea of how we can support different NaN encodings without significant changes…)

jyknight December 11, 2025, 3:50pm 9

Is it really “won’t fix”? IMO, this should be considered “ought to be fixed, but nobody is interested enough to work on it at the moment”.

I think you’re right that it’ll be substantial effort, but it’s not infeasible or conceptually difficult. “Just” a bunch of plumbing work…

I think APFloat interface changes will be most tricky, primarily because the current API is a mess, so it’ll require some API cleanup, as a prerequisite. But, that might be nice to do anyways…

Right now there are two different ways of representing float kind: both a Semantics enum, as well as a pile of fltSemantics global constants. Neither of these admits the possibility of “options” for a type. The obvious thing to do in the current API is to simply add a new “IEEEdoubleMIPSNaN” fltSemantics. But trying to use that would break all the existing code checking things like if (&val.getSemantics() == &APFloat::IEEEdouble()) (which there’s a ton of). So, we probably ought to migrate to an API with a “float kind” accessor, switch all current code to use that, and eliminate the global-object-address-comparison weirdness.
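A toy model of the problem, for readers who haven’t looked at APFloat: everything below (`ToySemantics`, `FloatKind`, `NanStyle`, and the `toy…`/`isDouble…` functions) is a hypothetical stand-in written for this post, not LLVM’s real types or a proposed API. It only shows why an address comparison stops matching once a second semantics object exists for the same format, and how a kind accessor would avoid that.

```cpp
#include <cassert>

// Toy stand-ins; none of these are LLVM's real types.
enum class FloatKind { Double };
enum class NanStyle { IEEE2008, LegacyMips };

struct ToySemantics {
  FloatKind kind; // the format itself
  NanStyle nan;   // an "option" on that format
};

// One global object per semantics, mirroring the current style.
inline const ToySemantics &toyIEEEdouble() {
  static const ToySemantics S{FloatKind::Double, NanStyle::IEEE2008};
  return S;
}
inline const ToySemantics &toyIEEEdoubleMipsNaN() {
  static const ToySemantics S{FloatKind::Double, NanStyle::LegacyMips};
  return S;
}

// Existing-style check: compares the identity of the global object.
inline bool isDoubleByAddress(const ToySemantics &S) {
  return &S == &toyIEEEdouble();
}

// Accessor-style check: compares the format and ignores the NaN option.
inline bool isDoubleByKind(const ToySemantics &S) {
  return S.kind == FloatKind::Double;
}
```

Here `isDoubleByAddress(toyIEEEdoubleMipsNaN())` is false even though the value is a double in every respect except NaN encoding, while `isDoubleByKind` still answers true.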

Anyhow, I don’t know if there’s enough people interested in MIPS legacy-nan to actually do this work, but if there are, I wouldn’t want to discourage them from doing so. (After all, LLVM implemented PPC DoubleDouble support for APFloat, and that’s much weirder than an inverted nan quiet-bit!)

nikic December 11, 2025, 4:07pm 10

I think it’s actual won’t fix, because things like

are going to have non-trivial negative effects on the code base as a whole.

This does not seem like the kind of change that can be hidden away in a quiet corner where it only affects people who care about Mips. It’s the kind of change that will pervade everywhere. That’s not a price I’m willing to pay.

While the format is a good bit weirder, it’s also pretty cleanly encapsulated. It’s a completely separate ppc_fp128 type, with a completely separate APFloat implementation. It’s not something for which we have to pass around flags all over the place.

(Of course, we could also have separate “float_legacy_snan” types, but that’s going to have other costs.)

jyknight December 11, 2025, 4:33pm 11

I mean, maybe? I could certainly be wrong, but it looks to me like it need not be a huge burden – given a prerequisite of an improved APFloat API.