Shared library compiled with -ffast-math modifies FPU state

On Linux x86_64:

touch foo.c
clang -Ofast -fpic -shared foo.c -o foo.so
objdump --disassemble foo.so

gives:

Disassembly of section .text:

00000000000004f0 :
 4f0: 0f ae 5c 24 fc          stmxcsr -0x4(%rsp)
 4f5: 81 4c 24 fc 40 80 00    orl $0x8040,-0x4(%rsp)
 4fc: 00
 4fd: 0f ae 54 24 fc          ldmxcsr -0x4(%rsp)
 502: c3                      retq
 503: 66 2e 0f 1f 84 00 00    nopw %cs:0x0(%rax,%rax,1)
 50a: 00 00 00
 50d: 0f 1f 00                nopl (%rax)

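This means that any thread which later loads the library will have the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags set in its MXCSR register, even if the executable itself was not compiled with -ffast-math. The orl with $0x8040 sets exactly those two bits: 0x8000 (bit 15) is FTZ and 0x0040 (bit 6) is DAZ. With both set, subnormal results are flushed to zero and subnormal inputs are treated as zero for SSE floating-point operations on that thread.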