Arithmetic referencing dso_local function causes compilation error on Linux/x64

I have encountered an odd issue where LLVM fails to compile arithmetic referencing a local function on Linux/x64.
Here is a minimal (hopefully reproducible) snippet:

; min_test.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define dso_local void @myFunction() {
    ret void
}
define i64 @main() {
    %1 = ptrtoint ptr @myFunction to i64
    %2 = sub i64 %1, 2147483648    ; = 0x80000000
    %3 = lshr i64 %2, 1    ; `add` also fails
    ret i64 %3
}

I am compiling on Linux/x64 with:

/llvm-17/bin/clang++ -O1 -c min_test.ll -o min_test.o

This gives the following error:

<unknown>:0: error: value of -2147483671 is too large for field of 4 bytes.
error: cannot compile inline asm
1 error generated.

I have found that:

It fails with LLVM 15, 16, and 17 (I haven’t tested other versions).
The snippet fails on Linux/x64, but compiles successfully on macOS/arm64.
Compilation only fails when myFunction is declared as dso_local.
Interestingly, after adding 1 to the integer constant (2147483649 = 0x80000001), the program compiles successfully.

My expectation is that the pointer to myFunction would be loaded into a register, and that arithmetic instructions operating on that register would be emitted. (This is indeed what happens when the integer constant is changed so that compilation succeeds.) I am not sure where the "field of 4 bytes" comes from.

If anyone has any ideas why this might be happening I would appreciate your thoughts!

Thanks

phoebe July 9, 2024, 2:52pm 2

The error comes from the assembler. Note that both the GNU and LLVM assemblers give the same error: Compiler Explorer

jack.w July 10, 2024, 3:53am 3

Thank you!
I have run using debug LLVM 17 with logging enabled. It looks like the EarlyCSE pass simplifies the arithmetic to a ConstantExpr:

EarlyCSE Simplify:   %1 = ptrtoint ptr @myFunction to i64  to: i64 ptrtoint (ptr @myFunction to i64)
EarlyCSE Simplify:   %1 = sub i64 ptrtoint (ptr @myFunction to i64), 2147483648  to: i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648)
EarlyCSE Simplify:   %1 = lshr i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648), 1  to: i64 lshr (i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648), i64 1)

SelectionDAG has 11 nodes:
    t0: ch,glue = EntryToken
        t14: i64 = X86ISD::WrapperRIP TargetGlobalAddress:i64<ptr @myFunction> 0
      t12: i64 = add t14, Constant:i64<-2147483648>
    t6: i64 = srl t12, Constant:i8<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
  t10: ch = X86ISD::RET_GLUE t9, TargetConstant:i32<0>, Register:i64 $rax, t9:1

Then instruction selection maps to an LEA instruction, which fails to assemble as before:

===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:'
SelectionDAG has 13 nodes:
    t0: ch,glue = EntryToken
      t12: i64 = LEA64r Register:i64 $rip, TargetConstant:i8<1>, Register:i64 $noreg, TargetGlobalAddress:i32<ptr @myFunction> -2147483648, Register:i16 $noreg
    t6: i64,i32 = SHR64ri t12, TargetConstant:i8<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
  t16: i32 = Register $noreg
  t10: ch = RET TargetConstant:i32<0>, Register:i64 $rax, t9, t9:1

Adding 1 to the constant causes the instruction selection to prefer LEA then ADD, which compiles successfully:

===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:'
SelectionDAG has 16 nodes:
    t0: ch,glue = EntryToken
        t14: i64 = LEA64r Register:i64 $rip, TargetConstant:i8<1>, Register:i64 $noreg, TargetGlobalAddress:i32<ptr @myFunction> 0, Register:i16 $noreg
        t11: i64 = MOV64ri TargetConstant:i64<-2147483649>
      t12: i64,i32 = ADD64rr t14, t11
    t6: i64,i32 = SHR64ri t12, TargetConstant:i8<1>
  t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
  t16: i32 = Register $noreg
  t10: ch = RET TargetConstant:i32<0>, Register:i64 $rax, t9, t9:1
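
One plausible reading of the two dumps above is that the offset is folded into the LEA's address operand only when it fits in a signed 32-bit displacement. The sketch below only checks that arithmetic; `fits_signed` is a hypothetical helper written for illustration, not an LLVM function, and the folding rule itself is an assumption inferred from the dumps.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable as a two's-complement integer of the given width."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


# `sub i64 %1, C` appears in the DAG as `add %1, -C`, so the candidate
# displacement is the negated constant.
for constant in (2147483648, 2147483649):
    offset = -constant
    print(f"C = {constant:#x}: offset {offset} fits in i32 -> {fits_signed(offset, 32)}")
```

With 0x80000000 the negated constant is exactly INT32_MIN, so it still fits and is folded into the address. With 0x80000001 it no longer fits, which matches the MOV64ri plus ADD64rr sequence in the second dump.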

Is it expected that LLVM may sometimes produce invalid assembly given valid IR code? Or would this be considered a bug in instruction selection?

topperc July 10, 2024, 4:34am 4

Adding -code-model=large (an llc option; the clang equivalent is -mcmodel=large) will make it work. Without it, the compiler generates a 32-bit PC-relative relocation, and the final offset is too large for 32 bits.
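
As a rough worked example of why the final offset overflows, the sketch below recomputes the value the assembler has to store in the 4-byte displacement field. The section layout is an assumption, not something stated in the thread: myFunction at section offset 0, main aligned to offset 16, and a 7-byte RIP-relative LEA as main's first instruction. Under those assumptions the result matches the number in the reported error message.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """Return True if value fits in a signed 4-byte field."""
    return INT32_MIN <= value <= INT32_MAX


def pc_relative_value(symbol_offset: int, addend: int, next_instruction_offset: int) -> int:
    """Displacement for a RIP-relative operand: target address minus the address of the next instruction."""
    return symbol_offset + addend - next_instruction_offset


# Assumed layout (hypothetical, chosen to mirror the reproducer):
MY_FUNCTION_OFFSET = 0   # myFunction is first in .text
MAIN_OFFSET = 16         # main starts at the next 16-byte boundary
LEA_LENGTH = 7           # REX prefix + opcode + ModRM + 4-byte displacement

next_ip = MAIN_OFFSET + LEA_LENGTH

# Offset folded into the LEA (the failing case).
folded = pc_relative_value(MY_FUNCTION_OFFSET, -2147483648, next_ip)
print(folded, fits_int32(folded))      # -2147483671 False

# Offset kept out of the LEA (the case that assembles).
unfolded = pc_relative_value(MY_FUNCTION_OFFSET, 0, next_ip)
print(unfolded, fits_int32(unfolded))  # -23 True
```

The folded offset fits in 32 bits on its own, but subtracting the distance to the next instruction pushes the total below INT32_MIN.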

MaskRay July 11, 2024, 7:02am 5

The issue isn’t specific to dso_local. You can reproduce it with a local linkage symbol:

; llc -O1
define internal void @myFunction() {
    ret void
}
define i64 @main() {
    %1 = ptrtoint ptr @myFunction to i64
    %2 = sub i64 %1, 2147483648    ; = 0x80000000
    %3 = lshr i64 %2, 1    ; `add` also fails
    ret i64 %3
}

The issue resembles previous offset folding issues: D73606 "[X86] matchAdd: don't fold a large offset into a %rip relative address" and D93931 "[X86] Don't fold negative offset into 32-bit absolute address (e.g. movl $foo-1, %eax)".

Created llvm/llvm-project Pull Request #98438: "[X86] Don't fold offsets that are too close to INT32_MIN in non-large code models".

jack.w July 12, 2024, 1:06am 6

Thank you for looking into this @MaskRay !

tyker July 12, 2024, 9:49pm 7