Arithmetic referencing dso_local function causes compilation error on Linux/x64
I have encountered an odd issue where LLVM fails to compile arithmetic referencing a local function on Linux/x64.
Here is a minimal (hopefully reproducible) snippet:
; min_test.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define dso_local void @myFunction() {
ret void
}
define i64 @main() {
%1 = ptrtoint ptr @myFunction to i64
%2 = sub i64 %1, 2147483648 ; = 0x80000000
%3 = lshr i64 %2, 1 ; `add` also fails
ret i64 %3
}
I am compiling on Linux/x64 with:
/llvm-17/bin/clang++ -O1 -c min_test.ll -o min_test.o
This gives the following error:
<unknown>:0: error: value of -2147483671 is too large for field of 4 bytes.
error: cannot compile inline asm
1 error generated.
I have found that:
- It fails with LLVM 15, 16, and 17 (I haven’t tested other versions)
- The snippet fails on Linux/x64, but compiles successfully on MacOS/arm64
- Compilation only fails when myFunction is declared as dso_local.
- Interestingly, after adding 1 to the integer constant (2147483649 = 0x80000001), the program compiles successfully.
My expectation is that the pointer to myFunction would be stored in a register, and arithmetic instructions emitted working on that register. (This is indeed the case when changing the integer constant so it compiles successfully.) I am not sure where the field of 4 bytes is coming from.
If anyone has any ideas why this might be happening I would appreciate your thoughts!
Thanks
phoebe July 9, 2024, 2:52pm 2
The error comes from the assembler. Note that both the GNU and LLVM assemblers give the same error: Compiler Explorer
jack.w July 10, 2024, 3:53am 3
Thank you!
I have run using debug LLVM 17 with logging enabled. It looks like the EarlyCSE pass simplifies the arithmetic to a ConstantExpr:
EarlyCSE Simplify: %1 = ptrtoint ptr @myFunction to i64 to: i64 ptrtoint (ptr @myFunction to i64)
EarlyCSE Simplify: %1 = sub i64 ptrtoint (ptr @myFunction to i64), 2147483648 to: i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648)
EarlyCSE Simplify: %1 = lshr i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648), 1 to: i64 lshr (i64 sub (i64 ptrtoint (ptr @myFunction to i64), i64 2147483648), i64 1)
SelectionDAG has 11 nodes:
t0: ch,glue = EntryToken
t14: i64 = X86ISD::WrapperRIP TargetGlobalAddress:i64<ptr @myFunction> 0
t12: i64 = add t14, Constant:i64<-2147483648>
t6: i64 = srl t12, Constant:i8<1>
t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
t10: ch = X86ISD::RET_GLUE t9, TargetConstant:i32<0>, Register:i64 $rax, t9:1
Then instruction selection maps to an LEA instruction, which fails to assemble as before:
===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:'
SelectionDAG has 13 nodes:
t0: ch,glue = EntryToken
t12: i64 = LEA64r Register:i64 $rip, TargetConstant:i8<1>, Register:i64 $noreg, TargetGlobalAddress:i32<ptr @myFunction> -2147483648, Register:i16 $noreg
t6: i64,i32 = SHR64ri t12, TargetConstant:i8<1>
t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
t16: i32 = Register $noreg
t10: ch = RET TargetConstant:i32<0>, Register:i64 $rax, t9, t9:1
Adding 1 to the constant causes instruction selection to prefer LEA then ADD, which compiles successfully:
===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:'
SelectionDAG has 16 nodes:
t0: ch,glue = EntryToken
t14: i64 = LEA64r Register:i64 $rip, TargetConstant:i8<1>, Register:i64 $noreg, TargetGlobalAddress:i32<ptr @myFunction> 0, Register:i16 $noreg
t11: i64 = MOV64ri TargetConstant:i64<-2147483649>
t12: i64,i32 = ADD64rr t14, t11
t6: i64,i32 = SHR64ri t12, TargetConstant:i8<1>
t9: ch,glue = CopyToReg t0, Register:i64 $rax, t6
t16: i32 = Register $noreg
t10: ch = RET TargetConstant:i32<0>, Register:i64 $rax, t9, t9:1
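The difference between the two selections is consistent with a signed 32-bit range check on the folded offset: -2147483648 is exactly INT32_MIN and still fits a 32-bit displacement field, while -2147483649 does not and is materialized separately (the MOV64ri and ADD64rr above). A minimal sketch of that check in Python, where fits_disp32 is a hypothetical helper for illustration and not an LLVM function:

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_disp32(offset: int) -> bool:
    """Return True if offset is representable as a signed 32-bit displacement."""
    return INT32_MIN <= offset <= INT32_MAX


# `sub i64 %1, C` shows up in the DAG as `add %1, -C`, so the candidate
# offsets for folding into the address are the negated constants.
folded = -2147483648      # original constant 0x80000000
not_folded = -2147483649  # constant after adding 1 (0x80000001)

print(fits_disp32(folded))      # True
print(fits_disp32(not_folded))  # False
```

This only models the range check on the constant itself; it says nothing about whether the final pc-relative value still fits once the instruction address is subtracted.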
Is it expected that LLVM may sometimes produce invalid assembly given valid IR code? Or would this be considered a bug in instruction selection?
topperc July 10, 2024, 4:34am 4
Adding -code-model=large will make it work. Without that it’s generating a 32-bit pc-relative relocation and the final offset is too large for 32 bits.
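The number in the error message can be reproduced by hand. The sketch below assumes a section layout that is not stated in the thread: myFunction at offset 0, and the 7-byte LEA in main ending at offset 23 (main aligned to 16). Under that assumption, the pc-relative displacement is the symbol address plus the folded addend minus the address of the next instruction. The helper name pcrel_disp is made up for illustration.

```python
INT32_MIN = -(2**31)


def pcrel_disp(symbol_addr: int, addend: int, next_ip: int) -> int:
    """Displacement encoded in a rip-relative operand: S + A - P."""
    return symbol_addr + addend - next_ip


# Assumed layout (a guess consistent with the reported value):
# myFunction at offset 0, LEA occupying offsets 16..22, next instruction at 23.
disp = pcrel_disp(symbol_addr=0, addend=-2147483648, next_ip=23)

print(disp)              # -2147483671
print(disp < INT32_MIN)  # True
```

Because the folded addend already sits at INT32_MIN, the displacement falls out of range whenever the symbol is placed before the end of the instruction that references it.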
MaskRay July 11, 2024, 7:02am 5
The issue isn’t specific to dso_local. You can reproduce it with a local linkage symbol:
; llc -O1
define internal void @myFunction() {
ret void
}
define i64 @main() {
%1 = ptrtoint ptr @myFunction to i64
%2 = sub i64 %1, 2147483648 ; = 0x80000000
%3 = lshr i64 %2, 1 ; `add` also fails
ret i64 %3
}
The issue resembles previous offset folding issues: D73606 "[X86] matchAdd: don't fold a large offset into a %rip relative address" and D93931 "[X86] Don't fold negative offset into 32-bit absolute address (e.g. movl $foo-1, %eax)".
jack.w July 12, 2024, 1:06am 6
Thank you for looking into this @MaskRay !
tyker July 12, 2024, 9:49pm 7