Help with Segmentation Fault from Custom Pseudo-Instruction Optimization on RISC-V (Xalancbmk)

Hi all,

I’m working on a custom optimization in the LLVM backend for RISC-V. The goal is to combine sequences of simple loads (non-volatile, non-indexed) from the same base pointer with constant offsets (within the same cache line) into a single pseudo-instruction. The packing into pseudo-instructions happens during instruction selection, and after the scheduler has run I unpack each pseudo-instruction back into the original pair of load instructions.
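To make the "same cache line" condition concrete, here is a minimal sketch in plain C++ (not LLVM API). The helper name inSameCacheLine, the 64-byte line size, and the assumption that the base pointer itself is cache-line aligned are all illustrative assumptions; if the base alignment is unknown, the offsets alone do not prove that both loads hit one line.

```cpp
#include <cstdint>

// Assumption for illustration: 64-byte cache lines.
constexpr int64_t kLineBytes = 64;

// Hypothetical helper: index of the cache line containing a byte offset,
// using floor division so negative offsets map to the correct line.
inline int64_t lineOf(int64_t byteOffset, int64_t lineBytes) {
  int64_t q = byteOffset / lineBytes;
  if (byteOffset % lineBytes != 0 && byteOffset < 0)
    --q;
  return q;
}

// Returns true if both accesses [off, off + size) lie entirely inside one
// and the same cache line. Assumes the shared base is line-aligned and
// that both sizes are greater than zero.
inline bool inSameCacheLine(int64_t offA, unsigned sizeA,
                            int64_t offB, unsigned sizeB,
                            int64_t lineBytes = kLineBytes) {
  const int64_t line = lineOf(offA, lineBytes);
  const int64_t lastA = offA + static_cast<int64_t>(sizeA) - 1;
  const int64_t lastB = offB + static_cast<int64_t>(sizeB) - 1;
  return lineOf(lastA, lineBytes) == line &&
         lineOf(offB, lineBytes) == line &&
         lineOf(lastB, lineBytes) == line;
}
```

With these assumptions, two 8-byte loads at offsets 0 and 8 qualify, while offsets 56 and 64 do not, and neither does a load that straddles a line boundary.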

I’m testing this transformation with SPEC CPU 2017, compiled with clang and run on a custom RISC-V core implemented on an FPGA. The pseudo-instruction works fine for the whole SPECspeed®2017 Integer suite except the xalancbmk benchmark. While the baseline (-O3) code runs correctly, applying this optimization results in a segmentation fault at runtime. The fault appears when the program tries to load from a nullptr (ld a0, 0(a2), where a2 is zero) in the constructor of XalanVector<XalanDOMString>.

What I’ve done:

My suspicion:

It seems like either one of the fused loads aliases with a store (or another memory-modifying operation) that sits between the two loads, or a register used to compute the base address gets overwritten between them. I’d like to implement an alias check or memory dependency check before packing the loads into a pseudo-instruction, so that loads that are not safe to fuse are left alone.
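The second suspicion (the base register being overwritten) can be modelled in a few lines. This is a sketch in plain C++, not LLVM API: PackedLoad and orderForExpansion are hypothetical names, and registers are just integers. It assumes the two loads are simple and that nothing between them writes memory, so reordering them is harmless. I have not confirmed that this is what actually happens in my build.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical model of one half of the packed pseudo-instruction.
struct PackedLoad {
  unsigned dst;    // destination register number
  unsigned base;   // base address register number
  int64_t offset;  // constant offset from the base
};

// Picks an emission order for the two loads such that the shared base
// register is still intact when the second load executes.
// Returns std::nullopt when the pair cannot be expanded safely.
inline std::optional<std::array<PackedLoad, 2>>
orderForExpansion(const PackedLoad &a, const PackedLoad &b) {
  // Both halves must use the same base and write distinct registers.
  if (a.base != b.base || a.dst == b.dst)
    return std::nullopt;
  // If the first load overwrites the base, emit it last.
  if (a.dst == a.base)
    return std::array<PackedLoad, 2>{b, a};
  // Otherwise the original order is fine, including the case where
  // the second load is the one that overwrites the base.
  return std::array<PackedLoad, 2>{a, b};
}
```

For example, a pair like ld a2, 0(a2) followed by ld a0, 8(a2) would be emitted in the opposite order by this sketch. Constraining the pseudo-instruction's outputs so that the register allocator never assigns them the base register (for example by marking them earlyclobber) looks like another way to rule this case out.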

Questions:

  1. How can I conservatively check for aliasing or unsafe intervening memory operations between two loads during instruction selection?
  2. Is there an existing utility or pass in LLVM that helps determine memory dependency or hazard safety in DAG nodes?
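To illustrate what I mean by "conservative" in question 1, here is a minimal sketch in plain C++ (not LLVM API). OpKind, MemOp and safeToFuseAcross are hypothetical names; the idea is simply to refuse fusion whenever anything that might write memory, or any volatile access, sits between the two loads, without attempting any real alias analysis.

```cpp
#include <vector>

// Hypothetical classification of the operations found between two loads.
enum class OpKind { Load, Store, Call, Fence, Other };

struct MemOp {
  OpKind kind;
  bool isVolatile;  // true for volatile (or otherwise ordered) accesses
};

// Conservative check: fusion is allowed only if nothing between the two
// loads could modify memory or impose an ordering constraint.
inline bool safeToFuseAcross(const std::vector<MemOp> &between) {
  for (const MemOp &op : between) {
    if (op.isVolatile)
      return false;  // never move anything across a volatile access
    switch (op.kind) {
    case OpKind::Store:
    case OpKind::Call:   // a call may write arbitrary memory
    case OpKind::Fence:
      return false;
    case OpKind::Load:
    case OpKind::Other:
      break;             // plain loads and non-memory ops are harmless
    }
  }
  return true;
}
```

This rejects more pairs than strictly necessary (a store to a provably different address would be fine), but it never fuses across a possible write, which is the property I am after as a first step.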

Any guidance on best practices or how to debug such backend issues more systematically would be greatly appreciated. I can provide disassembly and relevant LLVM IR or DAG output if needed.

Thanks in advance!
Ravikiran

lukel May 6, 2025, 9:55am 2

There is a known issue: xalancbmk in SPEC CPU 2017 has undefined behaviour, and a recent InstCombine patch began to optimise it, which led to crashes. See New RISC-V failure on SPEC CPU 2017 523.xalancbmk_r · Issue #136395 · llvm/llvm-project · GitHub

I believe that patch has since been reverted, so the crash may go away if you pull from upstream.

Hey @lukel, thank you for your quick response. Does this happen to 502.gcc_r as well?

lukel May 8, 2025, 9:40am 4

We didn’t detect anything on 502.gcc_r, at least on our configuration, unfortunately.

topperc May 8, 2025, 2:22pm 6

Are you compiling 502.gcc_r with -fno-strict-aliasing?

Yes, I am compiling with the -fno-strict-aliasing flag.