[RFC] Require real or annotated source locations on all instructions (original) (raw)

Currently, we (Sony) are undertaking work to prevent unnecessarily-dropped source locations in LLVM by adding a new rule for updating debug info in the compiler. The rule adds a requirement that each LLVM instruction created must either be assigned a source location (almost always copying from one or more existing instructions), or else it must be assigned an annotation that categorises the reason that no source location is appropriate for that instruction.

This rule can be checked using the Debugify tool, allowing us to detect missing source locations without the false positives that Debugify currently detects. The benefit of this is that we can find and fix unintentionally-missing source locations, and that each intentionally-missing source location has that intention explained, such as instructions that have had their locations dropped following the rules for “when to drop an instruction location”, instructions that are compiler-generated, or instructions whose locations we simply cannot or do not know how to represent.

This post is firstly to draw attention to the change to the debug info update rules, and secondly to seek approval/support for adding verification of this rule to LLVM’s CI. Currently we have a script that tests for missing source locations in LLVM, using Debugify and targeting the llvm-test-suite’s CTMark suite. We would like to make this into an LLVM Labs buildbot; we would run this as a nightly bot, as new source location errors appear infrequently, and the Debugify reports can provide a stack trace of LLVM at the point(s) where missing source locations appear, meaning that even with a nightly run it would be quite easy to identify the commit(s) responsible for any new errors so that we can swiftly fix-forward.

In summary, we are looking to add a new form of debug-info-based correctness requirement to the compiler relating to the completeness of source location information. In future, we may add more debug info requirements, for example: a “temporary” annotation conveys that the annotated instruction does not have a location because it is transient and will not appear in the generated object code; this property could be verified at the end of compilation as a form of correctness testing. This all builds towards our long term desire to have more thorough and robust testing of debug info quality in LLVM, to prevent degradation over time and simplify ongoing maintenance.

This is a successor to a previous proposal; noting for the participants in the discussion of that proposal, @dblaikie I think this matches your suggestion of using Debugify for regression-testing, and @nikic this solves the InstCombine issue by moving the verification into an optional check that runs in-between optimization passes, so there’s no issue with deferring the application of a source location to an instruction.

Requiring source locations for all instructions can significantly improve our ability to map hardware sampled profiles (for AutoFDO) and callstack profiles (for MemProf) during profile use. For example, our change in #129960 improved AutoFDO performance on a large internal workload by 0.5% while also improving MemProf coverage by ~5%. This change preserves locations where the merge APIs previously dropped line information.

So we believe addressing the gap where source locations are lost altogether has the potential to improve PGO quality even more. Along the lines of the future requirements that you mention, we are interested in synthetic anchors which profile mapping can use when a source location is unknown (or cannot be known). Speaking on behalf of the compiler opt team at Google, we are supportive of the direction. In the short term, it makes debug info based PGO more reliable and longer term we can build on it to improve coverage.

Can you clarify the plan for enforcing the new rule?

Is the transition plan that 100% coverage this is only guaranteed for a single benchmark that works right now and the contract is that we don’t regress that benchmark?

More or less exactly that, yes - currently all of the projects in the llvm-test-suite CTMark suite compile at -O2 -g without any missing locations (following a series of bug fixes detected by running Debugify against that suite), and the goal would be to treat new missing source locations in that configuration as regressions.
In future we could broaden coverage by testing more optimization passes (e.g. -O3) or building other projects (e.g. stage 2 clang), but CTMark is fast to build and covers a reasonable breadth of the compiler, hence its use as a starting point.

jrtc27 June 11, 2025, 9:21pm 5

What targets would this be enforced for? Do you have an idea how often non-amd64 (or whatever you’re testing) targets are using different code paths (or entire functions / passes) where there are such issues?

What about for CodeGen and beyond? As I understand it this is only about IR-level debug info, not the resulting MIR (or intermediary DAGs in the case of SelectionDAG).

Initially x86, though there’s no reason this couldn’t be tested for other targets as well; I haven’t tried compiling for non-x86 targets yet, so not clear how many new bugs might turn up with other targets. Currently, the testing only covers until CodeGen (i.e. no coverage of ISel or machine passes), but that’s something I’m looking into at the moment - there’s no fundamental limitation, it just requires some extra work.