LLVM optimizations during PGOs

February 7, 2025, 11:55pm 1

Hi,

I am working on PGO. How can I check which optimizations have been applied for PGO? Which optimizations use profile information when they run? How can I get the names of the optimizations applied during PGO?

Any help is appreciated.

I don’t believe there are any additional optimization passes that get enabled when PGO profiles are available. It’s simply that existing passes now have access to additional data to make more informed decisions. Take register allocation as an example: profile data allows for more accurate cost modelling of spill weights (trying to keep variables in hot blocks in registers). The best way to find passes that use PGO information would probably be to grep around the codebase/use an IDE to find consumers of the BlockFrequencyInfo and BranchProbabilityInfo analyses.
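For illustration, here is a minimal new-pass-manager sketch of how such a consumer obtains those analyses (the pass name is made up; BlockFrequencyAnalysis and BranchProbabilityAnalysis are the analysis IDs you would grep for):

```cpp
// Minimal sketch, not an actual LLVM pass: a new-pass-manager function pass
// requesting the profile-driven analyses. Both fall back to static
// heuristics when no profile is attached.
#include "llvm/Analysis/BlockFrequencyInfo.h"
#include "llvm/Analysis/BranchProbabilityInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

// "MyProfileAwarePass" is a hypothetical name used only for illustration.
struct MyProfileAwarePass : PassInfoMixin<MyProfileAwarePass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM) {
    BlockFrequencyInfo &BFI = FAM.getResult<BlockFrequencyAnalysis>(F);
    BranchProbabilityInfo &BPI = FAM.getResult<BranchProbabilityAnalysis>(F);
    (void)BFI;
    (void)BPI; // a real pass would query these to bias its decisions
    return PreservedAnalyses::all();
  }
};
```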

There will be a couple of additional passes added depending upon what PGO mode you are using, especially when doing an instrumented build. These passes handle adding instrumentation (setting up counters, incrementing counters, lowering intrinsics) rather than actually using any of the information, though.

Finding the exact pass pipeline used can be done with -mllvm -print-pipeline-passes on clang (or the equivalent in whatever frontend you are working with).

nikic February 8, 2025, 8:51am 3

We do have a number of passes that only run if PGO is enabled (or, more precisely, they always run, but are no-ops without PGO). Two examples are ControlHeightReduction and LoopSink.

soma_p February 8, 2025, 2:03pm 4

Thanks for your response.

Is there any way to see which passes execute and which do not while building the benchmark program with Clang?
Also, in the code base, which files make the decisions about what is run during PGO?

Thanks for your help .

Ah, didn’t realize that. Thanks for the correction.

You can see which passes change the IR with -mllvm -print-changed on clang (or see the IR after every pass with -mllvm -print-after-all), but that won’t tell you what actually gets executed. Technically all the passes (even the ones that skip running due to a lack of profile information) are executing; they just might be no-ops. I don’t think there’s any standard reporting mechanism in LLVM for these passes to advertise that they are not doing anything due to a lack of profile information.

soma_p February 14, 2025, 10:24pm 6

I have checked the BlockFrequencyInfo and BranchProbabilityInfo files, and I understand these are analysis passes. I am researching PGO (Profile-Guided Optimization) and other optimization techniques.

When collecting profiles, we obtain information such as function execution counts and basic block execution counts. However, I believe these are not the only pieces of information collected in the profile. The instrumented profile generated using -fprofile-generate is passed to the analysis module, which further derives additional profile data, such as:
- Path/edge profiles
- Call-site execution counts
- Instruction execution counts
- Data variable access counts
There may be other profiling metrics as well.

After processing, the collected profile data is summarized into a “profile summary”, which is then used for optimizations.

My Questions:

  1. What profiling counters does Clang insert into the final profile summary, and how can I get the final profile summary?
  2. How many optimization/transformation passes in LLVM use profile data for optimizations?

I am also looking at the LLVM code base and have a very initial idea of the workflow, but it would help if anyone could describe the correct workflow: how the profile data is read, and how optimizations are then applied based on the information in the profile.

Any help is appreciated.

ellishg February 18, 2025, 5:11pm 7

At a high level, -fprofile-generate injects code to track block counts for a subset of a function’s blocks. In PGOInstrumentation.cpp, this profile data is used to determine branch probabilities for all branches in a function. In the LLVM tests, you’ll see that the !prof metadata record holds this info.
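As a small sketch of how that metadata can be read back in C++ (assuming a recent LLVM that provides llvm/IR/ProfDataUtils.h; the helper name is made up):

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/ProfDataUtils.h"

using namespace llvm;

// Hypothetical helper: returns true and fills Weights if the terminator of
// BB carries !prof !{!"branch_weights", ...} metadata.
static bool getBranchWeights(const BasicBlock &BB,
                             SmallVectorImpl<uint32_t> &Weights) {
  const Instruction *Term = BB.getTerminator();
  if (!Term)
    return false;
  // extractBranchWeights copies the per-successor counts off the instruction.
  return extractBranchWeights(*Term, Weights);
}
```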

This metadata is used by BranchProbabilityInfo.h to compute the probability a branch is taken for each branch. If profile data is not available, I believe this will be inferred from the CFG.

Then, this data is used by BlockFrequencyInfo.h to compute each block frequency.

LLVM passes can use this data with
(Machine?)BranchProbabilityInfo::getEdgeProbability() and (Machine?)BlockFrequencyInfo::getBlockFreq(). It’s really tough to see how profile data impacts optimization because these functions are used throughout LLVM. Sometimes there are flags to disable profile usage for a particular pass, but not always.
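As a rough sketch of the query side (assuming a recent LLVM where getBlockProfileCount returns std::optional; the function and its printing are illustrative only, not something a pass would actually do):

```cpp
#include "llvm/Analysis/BlockFrequencyInfo.h"
#include "llvm/Analysis/BranchProbabilityInfo.h"
#include "llvm/IR/CFG.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/BranchProbability.h"
#include "llvm/Support/raw_ostream.h"
#include <optional>

using namespace llvm;

// Illustrative dump of the profile view a pass sees through BFI/BPI.
static void dumpProfileView(const Function &F, const BlockFrequencyInfo &BFI,
                            const BranchProbabilityInfo &BPI) {
  for (const BasicBlock &BB : F) {
    // Relative frequency: always available (profile-based or heuristic).
    uint64_t Freq = BFI.getBlockFreq(&BB).getFrequency();
    // Absolute count: only present when the function has a profile entry count.
    std::optional<uint64_t> Count = BFI.getBlockProfileCount(&BB);
    errs() << BB.getName() << ": freq=" << Freq
           << " count=" << (Count ? *Count : 0) << "\n";
    for (const BasicBlock *Succ : successors(&BB)) {
      BranchProbability Prob = BPI.getEdgeProbability(&BB, Succ);
      errs() << "  -> " << Succ->getName() << " prob=";
      Prob.print(errs());
      errs() << "\n";
    }
  }
}
```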

ellishg February 18, 2025, 5:13pm 8

Oh, and I just remembered that there are in fact passes that only run when PGO is used.

To reduce the instrumentation overhead, a preinliner pass is run when -fprofile-generate or -fprofile-use is used, but this can be disabled via the -disable-preinline LLVM flag.

ellishg February 18, 2025, 5:16pm 9

Even if -fprofile-use isn’t used, I think BPI infers probabilities from the CFG. Do you know if these passes are still no-ops in this case?

nikic February 18, 2025, 5:19pm 10

Yes. Generally, we only tend to make use of BPI/BFI in the presence of a profile summary, with inlining being the main exception to that.
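In code terms, the gating pattern looks roughly like this (a minimal sketch; the helper is hypothetical, and in a real pass PSI would come from the analysis manager):

```cpp
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Hypothetical helper inside a transformation pass: do PGO-specific work
// only when a profile summary is attached to the module.
static bool shouldRunPGOSpecificWork(const Function &F,
                                     ProfileSummaryInfo &PSI) {
  if (!PSI.hasProfileSummary())
    return false; // without a profile the pass effectively becomes a no-op
  // Optionally restrict the transform further, e.g. to profile-hot functions.
  return PSI.isFunctionEntryHot(&F);
}
```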

soma_p February 19, 2025, 6:11pm 11

Hi,

This is a great help. Thank you for your response.

Is there any documentation from LLVM on what profiling counters are inserted by Clang?
Also, how many optimization/transformation passes use profile data?

From what I understood, all the optimizations run during profiling, but some additional passes also run, namely those set up in PassBuilderPipelines.cpp behind PGO-related flags.
I even ran with -debug-pass=Structure and got a list of all the passes. I saw that all the passes run, but not all passes have an effect on profiling. However, I want to know which transformation passes use profile data.

Is there any developer note or documentation from LLVM on this?