[RFC] Profile Information Propagation Unittesting

Contributors: @xur @chandlerc @snehasish

Problem

Profile-guided optimization (PGO, aka feedback-directed optimization, FDO) is one of the main tools we have for improving the performance of generated code. Maintaining the profile information throughout compilation passes is critical to its effectiveness. For example, if a pass mutates the CFG but does not update the profile information, or accidentally removes it, downstream passes have to work with degraded profile information - analyses like BFI can try to "guess" the profile, but the original information is gone.
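To make the failure mode concrete, here is a minimal C++ sketch - not taken from any existing pass; rewriteBranch is a hypothetical helper - of a transform that rebuilds a terminator. The easy-to-forget line is the one that carries the !prof metadata over:

```cpp
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"

using namespace llvm;

// Hypothetical transform: replace a conditional branch with one on a new
// condition, keeping the same successors.
static void rewriteBranch(BranchInst *OldBr, Value *NewCond) {
  BasicBlock *BB = OldBr->getParent();
  BranchInst *NewBr = BranchInst::Create(OldBr->getSuccessor(0),
                                         OldBr->getSuccessor(1), NewCond, BB);
  // The easy-to-forget line: without it, the branch_weights are silently
  // lost and BFI downstream falls back to guessing.
  NewBr->copyMetadata(*OldBr, {LLVMContext::MD_prof});
  OldBr->eraseFromParent();
}
```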

This is further complicated because, in some cases, some profile information genuinely cannot be synthesized by the pass. For example: SimplifyCFGOpt::SimplifyBranchOnICmpChain may split a predicate into an early check that only covers part of that predicate (IIUC, something like "if (A && B) then <...> else <not block>" becomes "if A then <...> else <not block>") - but we won't know the edge probabilities for the new edges resulting from the "just check A" condition. See also D159322, D158642 or D157462.

We can't just rely on today's unit testing, because it requires both the patch author and the patch reviewers to know and care about profile information, and also to check that the values make sense.

Proposal

The proposal is to enhance today's unit testing, in 2 phases. The second phase is more exploratory, so this RFC mentions it mostly as food for thought and an invitation for suggestions. The first phase, however, should be fairly straightforward and provide a good chunk of value with little effort.

This is complementary to other efforts, like those discussed in D158889, which are about integration testing and probably require more involved infrastructure; the proposal here is meant as a "first line of defense".

Phase 1: just check profile info isn’t dropped

  1. Passes that create edges with unknowable weights must explicitly indicate that, using a new kind of metadata attached to the terminator, "MD_prof_unknown" (final name TBD) - see the sketch after this list.
  2. Add an option enabling a profile verifier that runs as part of opt and llc (note: only if llc takes IR as input; afaik profiles can't be captured in MIR input). After the module is loaded by opt/llc and before any passes run, the verifier injects profile information into the IR if it isn't already present; after the passes run, it checks that profile information is still there.
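A hedged sketch of what item 1 could look like for a pass author. Since the final metadata name is TBD, the string kind "prof_unknown" below is a placeholder, and markProfileUnknown is a hypothetical helper:

```cpp
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

// Tag a terminator whose new edges have genuinely unknowable weights.
static void markProfileUnknown(Instruction *Term) {
  LLVMContext &Ctx = Term->getContext();
  // An empty tuple is enough as a marker: per this proposal, the verifier
  // only tests for presence, not for values.
  Term->setMetadata("prof_unknown", MDNode::get(Ctx, {}));
}
```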

To stress this aspect: the verifier only checks that non-trivial terminators have either MD_prof or MD_prof_unknown attached to them at the end of the test. It doesn’t check for correct values.
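A sketch of that presence-only check, assuming the placeholder "prof_unknown" kind from above (verifyProfilePresence is hypothetical, not an existing LLVM API):

```cpp
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

static bool verifyProfilePresence(const Function &F) {
  bool OK = true;
  for (const BasicBlock &BB : F) {
    const Instruction *Term = BB.getTerminator();
    // "Non-trivial" here: more than one successor, so weights matter.
    if (Term->getNumSuccessors() < 2)
      continue;
    if (!Term->getMetadata(LLVMContext::MD_prof) &&
        !Term->getMetadata("prof_unknown")) {
      errs() << "dropped profile info in " << F.getName() << "\n";
      OK = false;
    }
  }
  return OK;
}
```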

There are, for our purposes, 3 kinds of tests under llvm/test: (1) the module has no profile information whatsoever (the vast majority); (2) the module comes with profile info metadata (~360 files); and (3) the profile info is loaded from a profile file (~50, sample + instrumented). (OK, I suppose there could be a "4th": a mix of 2 and 3.)

For the relatively few tests that have some profile embedded, we can start by skipping them in this verification (easy check: "if the module has a profile, don't insert more, and don't verify at the end"); and the very few that ingest profiles we can exclude manually. Later, we can figure out whether it's worth doing something special for these, but the functionality they test is unlikely to have the problem we're trying to detect (accidentally dropping profiles).
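The "easy check" could look something like the sketch below; moduleHasProfile is hypothetical, and using function entry counts is just one possible signal (tests that only carry branch_weights would need an additional scan of terminators):

```cpp
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// One possible signal that a test module already carries real profile data:
// entry counts come from instrumented or sample profiles.
static bool moduleHasProfile(const Module &M) {
  for (const Function &F : M)
    if (F.getEntryCount())
      return true;
  return false;
}
```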

For the rest, we could insert the synthetic weights currently computed by the profile analyses, or bias them. Since the goal at this stage is just to check that profile info isn't dropped, the synthetic ones are sufficient.
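One possible shape for the injection step, as a sketch: materialize the probabilities BranchProbabilityInfo already computes as explicit branch_weights before any passes run. injectSyntheticWeights is hypothetical, and encoding probabilities via their raw numerators is one arbitrary choice:

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/BranchProbabilityInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/MDBuilder.h"

using namespace llvm;

static void injectSyntheticWeights(Function &F, BranchProbabilityInfo &BPI) {
  MDBuilder MDB(F.getContext());
  for (BasicBlock &BB : F) {
    Instruction *Term = BB.getTerminator();
    // Skip trivial terminators and ones that already carry weights.
    if (Term->getNumSuccessors() < 2 ||
        Term->getMetadata(LLVMContext::MD_prof))
      continue;
    SmallVector<uint32_t, 4> Weights;
    for (unsigned I = 0, E = Term->getNumSuccessors(); I != E; ++I)
      Weights.push_back(BPI.getEdgeProbability(&BB, I).getNumerator());
    Term->setMetadata(LLVMContext::MD_prof,
                      MDB.createBranchWeights(Weights));
  }
}
```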

We would roll this out gradually.

Observations

Phase 2: check profile info “still makes sense”

To check that the profile information still "makes sense" - e.g. we kept the profile metadata, but forgot to flip the probabilities of a conditional when we flipped its condition - we can run a second step of checks over the control flow and accept variations within some margin. We'd probably need to use synthetic profiles that are more biased (e.g. 10%-90% probabilities for conditionals), so that real mistakes stand out clearly from benign variation.
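As a very rough sketch of what such a tolerance check could look like (all names and the margin are assumptions, not part of this proposal):

```cpp
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/BranchProbability.h"

using namespace llvm;

// Compare the taken-probability implied by the surviving branch_weights
// against the biased value that was injected (e.g. 90/10), with a margin.
static bool probabilityStillMakesSense(const Instruction &Term,
                                       BranchProbability Expected,
                                       unsigned MarginPercent = 5) {
  SmallVector<uint32_t, 2> Weights;
  if (!extractBranchWeights(Term, Weights) || Weights.size() != 2)
    return true; // nothing to compare against
  BranchProbability Actual = BranchProbability::getBranchProbability(
      Weights[0], Weights[0] + Weights[1]);
  BranchProbability Margin(MarginPercent, 100);
  // E.g. weights left unflipped after a flipped condition show up as a
  // ~80-point swing, far outside the margin.
  return Actual >= Expected - Margin && Actual <= Expected + Margin;
}
```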

As mentioned in the introduction, this phase is included here mostly as food for thought.