[LLVMdev] RFC: A plan for stateful alias analysis in LLVM (original) (raw)

Hal Finkel hfinkel at anl.gov
Thu Jul 16 13:35:29 PDT 2015


----- Original Message -----

From: "Pete Cooper" <petercooper at apple.com> To: "Hal Finkel" <hfinkel at anl.gov> Cc: "Chandler Carruth" <chandlerc at google.com>, "Daniel Berlin" <dannyb at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> Sent: Thursday, July 16, 2015 3:32:43 PM Subject: Re: [LLVMdev] RFC: A plan for stateful alias analysis in LLVM

> On Jul 16, 2015, at 12:36 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Chandler Carruth" <chandlerc at google.com> >> To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Hal >> Finkel" <hfinkel at anl.gov>, "Daniel Berlin" >> <dannyb at google.com> >> Sent: Monday, July 13, 2015 9:59:10 PM >> Subject: RFC: A plan for stateful alias analysis in LLVM >> >> >> >> # Front matter # First, I want to emphasize that this is not a >> short-term plan. This is a long-term plan. When I was looking at >> refactoring the existing Alias Analysis infrastructure in LLVM in >> ways that would break stateful AA, Hal Finkel asked me to figure >> out >> how things would work with the new pass manager and make sure that >> was sane. >> >> >> This plan assumes the new pass manager. There is nothing deeply or >> intrinsically tied to it, but the pieces will fit together >> naturally >> there, and I don't see how to make them fit naturally in the >> existing pass manager. I've not invested a lot of time in figuring >> that out as it seems a sunk cost. >> >> >> # Core ideas of what stateful AA should look like: # >> 1) An alias analysis is an analysis pass, and if it needs to >> build >> up state, it must do so for a "unit" of IR at a time (using the >> new >> PM's terminology). >> 2) Per-IR-unit invalidation from the pass manager is the primary >> mechanism to clear out stale (and thus invalid) state from any >> alias >> analysis. That means that most passes will not preserve AA by >> default (unlike today). >> 3) To be reasonably scalable, most alias analyses which retain >> state >> will need to handle these invalidation events in a way that >> gracefully degrades their results rather than wiping out all >> state. >> >> >> I think everything in-tree but GlobalsModRef fits these >> requirements >> already. Notably, this will work well for CFL, Andersen, or >> generally any "intersection" based AA. A "union" based AA could >> work >> here, but would have to do considerable work to be able to >> tolerate >> updates that "invalidate" its state without simply forced >> recomputation. I think this is a reasonable tradeoff as it >> preserves >> our ability to experiment, while simplifying the update model. >> >> >> The rest of this email contains much more detailed musing on this. >> Feel free to stop reading when necessary. >> >> >> The last section touches on the core functionality that would be >> needed to support something like GlobalsModRef in its current >> form. >> I'm much less confident in this part, so it's not part of my core >> requirements above. >> >> >> # Details about analysis passes and pass manager invalidation # >> My thought is that an alias analysis pass in the new model would >> be >> an analysis pass. It would produce an analysis result that >> contained >> any state necessary to vend AA queries. This result would be >> cached >> just like all the other analysis results are in the pass manager. >> It >> would have the same invalidation rules too. >> >> >> A really important consequence of this is that analysis results in >> the new scheme get notified of invalidation events and can attempt >> to update their internal state to avoid becoming invalid. AAs >> can >> leverage this to do fast incremental updates to their internal >> structures if appropriate. >> >> >> An AA could even use ValueHandles now to automatically stay >> up-to-date with a conservatively correct data structure, and just >> drop all the invalidation requests. In fact, I suspect many >> stateful >> AAs will take exactly this approach. >> >> >> # How will users of AA information query it? # >> This will work much like the normal analysis queries. I plan to >> provide an alias analysis manager at each IR unit that will >> provide >> a single query interface for code to use. This layer can then be >> configured to query any AnalysisManager for an IR unit T for >> the >> relevant actual AliasAnalysis implementations. >> >> >> The AA manager layer will be responsible for the delegation that >> is >> currently done through chaining of AA implementations. It will >> also >> be responsible for enforcing that queries at a particular layer >> don't violate IR unit abstraction boundaries. For example, you >> cannot query a FuntionAAManager for alias information of two >> Values >> from two different functions. > > One thing I'd like to clarify is what is responsible for > 'requiring' the various alias-analysis pass, so that they'll be > recomputed if they're invalidated. So if I have an AliasManager, > and I've requested that it manage a BasicAA and an SCEVAA (for > example), and something happens such that SCEVAA is invalidated, > then before future AA queries are processed, the AliasManager is > responsible for triggering the re-computation of SCEVAA as part of > the chaining process (or before the first AA query). That's my > expectation anyway; does that match this model? That could get expensive if we keep recomputing AA we end up not needing. We might need to try find a way to be lazier about it.

I don't disagree, but to be clear, I'm envisioning some version of SCEVAA that is otherwise efficient enough to be used within the standard pipeline.

-Hal

Take TBAA for example, if it required any kind of computation, it would be better to delay it until we actually query AA on a load/store which has the metadata. I’m pretty sure TBAA doesn’t have state to recompute, but imagine it did and hopefully what i said makes sense. Cheers, Pete > > Thanks again, > Hal > >> >> >> # Why GlobalsModRef doesn't fit, and what we could do about it # >> There is a fundamental construct in GlobalsModRef that exists >> today, >> and seems somewhat useful, but which doesn't fit the above: using >> a >> global analysis of which memory locations are not escaped by any >> function, in order to conclude that within a given function, >> arguments, returns, and loads from globals cannot produce an >> aliasing memory location. Thus, if you always directly index or >> manipulate a global within a function, and it never escapes that >> function (including through integer hacks), you know that some >> other >> function can't produce an aliasing location by loading a pointer >> from a global or receiving a pointer in an argument. >> >> >> In practice this is actually totally reasonable. Passes don't >> run >> around escaping pointers just because they feel like it. However, >> we >> don't have a good way to really formally reason about such things. >> >> >> My suggested way to reason about it is to define a new constraint >> on >> transformation passes over an IR unit T (ie, Function, Module, >> etc.). This constraint is that a pass may not observably escape >> a >> pointer value P from the unit being transformed. Now, "observably" >> is the key word here. The goal is to still allow fundamental >> transformations like Reg2Mem and instrumentation passes. However, >> the guarantee must be that the passes do not introduce new ways to >> observe an escape of a pointer. Because any actual escapes cannot >> be >> observed, it is safe for them to happen and for some other >> function >> pass (for example) to transform based on the non-escaping property >> computed prior to the transformation. I think this is a >> reasonable >> model, but I've not really had enough time thinking about it to be >> 100% certain. >> >> >> With such a model, we could regain most of the power of >> GlobalsModRef >> w.r.t. non-escaping pointers and interprocedural aliasing >> inference. >> We would change it to be a two-pronged analysis. On one hand, a >> local analysis would have to prove that one location came from a >> module-level memory location (global) and the other location came >> from some dynamic input to the function (argument, return, or a >> load >> of memory that is def'ed outside the function), and then the other >> hand would provide the interprocedural information about what >> global >> memory locations were non-escaping. >> >> >> >> >> Ok, those are all the thoughts I have currently about AA, and >> where >> I'm headed. This should hopefully provide context for some of the >> upcoming patches. >> >> >> I'm also going to write a separate email about specific tactical >> problems I've found with GlobalsModRef today (nothing to do with >> the >> above theoretical problems) and how I plan to approach them. >> >> >> Thanks, >> -Chandler > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory _> ________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory



More information about the llvm-dev mailing list