[llvm-dev] Writing loop transformations on the right representation is more productive (original) (raw)
Uday Kumar Reddy Bondhugula via llvm-dev llvm-dev at lists.llvm.org
Sat Feb 8 10:22:08 PST 2020
- Previous message: [llvm-dev] Writing loop transformations on the right representation is more productive
- Next message: [llvm-dev] Writing loop transformations on the right representation is more productive
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, 3 Feb 2020 at 12:06, Michael Kruse <llvmdev at meinersbur.de> wrote:
Am Do., 30. Jan. 2020 um 04:40 Uhr schrieb Uday Kumar Reddy Bondhugula <uday at polymagelabs.com>: > There are multiple ways regions in MLIR can be viewed, but the more relevant point here is you do have a loop tree structure native in the IR with MLIR. Regions in MLIR didn't evolve from modeling inlined calls - the affine.for/affine.if were originally the only two operations in MLIR that could hold blocks (which in turn are a list of operations as you know) and there wasn't anything by the name region. Later, "a list of blocks" was renamed "region" in order to generalize and unify it with other concepts that could be captured with "ops with regions", one of which is isomorphic to a "just inlined" call as you view it. But that doesn't mean a loop tree doesn't exist as a first class thing in the IR when you have the relevant ops around -- there is a hierarchy.
Thanks for the interesting insights into the development history of MLIR.
>> Regarding the affine dialect, I see the same problem that Polly has when creating a schedule tree representation: A lot of work has to be done to make IR originating from Clang compatible. Everything becomes easy if the front-end can generate an affine dialect out-of-the box. > > Right - but for the purposes of your proposal, this isn't really relevant - for that matter, one could just use the loop.for, loop.if ops if you don't want to leverage affine restrictions. Moreover, with affine.graybox ops, you can always use affine.for/if wherever you have structured loops (otherwise, you would fall back to a flat list of blocks inside the region of the graybox.) While directly generating the affine dialect maximally from the frontend / Clang is one option, the other is to just generate grayboxes with trivial affine.for/if (or just loop.for/if), and then eliminate the grayboxes maximally within MLIR. This way things are reusable across different frontends, and it would be similar to Polly's approach except that you would be dealing with loops/multi-dimensional arrays where possible instead of flat list of CFGs and GEPs. I think there is a relevant difference of whether we come from a high-level code generator and then lift restrictions or from a low-level IR which has to be raised. If starting with the high-level, we will have to bail out on representing things because we cannot ensure the expected semantics of the high-level idioms (while-, do-loops, coroutines, possibly infinite loops, non-returning elements in the loop body, ...) and have to work towards poking holes into the high-level representation that existing passes must me able to handle. When starting with a low-level approach, useful guarantees are added to the representation, but everything can be represented at the beginning. I am not saying that the two approaches cannot meet, but I am afraid that the high-level approach, like Polly, adds many bail-outs making it difficult to use in practice. For instance, we want to apply strip-mining to a loop. Since it does not change the execution order of any body code, it is always valid, yet we'd have to bail out if we cannot guarantee the representation's guarantees. I would like to avoid that. However, I agree that MLIR has the expressiveness requires for hierarchical loop structures. We don't think we need to argue about that. > That's actually not how I read it. Red/green trees was one of the nine items you mentioned in your list and this didn't come out as the central idea in your opening paras, but let's go with this now that it's clearer to me. Indeed, red/green trees (or DAGs) are one one of the ideas to improve loop optimizations, but does justify its use by itself. They happen to be effectively necessary for others in the list (e.g. versioning, profitiability heuristic by cost function, etc...) and the reason why I think the same cannot be done with MLIR. In hindsight, I could have pointed this out more in the original RFC. Note that a hierarchical representation was not an explicit feature in the list. To convince me that MLIR is the better IR for loop optimizations, might show that each of the features enabled by cheap subtree reuse can also be done sufficiently efficient and easily on MLIR (or that a feature is not what would actually would want).
I suspect that you are looking at this as "MLIR vs your proposal", but the point I've been trying to make is to compare "your proposal on MLIR" vs "your proposal on LLVM", i.e., you may want to evaluate the cost/benefit of hosting your approach on MLIR vis-a-vis LLVM while accounting for the fact that you'll need the conversion for LLVM to MLIR and back for the relevant IR units with the former. On this note, in my opinion, MLIR doesn't have a unified/clear strategy for loop optimization yet (even op representation wise); there are different ops/representations that are being pedalled by different folks, and some I believe are redundant or are simply duplicating infrastructure in less powerful ways. I'm saying this because if you choose MLIR to host your stuff, you'll have choices to make on which forms to use.
>> (which I firmly disagree with being close to MLIR's in-memory representation), not the textual format. > > "close" isn't the right word here, "closer" is! Would you agree that the representation you are proposing is closer to MLIR's representation (both its in-memory and its textual representation) than to LLVM's or is this proximity not really relevant for the purposes of your proposal? I think it's important to know which among the two is the more important question. I think MLIR is an evolution of LLVM-IR and Swift-IR, built around similar principles such as SSA and Control-Flow Graphs (I understand that in addition to CFGs, MLIR also enables structured control flow idioms). SSA is a distinguishing feature: It allows to quickly traversing def/uses (where compilers without it need a data-flow analyses), but make subtree reuse hard. Does this answer your question?
Not directly, but I think I have more information now. You are probably saying it's nearly as easy/difficult to host your approach on LLVM IR as it is on MLIR.
All the best!
~ Uday
> Note that currently there is really very little difference between MLIR's textual format for 'for'/if's and the in-memory form its passes use. The passes don't create any in-memory higher order representations for these IR units; they directly update them. There is nothing like the kind of complementary abstractions you are proposing (for cost models, copy-wise, etc.). The point I was making is that the in-memory format has references to related items such as parents and use-def chains. These are implicit in the textual format, e.g. the parent of an operation is defined by its syntactical location. When reading into memory, it is not obligatory for the objects to have all the references to related objects. Michael
-- Founder and Director, PolyMage Labs -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200208/67f62249/attachment.html>
- Previous message: [llvm-dev] Writing loop transformations on the right representation is more productive
- Next message: [llvm-dev] Writing loop transformations on the right representation is more productive
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]