[llvm-dev] lifetime.start/end

Ralf Jung via llvm-dev llvm-dev at lists.llvm.org
Mon Feb 8 10:37:50 PST 2021


Hi Johannes,

I still believe folding can be done regardless of their lifetime, as long as it is done consistently. Fold all pointer observations and the actual placement is not observable; we went down that road.

Sure, but that's the kind of exception that always applies when discussing optimizations. But to make my statement more precise: you cannot fold comparison of escaped pointers to false unless the pointers are definitely not equal, e.g. if their lifetimes overlap.

The interesting part is, lifetimes change. I continue to argue you cannot fold if both pointers have escaped. If you do, and other things happen that fiddle with the lifetime, you might accidentally coalesce them (explicitly or implicitly).

Lifetimes can only change if that does not add new observable behavior. For example, if two lifetimes overlap in the original program, then shrinking them is not allowed, since that would make previously-definitely-unequal pointers now be potentially equal.

To give an example (using C syntax but I am thinking of LLVM semantics here):

    int *x = (int*)malloc(4);
    int *y = (int*)malloc(4);
    bool eq = (x == y);
    free(x);
    free(y);
    print(eq);

Here, moving the "free(x)" up to before the "int *y" would be incorrect since after that transformation, "eq" might be "true", which previously it never was.
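A runnable C rendering of the example above may help make the point concrete. This is only a sketch: the function names `eq_overlapping` and `eq_after_early_free` are made up for illustration, and the addresses are converted to `uintptr_t` before any `free` so that the comparison stays well defined in C (the example above is meant in LLVM semantics, not C semantics).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Original order: both allocations are live at the comparison,
   so their addresses must differ and the result is always false. */
bool eq_overlapping(void) {
    int *x = malloc(sizeof *x);
    int *y = malloc(sizeof *y);
    assert(x != NULL && y != NULL);
    bool eq = ((uintptr_t)x == (uintptr_t)y);
    free(x);
    free(y);
    return eq;
}

/* Transformed order: x is freed before y is allocated, so the allocator
   may hand out the same address again; the result may be true or false. */
bool eq_after_early_free(void) {
    int *x = malloc(sizeof *x);
    assert(x != NULL);
    uintptr_t x_addr = (uintptr_t)x; /* record the address while x is live */
    free(x);
    int *y = malloc(sizeof *y);
    assert(y != NULL);
    bool eq = (x_addr == (uintptr_t)y);
    free(y);
    return eq;
}
```

`eq_overlapping()` can only return false, whereas `eq_after_early_free()` may return either value depending on the allocator. That extra possible result is the new observable behavior that makes the reordering incorrect.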

My understanding is that lifetime markers are *explicitly intended to support coalescing* by permitting later phases to put allocas with disjoint lifetimes into the same stack slot. D93376 intended to keep it that way; fundamentally changing what lifetime markers are about by making them only about the content and not the address of the alloca seemed like a much more invasive change. Am I misunderstanding something?

I do think they were designed for that, yes. And I do agree my change looks more invasive from that perspective. Now, my question is whether D93376 helps with the problems we have. Or, asked differently: what is a way out of this that improves the big picture, and what is practical? For the first, limiting it to allocas seems to be the wrong way. So far, the argument seems to be that "we only use it for them", but I'm not sure why a use for other pointers would be somehow bad (https://godbolt.org/z/G8obj3). Second, limiting it to syntactic constructs is the opposite of practical. It has been said before: syntactic constraints are brittle. Beyond that, the syntactic def-use chain requirement has not even come up in any of these mails, at least as far as I understand it. It is still unclear to me how that would help.

The restriction to alloca arises (I think) entirely from the fact that some syntactic restriction is believed to be needed if the lifetime markers are supposed to be able to affect whether allocations (that have their addresses observed) may overlap in memory. For malloc, a syntactic restriction clearly makes no sense, so if a syntactic restriction is needed then only allowing alloca makes that much more feasible.

The "proper" way to do this would be to decorate the alloca with annotations saying for which parts of the function the allocation has to be live, and to use that to define which allocas may observably overlap in memory and which may not. This would have a rather clear operational semantics. But sadly that's not how the lifetime markers have been designed, so the syntactic restriction is a way to "recover" that kind of design: if each lifetime marker can be clearly associated with some alloca, then it's "as if" that marker was just an annotation at the alloca, and we can proceed as above.
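To illustrate what such annotations could look like, here is a minimal C model. Everything in it is hypothetical: `struct live_range`, `may_overlap_in_memory`, and the idea of numbering program points linearly are assumptions made for the sketch (real functions have control flow, so a real annotation would need something richer than one interval).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical annotation attached to an alloca: the half-open range
   [start, end) of program points during which it has to be live. */
struct live_range {
    unsigned start;
    unsigned end;
};

/* Two allocas may observably share an address only if their
   annotated live ranges are disjoint. */
bool may_overlap_in_memory(struct live_range a, struct live_range b) {
    assert(a.start <= a.end && b.start <= b.end);
    return a.end <= b.start || b.end <= a.start;
}
```

With this model, ranges `{0, 4}` and `{4, 8}` may share a slot, while `{0, 5}` and `{3, 8}` may not, and the answer depends only on the annotations rather than on where markers happen to sit in the instruction stream.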

(I do understand that these annotations would probably be not very practical, and one can view the markers as a convenient way for the frontend to convey those annotations to LLVM. But if that's what they are then they should be treated like that -- possibly by having some pre-processing pass that translates them to annotations and removes the markers, so later optimizations don't have to worry about this any more. Maybe this would be a possible fallback plan if your suggestion does not pan out.)

That all said, let's see what we need:
1) A way to ensure we don't create inconsistent situations wrt. allocation addresses where there were none.
2) A way to ensure we can reasonably coalesce allocas to minimize stack usage.

First and foremost we need a precise specification for what the operational behavior of any given LLVM IR program is -- and D93376 helps with that by proposing a clear operational behavior for the lifetime markers.

That was the original goal, anyway. (Or so I think -- I am not an author of that change proposal.) Since then the scope broadened once you pointed out that comparison folding is still tricky even after we have a precise semantics for allocation.

For one, I feel we converged pretty much on the fact that pointer comparison folding is a problem. In the example below we have lifetime markers and we use them to argue about lifetimes of the allocations, but we still agree the folding should not happen. Once the folding is restricted to pointer pairs for which at least one doesn't have its address observed, I would like to revisit 1). So far, I believe it would be resolved. For 2), we could stick with what we have now, or adopt something new afterwards. I guess the existing wording needs to be cleared up either way.

What we have now (in terms of specification for the lifetime markers) is, as far as I am concerned, a rather vague sketch that should be turned into something more precise and operational. D93376 helps fix that, but it turns out (I guess unsurprisingly) that there is disagreement about what the more precise semantics even should be.

Note that we do not need to talk about coalescing here, we are just exploiting the fact that allocation is non-deterministic and that allocations with non-overlapping lifetime can be in the same spot or in different spots.

Sure. When I referred to coalescing I was also including "implicit coalescing", e.g., as happened in my outlining example.

Oh I see. The question of "is this an allowed program transformation" and "is this an allowed execution" are closely related of course, but I still usually think of them very differently, in particular when there is lots of non-determinism as in our case.

After all, there's a difference between "this comparison might yield false at runtime" and "the compiler can replace this comparison by false" -- as we just discussed.
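The distinction can be spelled out with a toy C model. This is a deliberately simplified sketch, not LLVM's actual rule set: the names `might_yield` and `can_fold_to` and the bitmask encoding are invented here, and `can_fold_to` encodes only the rule stated earlier in this mail for escaped pointers (fold to a constant only when the comparison definitely has that value).

```c
#include <assert.h>
#include <stdbool.h>

/* Set of results a non-deterministic comparison may produce. */
enum outcomes {
    MAY_BE_FALSE = 1 << 0,
    MAY_BE_TRUE  = 1 << 1
};

/* "This comparison might yield `value` at runtime":
   `value` is one of the possible outcomes. */
bool might_yield(unsigned possible, bool value) {
    assert(possible != 0 && (possible & ~(MAY_BE_FALSE | MAY_BE_TRUE)) == 0);
    return (possible & (value ? MAY_BE_TRUE : MAY_BE_FALSE)) != 0;
}

/* "The compiler can replace this comparison by `value`" (for escaped
   pointers, per the rule above): `value` is the only possible outcome. */
bool can_fold_to(unsigned possible, bool value) {
    assert(possible != 0 && (possible & ~(MAY_BE_FALSE | MAY_BE_TRUE)) == 0);
    return possible == (unsigned)(value ? MAY_BE_TRUE : MAY_BE_FALSE);
}
```

For a comparison whose outcome set is `MAY_BE_FALSE | MAY_BE_TRUE`, `might_yield(..., false)` holds but `can_fold_to(..., false)` does not; the two coincide only when the outcome set is a singleton.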

I am basically struggling to understand why we need to introduce syntactic restrictions on lifetime markers and why they need to be limited to allocas. Those are two concepts included in D93376, right?

I gave my view on this above.

I agree that making the markers just erase the contents of the alloca is the simplest proposal on the table. If we truly can make LLVM follow that model, that would be very nice. I would just be very surprised if it was possible to get this proposal past the people that will carefully watch over the existing optimization potential LLVM has, and make sure it does not become smaller. ;) Maybe I am too pessimistic, but I would never have dared to even make such a proposal, as I would expect it to be shot down. I'll be happy to be proven wrong about this. :)

I've made a few actually, mustprogress was the latest and you were involved ;). It does prevent certain optimizations we did before unconditionally but it also fixes a conceptual problem.

That's fair! And thanks for that btw. :)

At least the pointer inconsistency is a conceptual problem that needs fixing. What we do with lifetime markers is probably orthogonal.

I feel like there's also a conceptual problem with lifetime markers, because their spec is so vague that it becomes hard to say in general what their observable effect on program behavior is. Fundamentally, that's what I am most interested in fixing. I am not sure to what extent I can help with the pointer inconsistency -- the spec is fairly clear there, it seems, so if that's a transformation LLVM is currently doing, then it probably just has to stop doing that?

Kind regards, Ralf

Website: https://people.mpi-sws.org/~jung/


