[llvm-dev] [PATCH] D26127: [MemorySSA] Repair AccessList invariants after insertion of new MemoryUseOrDef. (original) (raw)

Daniel Berlin via llvm-dev llvm-dev at lists.llvm.org
Sun Oct 30 17:49:46 PDT 2016


On Sun, Oct 30, 2016 at 5:03 PM, Bryant Wong < 3.14472+reviews.llvm.org at gmail.com> wrote:

To give this a bit of context, this patch stems from issues that I've encountered while porting MemCpyOpt to MSSA.

Okay. I'm not sure i would try to port instead of just rewrite. The whole goal of MemorySSA is to enable us to write memory optimizations in non-N^2 ways.

If you just replace one querying with the other querying, this is not likely to give you this result. It may be faster (or slower).

MemCpyOpt currently iterates in no particular order. It should be possible to do it all in a single iteration, either bottom up or top down.

Many of the transforms in this pass

create entirely new memsets, memcpys, and memmoves (among others). After each new insertion, I expect to also have to update MSSA since 1) MemorySSAWrapperPass is delcared preserved to the pass manager, and 2) subsequent MCO iterations which depend on MSSA need to be able to see, analyze, and possibly further transform these newly added meminsts.

Can you give an example of a case where, processing in top-down order, going from defs to uses, it's not possible to do this in a single iteration?

I understand why, giving a random ordering it does it in now, and no use-def/def-use chains, it cannot do so. I cannot, off the top of my head, think of a case it would not be possible to do in a single iteration.

Here's an example: MemCpyOptPass::tryMergingIntoMemset will convert the sequences of stores to %P into a memset:

define void @foo(i64* nocapture %P, i64* %Q) { entry: %0 = bitcast i64* %P to i16* %arrayidx = getelementptr inbounds i16, i16* %0, i64 1 %1 = bitcast i16* %arrayidx to i32* %arrayidx1 = getelementptr inbounds i16, i16* %0, i64 3 ; 1 = MemoryDef(liveOnEntry) store i16 0, i16* %0, align 2 ; 2 = MemoryDef(1) store i32 0, i32* %1, align 4 ; 3 = MemoryDef(2) store i16 0, i16* %arrayidx1, align 2 ; 4 = MemoryDef(3) store i64 0, i64* %Q ret void } Specifically, a memset is first inserted right before the store to %Q: define void @foo(i64* nocapture %P, i64* %Q) { entry: %0 = bitcast i64* %P to i16* %arrayidx = getelementptr inbounds i16, i16* %0, i64 1 %1 = bitcast i16* %arrayidx to i32* %arrayidx1 = getelementptr inbounds i16, i16* %0, i64 3 ; 1 = MemoryDef(liveOnEntry) store i16 0, i16* %0, align 2 ; 2 = MemoryDef(1) store i32 0, i32* %1, align 4 ; 3 = MemoryDef(2) store i16 0, i16* %arrayidx1, align 2 call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 8, i32 2, i1 false) ; new ; 4 = MemoryDef(3) store i64 0, i64* %Q ret void } Next, a new MemoryAccess is created by passing to createMemoryAccessBefore the memset instruction, 3 = MemoryDef(2) (the new access's defining access), and 4 = MemoryDef(3) (the new access's insertion point).

This won't work, of course.

However, this memset is being created to eliminate stores, no? What stores are you eliminating?

If it it's the stores at 1, 2, and 3, then for sure, the correct defining access is not "3 = MemoryDef".

It's going to be "liveOnEntry" after you kill the stores.

Then, RAUW on each of the MemoryDef's works fine, and you removeAccess them.

define void @foo(i64* nocapture %P, i64* %Q) { entry: %0 = bitcast i64* %P to i16* %arrayidx = getelementptr inbounds i16, i16* %0, i64 1 %1 = bitcast i16* %arrayidx to i32* %arrayidx1 = getelementptr inbounds i16, i16* %0, i64 3 ; 1 = MemoryDef(liveOnEntry) store i16 0, i16* %0, align 2 ; 2 = MemoryDef(1) store i32 0, i32* %1, align 4 ; 3 = MemoryDef(2) store i16 0, i16* %arrayidx1, align 2

; 5 = MemoryDef(3) call void @llvm.memset.p0i8.i64(i8* %2, i8 0, i64 8, i32 2, i1 false) ; new ; 4 = MemoryDef(3) store i64 0, i64* %Q ret void } This is already problematic because there's no way to selectively update the post-5 MemoryDefs that point to 5 instead of 3. Calling RAUW -- specifically, replacing all uses of 3 with 5 -- would result in this:

I'm aware the existing APIs will not do what you want, because, as i said, they are specifically and documented to only be meant for replacement.

When i originally had API's to allow for more direct manipulation, folks instead said "please build utilities to make common transforms work".

Thus, the existing API's make replacement work.

No amount of futzing is going to make them work well for insertion.

If you want on the fly updating for stores, where you aren't doing replacement, that's quite expensive.

MemCpyOpt is mostly doing replacement, however, and you can make them work if you actually do replacement as replacement.

This likely requires rethinking memcpyopt a bit. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161030/f43c30bb/attachment.html>



More information about the llvm-dev mailing list