[llvm-dev] Question on induction variable simplification pass (original) (raw)
Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org
Thu Apr 13 20:34:30 PDT 2017
- Previous message: [llvm-dev] Question on induction variable simplification pass
- Next message: [llvm-dev] Question on induction variable simplification pass
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Pankaj,
On April 12, 2017 at 5:22:45 PM, Chawla, Pankaj via llvm-dev (llvm-dev at lists.llvm.org) wrote:
It looks like the induction variable simplification pass prefers doing a zero-extension to compute the wider trip count of loops when extending the IV. This can sometimes result in loss of information making ScalarEvolution's analysis conservative which can lead to missed performance opportunities.
For example, consider this loopnest- int i, j; for(i=0; i< 40; i++) for(j=0; j< i-1; j++) A[i+j] = j;
We are mainly interested in the backedge taken count of the inner loop. Before indvars, the backedge information computed by ScalarEvolution is as follows- Outer loop- backedge-taken count is 39 max backedge-taken count is 39 Inner loop- backedge-taken count is {-2,+,1}<%for.cond1.preheader> max backedge-taken count is 37 After indvars, the backedge information computed by ScalarEvolution is as follows- Outer loop- backedge-taken count is 39 max backedge-taken count is 39 Inner loop- backedge-taken count is (-1 + (zext i32 {-1,+,1}<%for.cond1.preheader> to i64)) max backedge-taken count is -1
One approach is to use the facts:
- The inner loop will not be entered in the 0th iteration of <%for.cond1.preheader> - {-1,+,1}<%for.cond1.preheader> is s< 40
to simplify the above to {-2,+,1}<%for.cond1.preheader> (in i64). The original expression was not -2 in the 0th iteration of <%for.cond1.preheader>, but we don't care about that iteration of <%for.cond1.preheader> since we won't enter the inner loop.
The other option is to widen of IVs late, as a "lowering" transformation, like Hal said. That's a more invasive change, but if you have time and resources, it would be nice to at least give it a shot, measure and see what falls over.
If we generate sext instead of zext, the information computed by ScalarEvolution is as follows-
Outer loop- backedge-taken count is 39 max backedge-taken count is 39 Inner loop- backedge-taken count is {-2,+,1}<%for.cond1.preheader> max backedge-taken count is -2 We now have a simplified backedge taken count for the inner loop which is almost as precise as before indvars, except for the flag instead of flag. I think ScalarEvolution
(JFYI: My mail client's compose ate the and )
Can you please share the IR that you piped through SCEV?
My guess is that SCEV did not "try to" infer a more aggressive no-wrap flag for {-2,+,1} -- most of the no-wrap inferring logic kicks in when you try to sign/zero extend an add recurrence.
One suspicious bit here is the "max backedge-taken count is -2" bit. I expected SCEV to have inferred the max trip count to be 37 in this second case.
-- Sanjoy
- Previous message: [llvm-dev] Question on induction variable simplification pass
- Next message: [llvm-dev] Question on induction variable simplification pass
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]