Hi!
I would appreciate some feedback from someone with experience in SCEV/SE. D39346 tries to fix an issue in LV (PR34965) that exposes a limitation in SCEV/SE. The best solution to the LV issue might not be a fix at the SCEV/SE level, but we may want to report/address the SCEV/SE limitation as well.
For the snippet below, LV expects SE to return a SCEVAddRecExpr for %21. However, SE returns ((4 * (zext i32 (2 + %18) to i64)) + undef), probably because it’s not able to deduce that %bc.resume.val = %bc.resume.val1 + 1. You can find further information in D39346 and the full test in PR34965 (“IR_after_createVectorizedLoopSkeleton” attachment).
I could open a new bug against SCEV/SE if a fix or extension is feasible at that level.
Thanks,
Diego
----
.outer: ; preds = %2, %0
…
%.ph2 = phi i32 [ 62, %2 ], [ 110, %0 ]
%5 = add i32 %.ph2, 1
…
vector.ph: ; preds = %.outer
…
%ind.end = add i32 %5, %n.vec
%ind.end2 = add i32 %.ph2, %n.vec
…
middle.block: ; preds = %vector.body.latch
…
scalar.ph: ; preds = %middle.block, %.outer
%bc.resume.val = phi i32 [ %ind.end, %middle.block ], [ %5, %.outer ]
%bc.resume.val1 = phi i32 [ %ind.end2, %middle.block ], [ %.ph2, %.outer ]
br label %16
; <label>:16: ; preds = %16, %scalar.ph
%17 = phi i32 [ %bc.resume.val, %scalar.ph ], [ %23, %16 ]
%18 = phi i32 [ %bc.resume.val1, %scalar.ph ], [ %17, %16 ]
%19 = add i32 %18, 2
%20 = zext i32 %19 to i64
%21 = getelementptr inbounds i32, i32 addrspace(1)* undef, i64 %20
%22 = ashr i32 undef, %4
store i32 %22, i32 addrspace(1)* %21, align 4
%23 = add i32 %17, 1
%24 = icmp sgt i32 %23, 61
br i1 %24, label %._crit_edge.loopexit.preheader, label %16
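----
To illustrate why %18 should behave like a unit-stride add recurrence, here is a rough Python sketch that simulates the phis in the snippet above with 32-bit wraparound. It is only a model of the IR, not of how SE analyzes it. The function names, the %n.vec value, and the trip count are assumptions picked for illustration.

```python
MASK = 0xFFFFFFFF  # model i32 wraparound arithmetic


def scalar_loop_values(ph2, n_vec, from_middle_block, trips):
    """Simulate block 16 and return (%bc.resume.val1, values taken by %18).

    ph2 stands for %.ph2, n_vec for %n.vec (arbitrary here), and
    from_middle_block selects which incoming edge feeds scalar.ph.
    """
    v5 = (ph2 + 1) & MASK            # %5 = add i32 %.ph2, 1
    ind_end = (v5 + n_vec) & MASK    # %ind.end = add i32 %5, %n.vec
    ind_end2 = (ph2 + n_vec) & MASK  # %ind.end2 = add i32 %.ph2, %n.vec

    # scalar.ph phis: on both edges, %bc.resume.val == %bc.resume.val1 + 1
    bc_resume_val = ind_end if from_middle_block else v5
    bc_resume_val1 = ind_end2 if from_middle_block else ph2

    v17, v18 = bc_resume_val, bc_resume_val1
    seen = []
    for _ in range(trips):
        seen.append(v18)
        v23 = (v17 + 1) & MASK       # %23 = add i32 %17, 1
        v17, v18 = v23, v17          # both phis update simultaneously
    return bc_resume_val1, seen


def is_unit_stride_addrec(start, values):
    """True if values[i] == start + i (mod 2^32), i.e. {start,+,1}."""
    return all(v == (start + i) & MASK for i, v in enumerate(values))


if __name__ == "__main__":
    for ph2 in (62, 110):
        for from_middle in (False, True):
            start, vals = scalar_loop_values(ph2, 8, from_middle, 5)
            print(ph2, from_middle, is_unit_stride_addrec(start, vals))
```

In this model, %18 always equals %bc.resume.val1 + i on iteration i, but only because the two resume values differ by exactly one on every incoming edge. That is the relationship SE would need to prove across the two scalar.ph phis.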