Hi!

I would appreciate some feedback from someone with experience in SCEV/SE. D39346 tries to fix an issue in LV (PR34965) that exposes a limitation in SCEV/SE. The best solution to the LV issue might not be a fix at the SCEV/SE level, but we may want to report or address the SCEV/SE limitation as well.

For the snippet below, LV expects SE to return a SCEVAddRecExpr for %21. However, SE returns ((4 * (zext i32 (2 + %18) to i64)) + undef), probably because it is not able to deduce that %bc.resume.val = %bc.resume.val1 + 1 on both incoming edges of scalar.ph. You can find further information in D39346 and the full test in PR34965 (“IR_after_createVectorizedLoopSkeleton” attachment).

I could open a new bug against SCEV/SE if a fix or extension is feasible at that level.

Thanks,

Diego

----

.outer: ; preds = %2, %0
  %.ph2 = phi i32 [ 62, %2 ], [ 110, %0 ]
  %5 = add i32 %.ph2, 1

vector.ph: ; preds = %.outer
  %ind.end = add i32 %5, %n.vec
  %ind.end2 = add i32 %.ph2, %n.vec

middle.block: ; preds = %vector.body.latch

scalar.ph: ; preds = %middle.block, %.outer
  %bc.resume.val = phi i32 [ %ind.end, %middle.block ], [ %5, %.outer ]
  %bc.resume.val1 = phi i32 [ %ind.end2, %middle.block ], [ %.ph2, %.outer ]
  br label %16

; <label>:16: ; preds = %16, %scalar.ph
  %17 = phi i32 [ %bc.resume.val, %scalar.ph ], [ %23, %16 ]
  %18 = phi i32 [ %bc.resume.val1, %scalar.ph ], [ %17, %16 ]
  %19 = add i32 %18, 2
  %20 = zext i32 %19 to i64
  %21 = getelementptr inbounds i32, i32 addrspace(1)* undef, i64 %20
  %22 = ashr i32 undef, %4
  store i32 %22, i32 addrspace(1)* %21, align 4
  %23 = add i32 %17, 1
  %24 = icmp sgt i32 %23, 61
  br i1 %24, label %._crit_edge.loopexit.preheader, label %16
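
To make the missing deduction concrete, here is a minimal Python sketch that simulates the scalar loop from the snippet above with 32-bit wraparound arithmetic. It is only an illustration, not how SE reasons internally. The function names, the `n_vec` argument, and any start value other than 62 or 110 are hypothetical choices made for the example. The point it tries to show is that, on either incoming edge of scalar.ph, %17 - %18 is 1 on entry and stays 1 on every iteration, so %18 advances by 1 per iteration just like %17.

```python
MASK = 0xFFFFFFFF  # i32 arithmetic wraps modulo 2**32


def to_signed(x):
    """Interpret a 32-bit pattern as a signed i32 (needed for icmp sgt)."""
    return x - (1 << 32) if x & 0x80000000 else x


def simulate_scalar_loop(ph2, n_vec, from_middle_block, max_iters=1000):
    """Return the (%17, %18) pairs seen on each iteration of block %16.

    ph2 plays the role of %.ph2 and n_vec the role of %n.vec; both are
    plain integers chosen by the caller (hypothetical inputs).
    """
    v5 = (ph2 + 1) & MASK                    # %5 = add i32 %.ph2, 1
    if from_middle_block:                    # edge %middle.block -> %scalar.ph
        resume = (v5 + n_vec) & MASK         # %ind.end
        resume1 = (ph2 + n_vec) & MASK       # %ind.end2
    else:                                    # edge %.outer -> %scalar.ph
        resume, resume1 = v5, ph2
    v17, v18 = resume, resume1               # phi values on loop entry
    trace = []
    for _ in range(max_iters):
        trace.append((v17, v18))
        v23 = (v17 + 1) & MASK               # %23 = add i32 %17, 1
        if to_signed(v23) > 61:              # %24 = icmp sgt i32 %23, 61
            break
        v17, v18 = v23, v17                  # both phis update together
    return trace


def lags_by_one(trace):
    """True if %17 - %18 == 1 (mod 2**32) on every recorded iteration."""
    return all(((a - b) & MASK) == 1 for a, b in trace)
```

For example, `lags_by_one(simulate_scalar_loop(10, 8, True))` and `lags_by_one(simulate_scalar_loop(62, 0, False))` both hold. With the constants that appear in the IR (62 and 110) and the vector loop skipped, the exit test fires after the first iteration, so a smaller hypothetical start value is needed to see the recurrence over several steps.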