mul nuw
+lshr exact
should fold to a single multiplication (when the latter is a factor) · Issue #54824 · llvm/llvm-project (original) (raw)
Alive for the specific example, with opt+llc: https://alive2.llvm.org/ce/z/7oofsh
+label: llvm:optimizations
I tried the following:
define i64 @src(i64 noundef %0) { start: %1 = mul nuw nsw i64 %0, 52 %2 = lshr exact i64 %1, 2 ret i64 %2 }
Since 52 = 4 * 13
I expected to see that fold down to just a single mul
:
define i64 @tgt(i64 noundef %0) { start: %1 = mul nuw nsw i64 %0, 13 ret i64 %1 }
But it doesn't -- the mul
and lshr
are left in the optimized IR, and aren't folded by the target either:
src: # @src imul rax, rdi, 52 shr rax, 2 ret
Whereas the single-multiplication version ends up not even needing an imul
:
tgt: # @tgt lea rax, [rdi + 2rdi] lea rax, [rdi + 4rax] ret
So in general, %1 = mul nuw %0, CONST1; %2 = lshr exact %1, CONST2
should simplify to a single mul nuw
whenever CONST1
is a multiple of 1 << CONST2
.
In fact, it looks like this already happens in opt
if written with div
: https://alive2.llvm.org/ce/z/L-VouQ
%1 = mul nuw i64 %0, 52 %2 = udiv exact i64 %1, 4 => %1 = mul nuw nsw i64 %0, 13
So that code path just needs to take into account places where the udiv -> lshr
transformation had already happened.