Stop generating allocas & memcmp for simple short array equality by scottmcm · Pull Request #85828 · rust-lang/rust
Added the implementation for cranelift.
I didn't bother doing anything too fancy, but it does trigger for the IPv6 case ([u16; 8]):
@0009 v9 = load.i128 notrap v8
@0009 v10 = load.i128 notrap v7
@0009 v11 = icmp eq v9, v10
@0009 v12 = bint.i8 v11
Otherwise it lib_calls out to memcmp still (here for the [u16; 6] case):
@0009 v9 = iconst.i64 12
@0009 v10 = call fn2(v8, v7, v9)
@0009 v16 = iconst.i32 0
@0009 v11 = icmp eq v10, v16
@0009 v12 = bint.i8 v11
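For intuition, here is a rough Rust sketch of the two paths shown above. It is only an illustration, not the code from this PR: the helper names `eq_as_u128` and `eq_elementwise` are made up, and the second one stands in for the memcmp call rather than reproducing it.

```rust
use std::mem::transmute;

/// Hypothetical helper: compare two `[u16; 8]` values as single 128-bit
/// integers, roughly what the `load.i128` + `icmp eq` sequence does.
fn eq_as_u128(a: [u16; 8], b: [u16; 8]) -> bool {
    // SAFETY: `[u16; 8]` and `u128` are both 16 bytes with no padding,
    // and every bit pattern is a valid `u128`.
    let (a, b): (u128, u128) = unsafe { (transmute(a), transmute(b)) };
    a == b
}

/// Hypothetical stand-in for the fallback path: `[u16; 6]` is 12 bytes,
/// which matches no integer type, so the contents are compared piecewise
/// (the backend output above calls memcmp with a length of 12 instead).
fn eq_elementwise(a: &[u16; 6], b: &[u16; 6]) -> bool {
    a.iter().zip(b.iter()).all(|(x, y)| x == y)
}
```

Both helpers should agree with the built-in `==` on arrays; the only difference is how many instructions it takes to get the answer.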
Added another nice codegen test example.
pub fn array_eq_zero(x: [u16; 8]) -> bool { x == [0; 8] }
Before:
%x = alloca i128, align 8
store i128 %0, i128* %x, align 8
%_11.i.i.i = bitcast i128* %x to i8*
%bcmp.i.i.i = call i32 @bcmp(i8* nonnull dereferenceable(16) %_11.i.i.i, i8* nonnull dereferenceable(16) getelementptr inbounds (<{ [16 x i8] }>, <{ [16 x i8] }>* @alloc2, i64 0, i32 0, i64 0), i64 16) #2, !alias.scope !2
%1 = icmp eq i32 %bcmp.i.i.i, 0
ret i1 %1
sub rsp, 16
mov qword ptr [rsp + 8], rsi
mov qword ptr [rsp], rdi
or rdi, rsi
sete al
add rsp, 16
ret
After:
%1 = icmp eq i128 %0, 0
ret i1 %1