Use a faster allocation size check in slice::from_raw_parts by saethlin · Pull Request #103287 · rust-lang/rust

I've been looking through the codegen changes that result from turning on the standard library's debug assertions. The previous check here uses saturating arithmetic, which in my experience sometimes causes LLVM to fail to optimize the code around the saturating operation.

Here is a demo of the codegen difference: https://godbolt.org/z/WMEqrjajW
Before:

    example::len_check_old:
            mov     rax, rdi
            mov     ecx, 3
            mul     rcx
            setno   cl
            test    rax, rax
            setns   al
            and     al, cl
            ret

    example::len_check_old:
            mov     rax, rdi
            mov     ecx, 8
            mul     rcx
            setno   cl
            test    rax, rax
            setns   al
            and     al, cl
            ret

After:

    example::len_check_new:
            movabs  rax, 3074457345618258603
            cmp     rdi, rax
            setb    al
            ret

    example::len_check_new:
            shr     rdi, 60
            sete    al
            ret
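
For readers who don't want to open the godbolt link, here is a rough sketch of the two shapes of check being compared. This is a reconstruction from the assembly above, not the code in this PR: the function bodies are my assumption, and only the names `len_check_old` and `len_check_new` come from the demo. The old form multiplies with saturation and then compares. The new form compares `len` against a maximum that depends only on the element size, so it can fold to a constant for each `T`.

```rust
use std::mem::size_of;

/// Assumed shape of the old check: saturating multiply, then compare
/// the total byte size against isize::MAX.
pub fn len_check_old<T>(len: usize) -> bool {
    size_of::<T>().saturating_mul(len) <= isize::MAX as usize
}

/// Assumed shape of the new check: compare `len` against the largest
/// length whose total byte size still fits in isize::MAX.
pub fn len_check_new<T>(len: usize) -> bool {
    let size = size_of::<T>();
    // Zero-sized types can never exceed the byte limit, so any length passes.
    let max_len = if size == 0 {
        usize::MAX
    } else {
        isize::MAX as usize / size
    };
    len <= max_len
}
```

In this sketch the two functions agree for every `len`, including zero-sized `T`. The constants in the "After" listing are consistent with this form: for a 3-byte element the bound is `isize::MAX / 3`, and for an 8-byte element it reduces to checking that the top four bits of `len` are clear.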

Running rustc-perf locally, this looks like up to a 4.5% improvement when debug-assertions-std = true.

Thanks @LegionMammal978 (I think that's you?) for turning my idea into a much cleaner implementation.

r? @thomcc