add inline to copy_within by PSeitz · Pull Request #148345 · rust-lang/rust

Does the assembly contain calls to copy_within or to memmove? #[inline] will only help in the first case, not the second. In general, if LLVM decides to use memmove, that's mostly the right call – at least on GNU/Linux, memmove takes advantage of the SIMD features of the current CPU, which can end up being much faster than an unrolled implementation that can only use the CPU features of the target.

It contains calls to memmove. Inlining also helps in this case, because without inlining LLVM loses the fixed-size information.
LLVM will replace calls to memmove with inline instructions in trivial cases, since e.g. an 8-byte copy done inline is already faster than the libc function call.

Here's an example of both cases; the fixed 18-byte version does not contain a call to memmove:
https://godbolt.org/z/1xz7Gevbd
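
For readers who don't want to open the link, here is a minimal sketch of the two cases. It is not the code behind the godbolt link: the function names `shift_dynamic` and `shift_fixed` and the 32-byte buffer are made up for illustration. The expectation, assuming an optimized build in which `copy_within` gets inlined, is that the dynamic version keeps the memmove call while the fixed 18-byte version can be lowered to a few loads and stores. Actual codegen depends on compiler version and target.

```rust
/// Length is only known at run time, so LLVM has no size to specialize on.
/// Expected (not guaranteed) to compile to a call to memmove.
/// Panics if `len + 1 > buf.len()`, as `copy_within` does for out-of-bounds ranges.
pub fn shift_dynamic(buf: &mut [u8], len: usize) {
    buf.copy_within(0..len, 1);
}

/// Length is the constant 18. Once `copy_within` is inlined, LLVM sees a
/// fixed-size overlapping copy and may emit plain loads/stores instead of
/// calling memmove. The 32-byte array type is a hypothetical choice.
pub fn shift_fixed(buf: &mut [u8; 32]) {
    buf.copy_within(0..18, 1);
}
```

Both functions do the same thing for `len == 18`; only the amount of information available to the optimizer differs.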