Between opt-level=1 and opt-level=2, LLVM deoptimizes the output of ptr::addr if ptr::addr is #[inline] · Issue #103285 · rust-lang/rust

godbolt demo: https://godbolt.org/z/51o3hYjvv

This code:

    #[inline]
    fn inline_addr<T>(ptr: *const T) -> usize {
        unsafe { core::mem::transmute(ptr) }
    }

    pub fn inline_is_nonoverlapping(src: *const u8, dst: *const u8) -> usize {
        let src_usize = inline_addr(src);
        let dst_usize = inline_addr(dst);
        if src_usize > dst_usize {
            let _ = src_usize - dst_usize;
        }
        src_usize
    }

At opt-level=2 compiles to this:

    example::inline_is_nonoverlapping:
            cmp     rdi, rsi
            jbe     .LBB0_3
            jb      .LBB0_2
    .LBB0_3:
            mov     rax, rdi
            ret
    .LBB0_2:
            push    rax
            lea     rdi, [rip + str.0]
            lea     rdx, [rip + .L__unnamed_1]
            mov     esi, 33
            call    qword ptr [rip + core::panicking::panic@GOTPCREL]
            ud2

Decreasing to opt-level=1 or removing the #[inline] on the transmute wrapper function causes the panic path to go away.
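For a side-by-side comparison, here is a minimal sketch that puts both variants in one file. `plain_addr` and `plain_is_nonoverlapping` are hypothetical names introduced only for this comparison; they are the same code with the `#[inline]` attribute dropped. The message length of 33 in the assembly matches "attempt to subtract with overflow", so this presumably needs overflow checks enabled (for example `-C opt-level=2 -C overflow-checks=on`; the exact flags used in the godbolt link are an assumption here).

```rust
use core::mem::transmute;

// Variant from the report: the transmute wrapper carries #[inline].
#[inline]
fn inline_addr<T>(ptr: *const T) -> usize {
    unsafe { transmute(ptr) }
}

// Hypothetical control variant: identical body, no #[inline] attribute.
fn plain_addr<T>(ptr: *const T) -> usize {
    unsafe { transmute(ptr) }
}

pub fn inline_is_nonoverlapping(src: *const u8, dst: *const u8) -> usize {
    let src_usize = inline_addr(src);
    let dst_usize = inline_addr(dst);
    if src_usize > dst_usize {
        // Guarded by the comparison above, so this subtraction cannot overflow.
        let _ = src_usize - dst_usize;
    }
    src_usize
}

pub fn plain_is_nonoverlapping(src: *const u8, dst: *const u8) -> usize {
    let src_usize = plain_addr(src);
    let dst_usize = plain_addr(dst);
    if src_usize > dst_usize {
        // Same guarded subtraction as in the #[inline] variant.
        let _ = src_usize - dst_usize;
    }
    src_usize
}
```

Both functions return the address of `src` for any input, so any difference in the generated assembly comes from optimization, not from behavior.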

As far as I can tell, this doesn't happen on stable but it happens on beta and nightly. This code pattern is used in the standard library here:

    pub(crate) fn is_nonoverlapping<T>(src: *const T, dst: *const T, count: usize) -> bool {
        let src_usize = src.addr();
        let dst_usize = dst.addr();
        let size = mem::size_of::<T>().checked_mul(count).unwrap();
        let diff = if src_usize > dst_usize { src_usize - dst_usize } else { dst_usize - src_usize };
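To experiment with the same compare-then-subtract pattern outside the standard library, a standalone sketch might look like the following. The name `ranges_are_disjoint`, the `as usize` casts, and the use of `usize::abs_diff` in place of the branch are assumptions made for illustration; this is not the standard library's implementation, and whether it avoids the codegen problem described here has not been checked.

```rust
use core::mem::size_of;

/// Hypothetical helper: returns true when `count` elements of `T` starting
/// at `src` and at `dst` occupy byte ranges that do not overlap.
pub fn ranges_are_disjoint<T>(src: *const T, dst: *const T, count: usize) -> bool {
    let src_usize = src as usize;
    let dst_usize = dst as usize;
    // Total length of each range in bytes; panics if it overflows usize.
    let size = size_of::<T>()
        .checked_mul(count)
        .expect("byte length overflows usize");
    // Absolute distance between the two start addresses, computed without
    // a subtraction that could be overflow-checked.
    let diff = src_usize.abs_diff(dst_usize);
    // The ranges are disjoint when the starts are at least `size` bytes apart.
    diff >= size
}
```

`usize::abs_diff` expresses the absolute difference directly, so there is no `src_usize - dst_usize` for the optimizer to guard with a panic path.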

@rustbot label +A-LLVM +regression-from-stable-to-beta