Between opt-level=1 and opt-level=2, LLVM deoptimizes the output of ptr::addr if ptr::addr is #[inline] · Issue #103285 · rust-lang/rust
godbolt demo: https://godbolt.org/z/51o3hYjvv
This code:
    #[inline]
    fn inline_addr<T>(ptr: *const T) -> usize {
        unsafe { core::mem::transmute(ptr) }
    }

    pub fn inline_is_nonoverlapping(src: *const u8, dst: *const u8) -> usize {
        let src_usize = inline_addr(src);
        let dst_usize = inline_addr(dst);
        if src_usize > dst_usize {
            let _ = src_usize - dst_usize;
        }
        src_usize
    }
At `opt-level=2`, it compiles to this:
    example::inline_is_nonoverlapping:
            cmp     rdi, rsi
            jbe     .LBB0_3
            jb      .LBB0_2
    .LBB0_3:
            mov     rax, rdi
            ret
    .LBB0_2:
            push    rax
            lea     rdi, [rip + str.0]
            lea     rdx, [rip + .L__unnamed_1]
            mov     esi, 33
            call    qword ptr [rip + core::panicking::panic@GOTPCREL]
            ud2
Decreasing to `opt-level=1` or removing the `#[inline]` on the transmute wrapper function causes the panic path to go away.
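The panic path looks like the overflow check for `src_usize - dst_usize`: the string length 33 passed in `esi` matches "attempt to subtract with overflow", though that reading is an inference from the assembly rather than something confirmed here. That check is dead code, because the subtraction only runs when `src_usize > dst_usize`; the assembly reflects this, since the `jb` can never be taken once the preceding `jbe` has fallen through. A minimal sketch of the same guarded subtraction with the check spelled out (`guarded_diff` is a hypothetical name, not part of the report):

```rust
/// Hypothetical helper: the same guarded subtraction as in the report,
/// with the overflow check made explicit through `checked_sub`.
/// Returns `None` when the branch is not taken.
pub fn guarded_diff(src_usize: usize, dst_usize: usize) -> Option<usize> {
    if src_usize > dst_usize {
        // Inside this branch src_usize > dst_usize holds, so the
        // subtraction cannot underflow and `checked_sub` is always `Some`.
        src_usize.checked_sub(dst_usize)
    } else {
        None
    }
}

fn main() {
    assert_eq!(guarded_diff(10, 3), Some(7));
    assert_eq!(guarded_diff(3, 10), None);
    assert_eq!(guarded_diff(5, 5), None);
    assert_eq!(guarded_diff(usize::MAX, 0), Some(usize::MAX));
}
```

Since the guard already rules out underflow, an optimizer that sees both the comparison and the subtraction should be able to drop the failing case entirely, which is what happens at `opt-level=1`.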
As far as I can tell, this doesn't happen on stable, but it does happen on beta and nightly. This code pattern is used in the standard library here:
    pub(crate) fn is_nonoverlapping<T>(src: *const T, dst: *const T, count: usize) -> bool {
        let src_usize = src.addr();
        let dst_usize = dst.addr();
        let size = mem::size_of::<T>().checked_mul(count).unwrap();
        let diff = if src_usize > dst_usize { src_usize - dst_usize } else { dst_usize - src_usize };
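For comparison, here is a sketch of the same distance computation written with `usize::abs_diff`, which avoids the compare-then-subtract pattern at the source level. This is not the standard library's implementation: `is_nonoverlapping_sketch` is a hypothetical name, the pointers are converted with `as usize` instead of `addr()`, and the final `diff >= size` comparison is an assumption about how the excerpt above continues. Whether this changes the generated assembly on the affected toolchains is not verified here.

```rust
use core::mem;

/// Hypothetical variant of the excerpt above; not the std implementation.
pub fn is_nonoverlapping_sketch<T>(src: *const T, dst: *const T, count: usize) -> bool {
    let src_usize = src as usize;
    let dst_usize = dst as usize;
    // Total size in bytes of `count` elements; panics if it overflows.
    let size = mem::size_of::<T>().checked_mul(count).unwrap();
    // Absolute distance between the two addresses, without a branch
    // followed by a checked subtraction in the source.
    let diff = src_usize.abs_diff(dst_usize);
    // Assumed final step: the regions are disjoint when the distance
    // is at least the region size.
    diff >= size
}

fn main() {
    let buf = [0u32; 8];
    let base = buf.as_ptr();
    // Four elements (16 bytes) past `base`, still inside `buf`.
    let later = unsafe { base.add(4) };
    assert!(is_nonoverlapping_sketch(base, later, 4));
    assert!(is_nonoverlapping_sketch(later, base, 4));
    assert!(!is_nonoverlapping_sketch(base, later, 5));
}
```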
@rustbot label +A-LLVM +regression-from-stable-to-beta