Possible missing optimization targeting i686-pc-windows
September 13, 2025, 9:18pm 1
define i32 @grug(ptr nofree nonnull noundef %fnptr) local_unnamed_addr {
  %retp = alloca i32
  call void %fnptr(ptr %retp)
  %retv = load i32, ptr %retp
  ret i32 %retv
}
compiling this with -O3 --target=i686-pc-windows turns it into
_grug:
        push    eax
        mov     eax, esp
        push    eax
        call    dword ptr [esp + 12]
        add     esp, 4
        mov     eax, dword ptr [esp]
        pop     ecx
        ret
afaict the second and third instructions (mov eax, esp followed by push eax) can be replaced with a single push esp, no? unless there's a very small performance benefit to the two-instruction form that i am not aware of
godbolt:
That’s probably a missed optimization, yes; please file a bug on GitHub. (I’m pretty sure push reads the unadjusted value of esp, i.e. the value before the decrement? Would need to check the x86 manual.)
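For reference, here is a hand-written sketch of what the suggested output would look like. It is not compiler output, and it assumes push esp stores the value esp held before the decrement, which as far as I know is what the Intel manual describes for the 80286 and later:

```
_grug:
        push    eax                     ; reserve 4 bytes for %retp (the value of eax is irrelevant)
        push    esp                     ; pass the address of %retp: esp as it was before this push
        call    dword ptr [esp + 12]    ; %fnptr: the incoming argument, now 12 bytes above esp
        add     esp, 4                  ; drop the outgoing argument
        mov     eax, dword ptr [esp]    ; load %retv from the slot
        pop     ecx                     ; release the slot
        ret
```

The stack layout at the call is the same as in the original sequence; the only difference is that eax no longer holds a copy of the pointer, which should not matter since eax is caller-saved and gets overwritten by the load after the call anyway.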
hansw2000 September 15, 2025, 5:51pm 3
I think this (mov to push conversion) is handled by X86CallFrameOptimization.cpp, and I suspect it bails out when it sees the stack pointer as an operand.