Possible missing optimization targeting i686-pc-windows

September 13, 2025, 9:18pm 1

define i32 @grug(ptr nofree nonnull noundef %fnptr) local_unnamed_addr {
   %retp = alloca i32

   call void %fnptr(ptr %retp)

   %retv = load i32, ptr %retp
   ret i32 %retv
}

Compiled with -O3 --target=i686-pc-windows, this turns into:

_grug:
  push eax
  mov eax, esp
  push eax
  call dword ptr [esp + 12]
  add esp, 4
  mov eax, dword ptr [esp]
  pop ecx
  ret

AFAICT the second and third instructions (mov eax, esp followed by push eax) can be replaced with a single push esp, no? Unless there's a very small performance benefit I am not aware of.

godbolt:

That's probably a missed optimization, yes; please file a bug on GitHub. (I'm pretty sure push reads the unadjusted value of esp, but I would need to check the x86 manual to confirm.)
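
To make the push esp point concrete, here is a rough sketch in C of a toy stack model (not real x86 code). It assumes the documented behaviour on 286 and later processors, where push esp stores the value esp had before the decrement. The Cpu struct and the helpers push, fresh, current_sequence, proposed_sequence and same_stack are all hypothetical names made up for this illustration. Under that assumption both sequences leave the same stack behind; only eax differs, and it is overwritten by the later mov eax, dword ptr [esp] anyway.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model of the relevant 32-bit state (hypothetical, for illustration). */
typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, indexed by byte address / 4 */
    uint32_t esp;              /* byte address of the current top of stack */
    uint32_t eax;
} Cpu;

/* Model of push: the operand value is read before esp is decremented.
 * This is the assumption under test for push esp. */
static void push(Cpu *c, uint32_t value) {
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* Empty stack with an arbitrary value in eax. */
static Cpu fresh(void) {
    Cpu c = {{0}, STACK_WORDS * 4, 0xdeadbeefu};
    return c;
}

/* Sequence emitted today: push eax; mov eax, esp; push eax */
static Cpu current_sequence(void) {
    Cpu c = fresh();
    push(&c, c.eax);   /* push eax      (reserves the slot for %retp) */
    c.eax = c.esp;     /* mov eax, esp  (address of that slot) */
    push(&c, c.eax);   /* push eax      (pass the slot address as argument) */
    return c;
}

/* Proposed sequence: push eax; push esp */
static Cpu proposed_sequence(void) {
    Cpu c = fresh();
    push(&c, c.eax);   /* push eax */
    push(&c, c.esp);   /* push esp: c.esp is evaluated before push() decrements it */
    return c;
}

/* Returns 1 when both sequences leave the same esp and stack contents. */
static int same_stack(void) {
    Cpu a = current_sequence();
    Cpu b = proposed_sequence();
    if (a.esp != b.esp) return 0;
    for (int i = 0; i < STACK_WORDS; i++)
        if (a.mem[i] != b.mem[i]) return 0;
    return 1;
}
```

In this model the argument slot holds the address of the reserved slot in both cases, so the callee would see the same pointer either way. Whether the shorter encoding is also at least as fast on real hardware is a separate question that this sketch does not answer.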

hansw2000 September 15, 2025, 5:51pm 3

I think this (mov to push conversion) is handled by X86CallFrameOptimization.cpp, and I suspect it bails out when it sees the stack pointer as an operand.