Clang's -target option is supposed to take a CPU type and an operating system, so "-target i386" gives it no operating system. That prevents frame pointer elimination, which is why ebp is being updated. If you pass "-target i386-linux" you get slightly better code.

The division/remainder operations are turned into library calls as part of instruction selection. This code is somewhat independent of how other calls are handled. We probably don't support tail calls in it. Is it really realistic that a user would have a non-inlined function that contains just a division? Why should we optimize for that case?
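For reference, here is a minimal source-level sketch of the tail-call shape the original post is asking for. It assumes a hypothetical helper `soft_div64` standing in for the runtime routine (`__divdi3` in the thread); the name and the GCC/Clang-specific `noinline` attribute are illustrative, not part of the original report or of any library.

```c
/* Hypothetical stand-in for the runtime helper (__divdi3 in the thread).
   The noinline attribute (a GCC/Clang extension) keeps it an out-of-line
   call, the way the library routine would be. */
__attribute__((noinline))
long long soft_div64(long long a, long long b)
{
    return a / b;
}

/* Same signature as the callee, and the call is the last thing the
   function does, so it sits in tail position. A compiler is then free
   to reuse the incoming stack arguments and jump to the callee rather
   than pushing copies and calling it. */
long long div_via_helper(long long foo, long long bar)
{
    return soft_div64(foo, bar);
}
```

Compiling this with "-O3 -target i386-linux -S" lets you compare how an explicit call in tail position is handled against the library call that instruction selection creates for a plain `foo / bar`.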

~Craig

On Sat, Dec 1, 2018 at 9:37 AM Stefan Kanthak via llvm-dev <llvm-dev@lists.llvm.org> wrote:

Compile the following functions with "-O3 -target i386"
(see <https://godbolt.org/z/VmKlXL>):

long long div(long long foo, long long bar)
{
    return foo / bar;
}

On the left the generated code; on the right the expected,
properly optimised code:

div:                                    # @div
        push    ebp                     |
        mov     ebp, esp                |
        push    dword ptr [ebp + 20]    |
        push    dword ptr [ebp + 16]    |
        push    dword ptr [ebp + 12]    |
        push    dword ptr [ebp + 8]     |
        call    __divdi3                |   jmp     __divdi3
        add     esp, 16                 |
        pop     ebp                     |
        ret                             |

long long mod(long long foo, long long bar)
{
    return foo % bar;
}

mod:                                    # @mod
        push    ebp                     |
        mov     ebp, esp                |
        push    dword ptr [ebp + 20]    |
        push    dword ptr [ebp + 16]    |
        push    dword ptr [ebp + 12]    |
        push    dword ptr [ebp + 8]     |
        call    __moddi3                |   jmp     __moddi3
        add     esp, 16                 |
        pop     ebp                     |
        ret                             |

long long mul(long long foo, long long bar)
{
    return foo * bar;
}

mul:                                    # @mul
        push    ebp
        mov     ebp, esp
        push    esi
        mov     ecx, dword ptr [ebp + 16]
        mov     esi, dword ptr [ebp + 8]
        mov     eax, ecx
        imul    ecx, dword ptr [ebp + 12]
        mul     esi
        imul    esi, dword ptr [ebp + 20]
        add     edx, ecx
        add     edx, esi
        pop     esi
        pop     ebp
        ret
_______________________________________________
LLVM Developers mailing list
llvm-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev