[LLVMdev] SIMD for sdiv <2 x i64>

zhi chen zchenhn at gmail.com
Thu Jul 23 23:06:49 PDT 2015


It seems hard to vectorize int64 operations in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no equivalent AVX/AVX2 instructions for int64? The same thing also happens with zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas for optimizing these instructions? Thanks.

%sub.ptr.sub.i6.i.i.i.i = sub <2 x i64> %sub.ptr.lhs.cast.i4.i.i.i.i, %sub.ptr.rhs.cast.i5.i.i.i.i
%sub.ptr.div.i7.i.i.i.i = sdiv <2 x i64> %sub.ptr.sub.i6.i.i.i.i, <i64 24, i64 24>

Assembly:
vpsubq  %xmm6, %xmm5, %xmm5
vmovq   %xmm5, %rax
movabsq $3074457345618258603, %rbx # imm = 0x2AAAAAAAAAAAAAAB
imulq   %rbx
movq    %rdx, %rcx
movq    %rcx, %rax
shrq    $63, %rax
shrq    $2, %rcx
addl    %eax, %ecx
vpextrq $1, %xmm5, %rax
imulq   %rbx
movq    %rdx, %rax
shrq    $63, %rax
shrq    $2, %rdx
addl    %eax, %edx
movslq  %edx, %rax
vmovq   %rax, %xmm5
movslq  %ecx, %rax
vmovq   %rax, %xmm6
vpunpcklqdq %xmm5, %xmm6, %xmm5 # xmm5 = xmm6[0],xmm5[0]
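
For reference, each lane in the listing above appears to go through the usual multiply-by-reciprocal replacement for a signed division by a constant: take the high 64 bits of the product with 0x2AAAAAAAAAAAAAAB, shift right by 2, then add the sign bit. Below is a minimal C sketch of that scalar technique, not the code LLVM emits. The names sdiv24_magic and sdiv24_selfcheck are made up for illustration, __int128 is a GCC/Clang extension, and the sketch assumes right shifts of negative values are arithmetic (true for GCC and Clang on x86-64). It keeps full 64-bit width and does not reproduce the 32-bit addl/movslq tail seen in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed x / 24 without a divide instruction.
 * 0x2AAAAAAAAAAAAAAB is the constant loaded by movabsq in the listing;
 * it equals (2^64 + 2) / 6, so mulhi(x, magic) approximates x / 6 and
 * the extra shift by 2 completes the division by 24. */
static int64_t sdiv24_magic(int64_t x) {
    const int64_t magic = 0x2AAAAAAAAAAAAAABLL;

    /* High 64 bits of the signed 128-bit product (what imulq leaves in %rdx).
     * Assumes >> on a negative __int128 is an arithmetic shift. */
    int64_t hi = (int64_t)(((__int128)x * magic) >> 64);

    /* Arithmetic shift rounds toward negative infinity ... */
    int64_t q = hi >> 2;

    /* ... so add the sign bit to round toward zero, as sdiv requires. */
    q += (int64_t)((uint64_t)hi >> 63);
    return q;
}

/* Hypothetical self-check against the compiler's own division. */
static void sdiv24_selfcheck(void) {
    for (int64_t x = -1000; x <= 1000; ++x)
        assert(sdiv24_magic(x) == x / 24);
}
```

Calling sdiv24_selfcheck() compares the sketch with ordinary division over a small range. Applying sdiv24_magic to each of the two lanes separately is, in effect, what the extract/imulq/re-pack sequence above does.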
