[LLVMdev] SIMD for sdiv <2 x i64> (original) (raw)
zhi chen zchenhn at gmail.com
Thu Jul 23 23:06:49 PDT 2015
- Previous message: [LLVMdev] sepertate the code by the macro
- Next message: [LLVMdev] SIMD for sdiv <2 x i64>
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
It seems that that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing also happens to zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas to optimize these instructions? Thanks.
%sub.ptr.sub.i6.i.i.i.i = sub <2 x i64> %sub.ptr.lhs.cast.i4.i.i.i.i, %sub.ptr.rhs.cast.i5.i.i.i.i %sub.ptr.div.i7.i.i.i.i = sdiv <2 x i64> %sub.ptr.sub.i6.i.i.i.i, <i64 24, i64 24>
Assembly: vpsubq %xmm6, %xmm5, %xmm5 vmovq %xmm5, %rax movabsq $3074457345618258603, %rbx # imm = 0x2AAAAAAAAAAAAAAB
imulq %rbx
movq %rdx, %rcx
movq %rcx, %rax
shrq $63, %rax
shrq $2, %rcx
addl %eax, %ecx
vpextrq $1, %xmm5, %rax
imulq %rbx
movq %rdx, %rax
shrq $63, %rax
shrq $2, %rdx
addl %eax, %edx
movslq %edx, %rax
vmovq %rax, %xmm5
movslq %ecx, %rax
vmovq %rax, %xmm6
vpunpcklqdq %xmm5, %xmm6, %xmm5 # xmm5 = xmm6[0],xmm5[0]
-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150723/4c853c43/attachment.html>
- Previous message: [LLVMdev] sepertate the code by the macro
- Next message: [LLVMdev] SIMD for sdiv <2 x i64>
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]