[LLVMdev] Autovectorization questions
Raul Silvera rsilvera at google.com
Wed Mar 12 17:26:43 PDT 2014
- Previous message: [LLVMdev] Autovectorization questions
- Next message: [LLVMdev] Autovectorization questions
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Even without wrapping around the end of the address space, without restrict you still have to worry about A and B overlapping in interesting ways. Will LLVM do some runtime dependence checks to rule out such potential overlap? Just curious...
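For readers unfamiliar with the idea, a runtime dependence check can be written by hand as loop versioning. The following is only a sketch of the general technique, not a description of what LLVM emits: foo_versioned and ranges_overlap are hypothetical names, and the code assumes n >= 0 and k >= 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the byte ranges [a, a + a_bytes) and
 * [b, b + b_bytes) overlap. Comparing pointers through uintptr_t is
 * implementation-defined, but behaves as expected on flat address spaces. */
static int ranges_overlap(const void *a, size_t a_bytes,
                          const void *b, size_t b_bytes) {
    uintptr_t a0 = (uintptr_t)a;
    uintptr_t b0 = (uintptr_t)b;
    return a0 < b0 + b_bytes && b0 < a0 + a_bytes;
}

/* Computes A[i] += B[i*k] for i in [0, n). Assumes n >= 0 and k >= 0. */
void foo_versioned(int *A, int *B, int n, int k) {
    if (n <= 0)
        return;

    /* A is touched at elements 0 .. n-1, B at elements 0 .. (n-1)*k. */
    size_t a_bytes = (size_t)n * sizeof(int);
    size_t b_bytes = ((size_t)(n - 1) * (size_t)k + 1) * sizeof(int);

    if (!ranges_overlap(A, a_bytes, B, b_bytes)) {
        /* Fast path: the check proved the accessed ranges are disjoint,
         * so the restrict qualifiers are a promise we can keep. */
        int *restrict a = A;
        const int *restrict b = B;
        for (int i = 0; i < n; ++i)
            a[i] += b[(size_t)i * (size_t)k];
    } else {
        /* Slow path: possible overlap, keep the original scalar order. */
        for (int i = 0; i < n; ++i)
            A[i] += B[(size_t)i * (size_t)k];
    }
}
```

Both paths compute the same result; the check only decides whether the compiler may treat the loads and stores as independent.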
On Wed, Mar 12, 2014 at 5:01 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

Zinovy,

to clarify: the code is vectorizable. But LLVM currently fails to prove it is.

On Mar 12, 2014, at 3:50 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:

> In order to vectorize code like this LLVM needs to prove that “A[i*7]” does not wrap in the address space. It fails to do so and so LLVM doesn’t vectorize this loop even if we try to force it.
>
> The following loop will be vectorized if we force it:
>
> int foo(int * A, int * B, int n, int k) {
>   for (int i = 0; i < 1024; ++i)
>     A[i] += B[i*k];
> }
>
> So will this loop:
>
> int foo(int * restrict A, int * restrict B, int n, int k) {
>   for (int i = 0; i < n; ++i)
>     A[i] += B[i*k];
> }
>
> I will update the example.
>
> Thanks,
> Arnold
>
> On Mar 12, 2014, at 1:54 PM, Nadav Rotem <nrotem at apple.com> wrote:
>
>> Hi Zinovy,
>>
>> The loop vectorizer probably decided that it was not profitable to vectorize the function. You can force the vectorization of the function by setting a low threshold.
>>
>> Thanks,
>> Nadav
>>
>> On Mar 12, 2014, at 3:34 AM, Zinovy Nis <zinovy.nis at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm reading "http://llvm.org/docs/Vectorizers.html" and have a few questions. I hope someone has answers to them.
>>>
>>> "The Loop Vectorizer can vectorize code that becomes a sequence of scalar instructions that scatter/gathers memory."
>>> (http://llvm.org/docs/Vectorizers.html#scatter-gather)
>>>
>>> int foo(int *A, int *B, int n, int k) {
>>>   for (int i = 0; i < n; ++i)
>>>     A[i*7] += B[i*k];
>>> }
>>>
>>> I replaced "int *A"/"int *B" with "double *A"/"double *B" and then compiled the sample with
>>>
>>> $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S -fslp-vectorize-aggressive
>>>
>>> and the loop body looks like:
>>>
>>> .LBB12:                                 # %for.body
>>>                                         # =>This Inner Loop Header: Depth=1
>>>         cltq
>>>         vmovsd  (%rsi,%rax,8), %xmm0
>>>         movq    %r9, %r10
>>>         sarq    $32, %r10
>>>         vaddsd  (%rdi,%r10,8), %xmm0, %xmm0
>>>         vmovsd  %xmm0, (%rdi,%r10,8)
>>>         addq    %r8, %r9
>>>         addl    %ecx, %eax
>>>         decl    %edx
>>>         jne     .LBB12
>>>
>>> so vector instructions for scalars (vaddsd, vmovsd) were used in the loop and no real gather/scatter was emitted.
>>>
>>> The question is: why was this loop not vectorized? Is there a typo in the docs?
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
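Arnold's remark that LLVM must prove A[i*7] does not wrap can be illustrated with plain integer arithmetic. The sketch below is an assumption about the kind of wrap that matters, not a description of LLVM's actual analysis; index32 and index64 are hypothetical helpers, and the code assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper: the strided index i*7 computed in 32-bit unsigned
 * arithmetic, which wraps modulo 2^32. */
static uint32_t index32(uint32_t i) {
    return i * 7u;
}

/* The same index computed in 64-bit arithmetic, which cannot wrap for any
 * 32-bit value of i. */
static uint64_t index64(uint32_t i) {
    return (uint64_t)i * 7u;
}
```

For i = 613566756 both helpers return 4294967292, but for i = 613566757 index32 wraps around to 3 while index64 returns 4294967299. Once the index can wrap, consecutive iterations no longer touch addresses a fixed stride apart, so the accesses can no longer be treated as a simple strided sequence.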
--
Raúl E. Silvera