[Doc] Proposal for vector predication · llvm/llvm-project@c49b9e0 (original) (raw)

==========================
Vector Predication Roadmap
==========================

.. contents:: Table of Contents
  :depth: 3
  :local:

Motivation
==========

This proposal defines a roadmap towards native vector predication in LLVM,
specifically for vector instructions with a mask and/or an explicit vector
length. LLVM currently has no target-independent means to model predicated
vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V
extension and NEC SX-Aurora. Only some predicated vector operations, such as
masked loads and stores, are available through intrinsics [MaskedIR]_.

The Vector Predication (VP) extension is a concrete RFC and prototype
implementation to achieve native vector predication in LLVM. The VP prototype
and all related discussions can be found in the VP patch on Phabricator
[VPRFC]_.

Roadmap
=======

1. IR-level VP intrinsics
-------------------------

- There is a consensus on the semantics/instruction set of VP.
- VP intrinsics and attributes are available on IR level.
- TTI has capability flags for VP (``supportsVP()``?,
  ``haveActiveVectorLength()``?).

Result: VP usable for IR-level vectorizers (LV, VPlan, RegionVectorizer),
potential integration in Clang with builtins.

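To make the IR-level stage concrete, the following is a minimal sketch of what
a call to a VP intrinsic could look like. It assumes the operand order used in
the VP RFC [VPRFC]_ (data operands first, then the mask, then the explicit
vector length as an ``i32``). The function name ``@div_first_n`` and the
``<8 x double>`` type are chosen purely for illustration.

.. code-block:: llvm

  ; Assumed declaration, following the form proposed in the VP RFC.
  declare <8 x double> @llvm.vp.fdiv.v8f64(<8 x double>, <8 x double>,
                                           <8 x i1>, i32)

  ; Hypothetical example function.
  define <8 x double> @div_first_n(<8 x double> %a, <8 x double> %b,
                                   <8 x i1> %mask, i32 %n) {
    ; Lane i is computed only if %mask[i] is set and i < %n.
    %q = call <8 x double> @llvm.vp.fdiv.v8f64(<8 x double> %a,
                                               <8 x double> %b,
                                               <8 x i1> %mask, i32 %n)
    ret <8 x double> %q
  }

A vectorizer would presumably emit such a call only when the TTI capability
flags report that the target benefits from VP.
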
2. CodeGen support
------------------

- VP intrinsics translate to first-class SDNodes
  (e.g. ``llvm.vp.fdiv.*`` -> ``vp_fdiv``).
- VP legalization (legalize explicit vector length to mask (AVX512), legalize VP
  SDNodes to pre-existing ones (SSE, NEON)).

Result: Backend development based on VP SDNodes.

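Legalizing an explicit vector length to a mask, as needed for a mask-only
target such as AVX512, amounts to disabling every lane whose index is not below
the vector length. The sketch below expresses that idea in plain IR for
readability. The actual legalization is expected to operate on VP SDNodes, and
the function name ``@fold_evl_into_mask`` is hypothetical.

.. code-block:: llvm

  ; Hypothetical helper: combine a mask with an explicit vector length.
  define <8 x i1> @fold_evl_into_mask(<8 x i1> %mask, i32 %evl) {
    ; Broadcast the vector length to all lanes.
    %evl.ins = insertelement <8 x i32> poison, i32 %evl, i32 0
    %evl.splat = shufflevector <8 x i32> %evl.ins, <8 x i32> poison,
                               <8 x i32> zeroinitializer
    ; A lane is within the vector length if its index is below %evl.
    %lane.active = icmp ult <8 x i32> <i32 0, i32 1, i32 2, i32 3,
                                       i32 4, i32 5, i32 6, i32 7>, %evl.splat
    ; Keep only lanes enabled by both the original mask and the length.
    %new.mask = and <8 x i1> %mask, %lane.active
    ret <8 x i1> %new.mask
  }

With the vector length folded into the mask this way, the operation can be
issued as an ordinary masked instruction over the full vector width.
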
3. Lift InstSimplify/InstCombine/DAGCombiner to VP
--------------------------------------------------

- Introduce ``PredicatedInstruction``, ``PredicatedBinaryOperator``, ... helper
  classes that match standard vector IR and VP intrinsics.
- Add a matcher context to PatternMatch and context-aware IR Builder APIs.
- Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular
  vector instructions.
- Incrementally lift InstCombine/InstSimplify to operate on VP as well as
  regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*
-------------------------------------------------------

- Modernize ``llvm.masked.*`` / ``llvm.experimental.reduce.*`` by translating
  to VP.
- DCE transitional APIs.

Result: VP has superseded earlier vector intrinsics.

5. Predicated IR Instructions
-----------------------------

- Vector instructions have an optional mask and vector length parameter. These
  lower to VP SDNodes (from Stage 2).
- Phase out VP intrinsics, only keeping those that are not equivalent to
  vectorized scalar instructions (reduce, shuffles, ...).
- InstCombine/InstSimplify expect predication in regular Instructions (Stage 3
  has laid the groundwork).

Result: Native vector predication in IR.

References
==========

.. [MaskedIR] ``llvm.masked.*`` intrinsics,
   https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics

.. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM,
   https://reviews.llvm.org/D57504