On Tue, Oct 16, 2018 at 11:12 AM Chris Lattner via cfe-dev <cfe-dev@lists.llvm.org> wrote:
On Oct 10, 2018, at 11:09 PM, Adam Nemet via cfe-dev <cfe-dev@lists.llvm.org> wrote:
Hi,

We are proposing first-class type support for a new matrix type.

Interesting! Here are some thoughts; I’m sorry, but I haven’t read the responses downthread.

This is a natural extension of the current vector type with an extra dimension.
For example, this is what the IR for a matrix multiply would look like for a 4x4 matrix with element type float:

%0 = load <4 x 4 x float>, <4 x 4 x float>* %a, align 16
%1 = load <4 x 4 x float>, <4 x 4 x float>* %b, align 16
%2 = call <4 x 4 x float> @llvm.matrix.multiply.m4_4f32.m4_4f32.m4_4f32(<4 x 4 x float> %0, <4 x 4 x float> %1)
store <4 x 4 x float> %2, <4 x 4 x float>* %c, align 16

LLVM already has a pretty general vector type (arbitrary number of elements). I’m aware of hardware that has rectangular vectors, e.g. NVIDIA Tensor Cores; Google has a proprietary in-house design with non-square vector registers, etc.

Currently we support element-wise binary operations, matrix multiply, matrix-scalar multiply, matrix transpose, extract/insert of an element. Besides the regular full-matrix load and store, we also support loading and storing a matrix as a submatrix of a larger matrix in memory. We are also planning to implement vector-extract/insert and matrix-vector multiply.
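
For concreteness, element extraction and a strided submatrix load on the proposed type might look like the following (illustrative only: these intrinsic names and signatures are extrapolated from the multiply example above and are not spelled out in the proposal text):

%elt = call float @llvm.matrix.extract.m4_4f32(<4 x 4 x float> %m, i32 1, i32 2) ; element (1, 2)
; load a 4x4 submatrix out of a larger matrix in memory; %stride is the
; element distance between consecutive rows of the enclosing matrix
%sub = call <4 x 4 x float> @llvm.matrix.load.strided.m4_4f32(float* %p, i32 %stride)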

All of these are currently implemented as intrinsics. Where applicable we also plan to support these operations with native IR instructions (e.g. add/fadd).
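
For example, an element-wise add would then be expressible directly with the existing instruction rather than an intrinsic (a sketch, assuming fadd is extended to the proposed type):

%sum = fadd <4 x 4 x float> %0, %1 ; element-wise single-precision add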

Ok. Makes sense, I agree that supporting the existing pointwise vector operations makes sense.

These are exposed in clang via builtins. E.g. the above operations look like this in C/C++:

typedef float mf4x4_t __attribute__((matrix_type(4, 4)));

mf4x4_t multiply(mf4x4_t a, mf4x4_t b) {
  return __builtin_matrix_multiply(a, b);
}

I’d recommend splitting the clang discussion from the LLVM discussion, they are completely different tradeoffs involved. I’ll focus on the LLVM IR side of things.


** Benefits **

Having matrices represented as IR values allows for the usual algebraic and redundancy optimizations. But most importantly, because SSA values carry no memory aliasing concerns, we can guarantee vectorization to target-specific vectors.

Right, it is basically the same benefit as having a vector type. You also get the ability to have specific alignments etc.

I think there are several options in the design space here:

1. Do nothing to the type system, but just use the existing vector types (<16 x float> in this case) with a new set of operations.
2. Introduce a “matrix” concept and associated operations.
3. Introduce N-dimensional vector registers and associated operations.


Personally, I’d be in favor of going with #1 followed by #3 followed distantly by #2.

FWIW, I strongly prefer #1 to the other options.

The reason I’m opposed to a matrix *type* is that this is far too specific a concept to put into LLVM. We don’t even have signedness of integers in the type system: the instruction set is the major load-bearing part of the IR design, and the instruction set is extensible through intrinsics.

Strongly agree.
Arguing in favor of #1: AFAICT, you only need to add the new intrinsics to do matmul etc. You could just define them to take 1D vectors but apply math to them that interprets them as a 2D space. This is completely an IR-level modeling issue, and would be a very non-invasive patch. You’re literally just adding a few intrinsics. All the pointwise operations and insert/extract/shuffles will “just work”. The frontend handles mapping 2D indices to 1D indices.
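
As a sketch, the multiply example from the proposal would look like this over flat vectors (the intrinsic mangling and the trailing dimension arguments are assumptions; the point is only that <16 x float> plus explicit dimensions carries the same information as <4 x 4 x float>):

%0 = load <16 x float>, <16 x float>* %a, align 16
%1 = load <16 x float>, <16 x float>* %b, align 16
; the 4x4 shapes are passed as explicit arguments since the type no longer encodes them
%2 = call <16 x float> @llvm.matrix.multiply.v16f32(<16 x float> %0, <16 x float> %1, i32 4, i32 4, i32 4)
store <16 x float> %2, <16 x float>* %c, align 16

The 2D-to-1D index mapping is plain arithmetic: with a row-major layout, element (row, col) of a 4x4 matrix sits at flat index row*4 + col, so extracting element (2, 3) is just:

%e = extractelement <16 x float> %2, i32 11 ; 2*4 + 3 = 11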

Even better, it makes it easy to support interesting row-major and col-major style operations w/o further diversification of the type system.
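
For instance, a transpose (switching between row-major and column-major views) is already expressible as a shuffle on the flat vector; a minimal 2x2 sketch:

; row-major [a b; c d] is <a, b, c, d>; its transpose is <a, c, b, d>
%t = shufflevector <4 x float> %m, <4 x float> undef, <4 x i32> <i32 0, i32 2, i32 1, i32 3>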