[MLIR][Vectorize] Division with remainder
I want to know what happens when a vector dimension size is not evenly divisible by the vectorization factor, i.e. the division leaves a remainder.
For example:
affine.for %arg2 = 0 to 64 step 5 {
  affine.for %arg3 = 0 to 64 step 4 {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = vector.transfer_read %arg0[%arg2, %arg3], %cst : memref<64x64xf32>, vector<5x4xf32>
    %cst_0 = arith.constant 0.000000e+00 : f32
    %1 = vector.transfer_read %arg1[%arg2, %arg3], %cst_0 : memref<64x64xf32>, vector<5x4xf32>
    %2 = arith.addf %0, %1 : vector<5x4xf32>
    vector.transfer_write %2, %alloc[%arg2, %arg3] : vector<5x4xf32>, memref<64x64xf32>
  }
}
If there is a remainder after dividing by the VF, you should see masked vectorization take effect.
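To make the remainder concrete, here is a small Python sketch of the tiling arithmetic for the loop nest above. It only models the index math (it does not run MLIR), and the helper name `tile_bounds` is made up for illustration. With an extent of 64 and a step of 5, the last row tile starts at 60 and only 4 of its 5 rows fall inside `memref<64x64xf32>`; the column loop with step 4 divides evenly.

```python
def tile_bounds(extent, vf):
    """Return (start, in_bounds_count) for each vector-sized tile.

    Hypothetical helper: models a loop `0 to extent step vf` and counts
    how many of the vf elements in each tile are inside the extent.
    """
    return [(start, min(vf, extent - start)) for start in range(0, extent, vf)]


rows = tile_bounds(64, 5)   # outer loop: step 5 over 64 rows
cols = tile_bounds(64, 4)   # inner loop: step 4 over 64 columns

# Tiles that are only partially in bounds (the "remainder" tiles).
partial_rows = [(start, n) for start, n in rows if n < 5]
partial_cols = [(start, n) for start, n in cols if n < 4]

print("row tiles:", len(rows), "partial:", partial_rows)
print("col tiles:", len(cols), "partial:", partial_cols)
```

So only the final row tile touches an out-of-bounds row (row 64), which is the part that would need masking or padding.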
MLS June 19, 2024, 2:15am 3
But I didn’t find any implementation of masked vectorization in Affine’s SuperVectorize.
dasdibye June 19, 2024, 10:23am 4
Some masking support is there, as can be seen in the example provided in the SuperVectorize.cpp file, but maybe it doesn't cover all cases?
#map = affine_map<(d0) -> (-d0 + 500)>
func @vecred(%arg0: memref<512xf32>) -> f32 {
  %cst = arith.constant 0.000000e+00 : f32
  %cst_0 = arith.constant dense<0.000000e+00> : vector<128xf32>
  %0 = affine.for %arg1 = 0 to 500 step 128 iter_args(%arg2 = %cst_0)
    -> (vector<128xf32>) {
    // %2 is the number of iterations left in the original loop.
    %2 = affine.apply #map(%arg1)
    %3 = vector.create_mask %2 : vector<128xi1>
    %cst_1 = arith.constant 0.000000e+00 : f32
    %4 = vector.transfer_read %arg0[%arg1], %cst_1 :
      memref<512xf32>, vector<128xf32>
    %5 = math.cos %4 : vector<128xf32>
    %6 = arith.addf %arg2, %5 : vector<128xf32>
    // We filter out the effect of last 12 elements using the mask.
    %7 = select %3, %6, %arg2 : vector<128xi1>, vector<128xf32>
    affine.yield %7 : vector<128xf32>
  }
  %1 = vector.reduction <add>, %0 : vector<128xf32> into f32
  return %1 : f32
}
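As a rough check of what the mask does in this example, the following Python sketch models it lane by lane. This is a plain-Python model, not MLIR, and the names `active_lanes` and `masked_cos_sum` are invented for illustration. It assumes `vector.create_mask` enables lanes `[0, 500 - iv)` with the count clamped to the vector width, and that the `select` keeps the old accumulator value for disabled lanes.

```python
import math

VF = 128   # vector width used in the example
N = 500    # trip count of the original scalar loop


def active_lanes(iv, n=N, vf=VF):
    """Model of the mask: number of enabled lanes at induction value iv."""
    return max(0, min(vf, n - iv))


def masked_cos_sum(data, n=N, vf=VF):
    """Model of the vectorized reduction with a per-lane accumulator."""
    acc = [0.0] * vf
    for iv in range(0, n, vf):
        live = active_lanes(iv, n, vf)
        for lane in range(vf):
            updated = acc[lane] + math.cos(data[iv + lane])
            # select: take the updated value only where the mask is set.
            acc[lane] = updated if lane < live else acc[lane]
    # Final horizontal add over the accumulator lanes.
    return sum(acc)


data = [0.01 * i for i in range(512)]            # memref<512xf32> stand-in
lanes = [active_lanes(iv) for iv in range(0, N, VF)]
scalar = sum(math.cos(x) for x in data[:N])      # original scalar loop
vectorized = masked_cos_sum(data)

print("enabled lanes per iteration:", lanes)
print("results match:", math.isclose(scalar, vectorized, rel_tol=1e-9, abs_tol=1e-9))
```

Only the last iteration (starting at 384) has disabled lanes, 12 of them, which matches the comment in the IR about filtering out the last 12 elements.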
MLS June 20, 2024, 6:50am 5
Got it, thank you. I also see that vectorizing reductions is supported only for 1-D vectors.