[mlir][vector] Remove bit-width logic tests by newling · Pull Request #143007 · llvm/llvm-project (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation5 Commits1 Checks13 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
I would like to remove (this PR) or alternatively modify (I can post another PR if this is preferred) the logic around bit width specific linearization.
My first question is -- does anyone rely on bit-width relevant linearization logic being unchanged?
Problem encountered
for a large inner bit-with threshold extract 3x4xf32 -> 4xf32
and extract 3x2x2xf32 -> 2x2f32
are both linearized to the same shuffle. But for an intermediate threshold (100, say), only extract 3x2x2xf32 -> 2x2f32
is linearized (because 4x32 > threshold > 2x32). This is unintuitive to me -- I'd expect the decision to linearize to be based on whether the result of linearization is acceptable.
In theory, as someone who doesn't use the bit width logic at all in legalization, this shouldn't be a problem for me. And it mostly isn't, but I would like to introduce an intermediate step, where extract 3x4xf32 -> 4xf32
and extract 3x2x2xf32 -> 2x2f32
both first lower to extract_strided_slice 12xf32 -> 4xf32
before being converted to shuffle. But adding this pattern causes an issue in the bit-width specific tests, where extracts don't lower completely to shuffle.
Alternative
I can alternatively change the bit width logic to be based on the total number of elements in the vector (which is invariant to shape changes) instead of the inner dimension size.
newling marked this pull request as ready for review
@llvm/pr-subscribers-mlir-vector
Author: James Newling (newling)
Changes
I would like to remove (this PR) or alternatively modify (I can post another PR if this is preferred) the logic around bit width specific linearization.
My first question is -- does anyone rely on bit-width relevant linearization logic being unchanged?
Problem encountered
for a large inner bit-with threshold extract 3x4xf32 -> 4xf32
and extract 3x2x2xf32 -> 2x2f32
are both linearized to the same shuffle. But for an intermediate threshold (100, say), only extract 3x2x2xf32 -> 2x2f32
is linearized (because 4x32 > threshold > 2x32). This is unintuitive to me -- I'd expect the decision to linearize to be based on whether the result of linearization is acceptable.
In theory, as someone who doesn't use the bit width logic at all in legalization, this shouldn't be a problem for me. And it mostly isn't, but I would like to introduce an intermediate step, where extract 3x4xf32 -> 4xf32
and extract 3x2x2xf32 -> 2x2f32
both first lower to extract_strided_slice 12xf32 -> 4xf32
before being converted to shuffle. But adding this pattern causes an issue in the bit-width specific tests, where extracts don't lower completely to shuffle.
Alternative
I can alternatively change the bit width logic to be based on the total number of elements in the vector (which is invariant to shape changes) instead of the inner dimension size.
Full diff: https://github.com/llvm/llvm-project/pull/143007.diff
2 Files Affected:
- (removed) mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir (-58)
- (modified) mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp (-122)
diff --git a/mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir b/mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir deleted file mode 100644 index 739fb2fb8b68b..0000000000000 --- a/mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir +++ /dev/null @@ -1,58 +0,0 @@ -// RUN: mlir-opt %s -split-input-file -test-bit-width-constrained-vector-linearize=target-vector-bitwidth=128 | FileCheck %s --check-prefixes=ALL,BW-128 -// RUN: mlir-opt %s -split-input-file -test-bit-width-constrained-vector-linearize=target-vector-bitwidth=0 | FileCheck %s --check-prefixes=ALL,BW-0
-// A vector<2x2xf32> has inner-most dimension with 64-bits. Check that at -// bitwidth threshold 128 (>= 64), operations are linearized, and at -// bitwidth threshold 0 (< 64), operations are not linearized.
-// ALL-LABEL: test_result_bitwidth_64 -func.func @test_result_bitwidth_64(%arg0: vector<2x2xf32>) -> vector<2x2xf32> {
- // BW-128: arith.constant {{.*}} vector<4xf32>
- // BW-0: arith.constant {{.*}} vector<2x2xf32>
- %0 = arith.constant dense<[[1.0, 2.0], [3.0, 4.0]]> : vector<2x2xf32>
- // BW-128: math.sin {{.*}} vector<4xf32>
- // BW-0: math.sin {{.*}} vector<2x2xf32>
- %1 = math.sin %arg0 : vector<2x2xf32>
- return %0 : vector<2x2xf32> -}
- -// -----
- -// The size of the 'index' type is backend specific, so we cannot guarantee that -// the inner-most dimension below (of size 2*nbBits(index)) is below any bitwidth -// threshold. Test that operations with vectors of index type are not linearized.
- -// ALL-LABEL: test_index_no_linearize -func.func @test_index_no_linearize(%arg0: vector<2x2xindex>, %arg1: vector<2x2xindex>) -> vector<2x2xindex> {
- // BW-128: %[[ADD:.]] = arith.addi {{.}} : vector<2x2xindex>
- // BW-0: %[[ADD:.]] = arith.addi {{.}} : vector<2x2xindex>
- %0 = arith.addi %arg0, %arg1 : vector<2x2xindex>
- return %0 : vector<2x2xindex>
-}
-// -----
-// The logic for the insert op with regards to the bitwidth threshold is -// different to the other ops, so we test it here. Specifically, the logic -// is based on the bitwidth of the value to store.
-// ALL-LABEL: test_vector_insert -// ALL-SAME: (%[[DEST:.]]: vector<2x8x4xf32>, %[[SRC:.]]: vector<8x4xf32>) -> vector<2x8x4xf32> { -func.func @test_vector_insert(%arg0: vector<2x8x4xf32>, %arg1: vector<8x4xf32>) -> vector<2x8x4xf32> {
- // BW-128-DAG: %[[ARG_SRC:.*]] = vector.shape_cast %[[SRC]] : vector<8x4xf32> to vector<32xf32>
- // BW-128-DAG: %[[ARG_DEST:.*]] = vector.shape_cast %[[DEST]] : vector<2x8x4xf32> to vector<64xf32>
- // BW-128: %[[SHUFFLE:.*]] = vector.shuffle %[[ARG_DEST]], %[[ARG_SRC]]
- // BW-128: %[[RES:.*]] = vector.shape_cast %[[SHUFFLE]] : vector<64xf32> to vector<2x8x4xf32>
- // BW-128: return %[[RES]] : vector<2x8x4xf32>
- // BW-0: %[[RES:.*]] = vector.insert %[[SRC]], %[[DEST]] [0] : vector<8x4xf32> into vector<2x8x4xf32>
- // BW-0: return %[[RES]] : vector<2x8x4xf32>
- %0 = vector.insert %arg1, %arg0[0]: vector<8x4xf32> into vector<2x8x4xf32>
- return %0 : vector<2x8x4xf32> -} diff --git a/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp b/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp index f4f32e9339870..03db9b7fa7a73 100644 --- a/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp +++ b/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp @@ -837,126 +837,6 @@ struct TestVectorEmulateMaskedLoadStore final } };
-/// Get the set of operand/result types to check for sufficiently -/// small inner-most dimension size. -static SmallVector<std::pair<Type, unsigned>> -getTypeBitWidthBoundPairs(Operation *op, unsigned targetBitWidth) {
- if (auto insertOp = dyn_castvector::InsertOp(op)) {
- unsigned w = targetBitWidth < std::numeric_limits::max()
? targetBitWidth + 1
: targetBitWidth;
- return {{insertOp.getValueToStoreType(), w}};
- }
- auto resultTypes = op->getResultTypes();
- SmallVector<std::pair<Type, unsigned>> resultsWithBitWidth;
- resultsWithBitWidth.reserve(resultTypes.size());
- for (Type type : resultTypes) {
- resultsWithBitWidth.push_back({type, targetBitWidth});
- }
- return resultsWithBitWidth; -}
- -/// If
type
is VectorType with trailing dimension of (bit) size greater than -/// or equal totargetBitWidth
, its defining op is considered legal. -static bool -isNotLinearizableBecauseLargeInnerDimension(Type type, unsigned targetBitWidth) {
- VectorType vecType = dyn_cast(type);
- // Not linearizable for reasons other than what this function checks.
- if (!vecType || vecType.getRank() == 0)
- return false;
- // The width of the type 'index' is unbounded (and therefore potentially above
- // the target width).
- if (vecType.getElementType().isIndex())
- return true;
- unsigned finalDimSize = vecType.getShape().back();
- unsigned nbBitsPerElm = vecType.getElementTypeBitWidth();
- unsigned trailingVecDimBitWidth = finalDimSize * nbBitsPerElm;
- return trailingVecDimBitWidth >= targetBitWidth; -}
- -static bool -isNotLinearizableBecauseLargeInnerDimension(Operation *op,
unsigned targetBitWidth) {
- // Check on bitwidths.
- SmallVector<std::pair<Type, unsigned>> toCheck =
getTypeBitWidthBoundPairs(op, targetBitWidth);
- return llvm::any_of(toCheck, [&](std::pair<Type, unsigned> typeWidth) {
- return isNotLinearizableBecauseLargeInnerDimension(typeWidth.first,
typeWidth.second);
- }); -}
- -void populateWithBitWidthConstraints(TypeConverter &typeConverter,
ConversionTarget &target,
unsigned targetBitWidth) {
- // The general purpose definition of what ops are legal must come first.
- populateForVectorLinearize(typeConverter, target);
- // Extend the set of legal ops to include those with large inner-most
- // dimensions on selected operands/results.
- target.markUnknownOpDynamicallyLegal(
[=](Operation *op) -> std::optional<bool> {
if (isNotLinearizableBecauseLargeInnerDimension(op, targetBitWidth)) {
return true;
}
return {};
});
-}
-struct TestVectorBitWidthLinearize final - : public PassWrapper<TestVectorBitWidthLinearize, OperationPass<>> { - MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestVectorBitWidthLinearize)
TestVectorBitWidthLinearize() = default;
TestVectorBitWidthLinearize(const TestVectorBitWidthLinearize &pass)
: PassWrapper(pass) {}
StringRef getArgument() const override {
return "test-bit-width-constrained-vector-linearize";
}
StringRef getDescription() const override {
return "Linearizes ND vectors for N >= 2 into 1D vectors, with constraints "
"in inner-most dimension's bit width.";
}
void getDependentDialects(DialectRegistry ®istry) const override {
registry.insertvector::VectorDialect();
}
Option targetVectorBitwidth{
*this, "target-vector-bitwidth",
llvm:🆑:desc(
"Minimum vector bitwidth to enable the flattening transformation"),
llvm:🆑:init(std::numeric_limits<unsigned>::max())};
void runOnOperation() override {
auto *context = &getContext();
TypeConverter typeConverter;
RewritePatternSet patterns(context);
ConversionTarget target(*context);
populateWithBitWidthConstraints(typeConverter, target,
targetVectorBitwidth);
vector::populateVectorLinearizeBasePatterns(typeConverter, target,
patterns);
vector::populateVectorLinearizeShuffleLikeOpsPatterns(typeConverter, target,
patterns);
if (failed(applyPartialConversion(getOperation(), target,
std::move(patterns))))
return signalPassFailure();
} -};
struct TestVectorLinearize final : public PassWrapper<TestVectorLinearize, OperationPass<>> { MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestVectorLinearize) @@ -1064,8 +944,6 @@ void registerTestVectorLowerings() {
PassRegistration();
PassRegistration();
PassRegistration(); } } // namespace test
@llvm/pr-subscribers-mlir
Author: James Newling (newling)
Changes
I would like to remove (this PR) or alternatively modify (I can post another PR if this is preferred) the logic around bit width specific linearization.
My first question is -- does anyone rely on bit-width relevant linearization logic being unchanged?
Problem encountered
for a large inner bit-with threshold extract 3x4xf32 -> 4xf32
and extract 3x2x2xf32 -> 2x2f32
are both linearized to the same shuffle. But for an intermediate threshold (100, say), only extract 3x2x2xf32 -> 2x2f32
is linearized (because 4x32 > threshold > 2x32). This is unintuitive to me -- I'd expect the decision to linearize to be based on whether the result of linearization is acceptable.
In theory, as someone who doesn't use the bit width logic at all in legalization, this shouldn't be a problem for me. And it mostly isn't, but I would like to introduce an intermediate step, where extract 3x4xf32 -> 4xf32
and extract 3x2x2xf32 -> 2x2f32
both first lower to extract_strided_slice 12xf32 -> 4xf32
before being converted to shuffle. But adding this pattern causes an issue in the bit-width specific tests, where extracts don't lower completely to shuffle.
Alternative
I can alternatively change the bit width logic to be based on the total number of elements in the vector (which is invariant to shape changes) instead of the inner dimension size.
Full diff: https://github.com/llvm/llvm-project/pull/143007.diff
2 Files Affected:
- (removed) mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir (-58)
- (modified) mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp (-122)
diff --git a/mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir b/mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir deleted file mode 100644 index 739fb2fb8b68b..0000000000000 --- a/mlir/test/Dialect/Vector/linearize-subject-to-bitwidth.mlir +++ /dev/null @@ -1,58 +0,0 @@ -// RUN: mlir-opt %s -split-input-file -test-bit-width-constrained-vector-linearize=target-vector-bitwidth=128 | FileCheck %s --check-prefixes=ALL,BW-128 -// RUN: mlir-opt %s -split-input-file -test-bit-width-constrained-vector-linearize=target-vector-bitwidth=0 | FileCheck %s --check-prefixes=ALL,BW-0
-// A vector<2x2xf32> has inner-most dimension with 64-bits. Check that at -// bitwidth threshold 128 (>= 64), operations are linearized, and at -// bitwidth threshold 0 (< 64), operations are not linearized.
-// ALL-LABEL: test_result_bitwidth_64 -func.func @test_result_bitwidth_64(%arg0: vector<2x2xf32>) -> vector<2x2xf32> {
- // BW-128: arith.constant {{.*}} vector<4xf32>
- // BW-0: arith.constant {{.*}} vector<2x2xf32>
- %0 = arith.constant dense<[[1.0, 2.0], [3.0, 4.0]]> : vector<2x2xf32>
- // BW-128: math.sin {{.*}} vector<4xf32>
- // BW-0: math.sin {{.*}} vector<2x2xf32>
- %1 = math.sin %arg0 : vector<2x2xf32>
- return %0 : vector<2x2xf32> -}
- -// -----
- -// The size of the 'index' type is backend specific, so we cannot guarantee that -// the inner-most dimension below (of size 2*nbBits(index)) is below any bitwidth -// threshold. Test that operations with vectors of index type are not linearized.
- -// ALL-LABEL: test_index_no_linearize -func.func @test_index_no_linearize(%arg0: vector<2x2xindex>, %arg1: vector<2x2xindex>) -> vector<2x2xindex> {
- // BW-128: %[[ADD:.]] = arith.addi {{.}} : vector<2x2xindex>
- // BW-0: %[[ADD:.]] = arith.addi {{.}} : vector<2x2xindex>
- %0 = arith.addi %arg0, %arg1 : vector<2x2xindex>
- return %0 : vector<2x2xindex>
-}
-// -----
-// The logic for the insert op with regards to the bitwidth threshold is -// different to the other ops, so we test it here. Specifically, the logic -// is based on the bitwidth of the value to store.
-// ALL-LABEL: test_vector_insert -// ALL-SAME: (%[[DEST:.]]: vector<2x8x4xf32>, %[[SRC:.]]: vector<8x4xf32>) -> vector<2x8x4xf32> { -func.func @test_vector_insert(%arg0: vector<2x8x4xf32>, %arg1: vector<8x4xf32>) -> vector<2x8x4xf32> {
- // BW-128-DAG: %[[ARG_SRC:.*]] = vector.shape_cast %[[SRC]] : vector<8x4xf32> to vector<32xf32>
- // BW-128-DAG: %[[ARG_DEST:.*]] = vector.shape_cast %[[DEST]] : vector<2x8x4xf32> to vector<64xf32>
- // BW-128: %[[SHUFFLE:.*]] = vector.shuffle %[[ARG_DEST]], %[[ARG_SRC]]
- // BW-128: %[[RES:.*]] = vector.shape_cast %[[SHUFFLE]] : vector<64xf32> to vector<2x8x4xf32>
- // BW-128: return %[[RES]] : vector<2x8x4xf32>
- // BW-0: %[[RES:.*]] = vector.insert %[[SRC]], %[[DEST]] [0] : vector<8x4xf32> into vector<2x8x4xf32>
- // BW-0: return %[[RES]] : vector<2x8x4xf32>
- %0 = vector.insert %arg1, %arg0[0]: vector<8x4xf32> into vector<2x8x4xf32>
- return %0 : vector<2x8x4xf32> -} diff --git a/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp b/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp index f4f32e9339870..03db9b7fa7a73 100644 --- a/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp +++ b/mlir/test/lib/Dialect/Vector/TestVectorTransforms.cpp @@ -837,126 +837,6 @@ struct TestVectorEmulateMaskedLoadStore final } };
-/// Get the set of operand/result types to check for sufficiently -/// small inner-most dimension size. -static SmallVector<std::pair<Type, unsigned>> -getTypeBitWidthBoundPairs(Operation *op, unsigned targetBitWidth) {
- if (auto insertOp = dyn_castvector::InsertOp(op)) {
- unsigned w = targetBitWidth < std::numeric_limits::max()
? targetBitWidth + 1
: targetBitWidth;
- return {{insertOp.getValueToStoreType(), w}};
- }
- auto resultTypes = op->getResultTypes();
- SmallVector<std::pair<Type, unsigned>> resultsWithBitWidth;
- resultsWithBitWidth.reserve(resultTypes.size());
- for (Type type : resultTypes) {
- resultsWithBitWidth.push_back({type, targetBitWidth});
- }
- return resultsWithBitWidth; -}
- -/// If
type
is VectorType with trailing dimension of (bit) size greater than -/// or equal totargetBitWidth
, its defining op is considered legal. -static bool -isNotLinearizableBecauseLargeInnerDimension(Type type, unsigned targetBitWidth) {
- VectorType vecType = dyn_cast(type);
- // Not linearizable for reasons other than what this function checks.
- if (!vecType || vecType.getRank() == 0)
- return false;
- // The width of the type 'index' is unbounded (and therefore potentially above
- // the target width).
- if (vecType.getElementType().isIndex())
- return true;
- unsigned finalDimSize = vecType.getShape().back();
- unsigned nbBitsPerElm = vecType.getElementTypeBitWidth();
- unsigned trailingVecDimBitWidth = finalDimSize * nbBitsPerElm;
- return trailingVecDimBitWidth >= targetBitWidth; -}
- -static bool -isNotLinearizableBecauseLargeInnerDimension(Operation *op,
unsigned targetBitWidth) {
- // Check on bitwidths.
- SmallVector<std::pair<Type, unsigned>> toCheck =
getTypeBitWidthBoundPairs(op, targetBitWidth);
- return llvm::any_of(toCheck, [&](std::pair<Type, unsigned> typeWidth) {
- return isNotLinearizableBecauseLargeInnerDimension(typeWidth.first,
typeWidth.second);
- }); -}
- -void populateWithBitWidthConstraints(TypeConverter &typeConverter,
ConversionTarget &target,
unsigned targetBitWidth) {
- // The general purpose definition of what ops are legal must come first.
- populateForVectorLinearize(typeConverter, target);
- // Extend the set of legal ops to include those with large inner-most
- // dimensions on selected operands/results.
- target.markUnknownOpDynamicallyLegal(
[=](Operation *op) -> std::optional<bool> {
if (isNotLinearizableBecauseLargeInnerDimension(op, targetBitWidth)) {
return true;
}
return {};
});
-}
-struct TestVectorBitWidthLinearize final - : public PassWrapper<TestVectorBitWidthLinearize, OperationPass<>> { - MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestVectorBitWidthLinearize)
TestVectorBitWidthLinearize() = default;
TestVectorBitWidthLinearize(const TestVectorBitWidthLinearize &pass)
: PassWrapper(pass) {}
StringRef getArgument() const override {
return "test-bit-width-constrained-vector-linearize";
}
StringRef getDescription() const override {
return "Linearizes ND vectors for N >= 2 into 1D vectors, with constraints "
"in inner-most dimension's bit width.";
}
void getDependentDialects(DialectRegistry ®istry) const override {
registry.insertvector::VectorDialect();
}
Option targetVectorBitwidth{
*this, "target-vector-bitwidth",
llvm:🆑:desc(
"Minimum vector bitwidth to enable the flattening transformation"),
llvm:🆑:init(std::numeric_limits<unsigned>::max())};
void runOnOperation() override {
auto *context = &getContext();
TypeConverter typeConverter;
RewritePatternSet patterns(context);
ConversionTarget target(*context);
populateWithBitWidthConstraints(typeConverter, target,
targetVectorBitwidth);
vector::populateVectorLinearizeBasePatterns(typeConverter, target,
patterns);
vector::populateVectorLinearizeShuffleLikeOpsPatterns(typeConverter, target,
patterns);
if (failed(applyPartialConversion(getOperation(), target,
std::move(patterns))))
return signalPassFailure();
} -};
struct TestVectorLinearize final : public PassWrapper<TestVectorLinearize, OperationPass<>> { MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestVectorLinearize) @@ -1064,8 +944,6 @@ void registerTestVectorLowerings() {
PassRegistration();
PassRegistration();
PassRegistration(); } } // namespace test
Hey @newling !
My first question is -- does anyone rely on bit-width relevant linearization logic being unchanged?
Give no feedback so far, this one will probably require a post on Discourse. From my experience, there are downstream users that depend on various bits here and there (not. sure about this though). In fact, I have been considering similar approach for narrow-type-emulation:
As you can see, no feedback :( (though from F2F conversation I do know that certain users do depend on that).
I am in favour of this change. If you post on Discourse and there are no objections, we should just go ahead.
As you can see, no feedback :( (though from F2F conversation I do know that certain users do depend on that).
No feedback, even with so much bold font, what chance do I have! But seriously, that sounds like a good suggestion, I'll create a New Topic over at https://discourse.llvm.org/tag/mlir