Add matrix-style preprocessing to lit to reuse tests across backends
Hey everyone,
I’ve been updating a handful of codegen tests across backends and am wondering if the cross-platform testing story could be more robust. The biggest problem is that many tests for basic functionality only exist for x86 and/or aarch64, and getting them running on other backends means a whole lot of copying and pasting.
My thought is that there could be a “source of truth” test file that lit preprocesses into a number of derived test files based on directives. Something like the following:
; located at llvm/test/CodeGen/Generic/float-artihmetic.ll
; MATRIX-RUN: llc %s -o- -mtriple=aarch64-linux | FileCheck --check-prefixes=ALL,LINUX
; MATRIX-RUN: llc %s -o- -mtriple=aarch64-darwin | FileCheck --check-prefixes=ALL,DARWIN
; ↓ an identifier can be specified for special casing
; MATRIX-RUN-X86: llc %s -o- -mtriple=x86_64-linux | FileCheck --check-prefixes=ALL,LINUX
; MATRIX-RUN-X86: llc %s -o- -mtriple=x86_64-windows | FileCheck --check-prefixes=ALL,WIN
; MATRIX-RUN-PPC: llc %s -o- -mtriple=powerpc64-linux | FileCheck --check-prefixes=ALL,LINUX
;
; MATRIX-GEN-BF16: sed 's/fTy/bfloat/g' %s
; MATRIX-GEN-F16: sed 's/fTy/half/g' %s
; MATRIX-GEN-F32: sed 's/fTy/float/g' %s
; MATRIX-GEN-F64: sed 's/fTy/double/g' %s
; MATRIX-GEN-F128: sed 's/fTy/fp128/g' %s
; MATRIX-GEN-PPC_F128: sed 's/fTy/ppc_fp128/g' %s
; MATRIX-GEN-X86_F80: sed 's/fTy/x86_fp80/g' %s
; ↑ MATRIX-GEN is optional if only a single test is needed
;
; By default, don't test ppc or x87 floats
; MATRIX-EXCLUDE: GEN-PPC_F128
; MATRIX-EXCLUDE: GEN-X86_F80
; ... but do on the relevant platforms
; MATRIX-INCLUDE: RUN-X86 + GEN-X86_F80
; MATRIX-INCLUDE: RUN-PPC + GEN-PPC_F128
define fTy @fadd(fTy %a, fTy %b) {
%res = fadd fTy %a, %b
ret fTy %res
}
; fsub, fmul, fdiv ...
This is loosely based on CI testing matrices. The directives are:
- MATRIX-GEN-IDENT indicates a preprocessing step to run. Within a backend-specific directory, each IDENT turns into a separate .ident file.
- MATRIX-RUN or MATRIX-RUN-IDENT turns into a RUN command (IDENT is only used for include/exclude).
- MATRIX-EXCLUDE and MATRIX-INCLUDE allow removing or adding specific combinations to the list.
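To make the directive semantics a bit more concrete, here is a rough Python sketch of how they might be parsed and expanded into a list of combinations. Nothing like this exists in lit today. The function names `parse_directives` and `expand` are made up for illustration, and the include/exclude rule is only one possible reading of the example: start from every RUN group crossed with every GEN identifier, drop anything matched by a MATRIX-EXCLUDE, and keep anything matched by a MATRIX-INCLUDE.

```python
import re
from itertools import product

# Matches "; MATRIX-<KIND>[-<IDENT>]: <payload>". The four kinds are the ones
# named in the proposal; the exact syntax is an assumption.
DIRECTIVE_RE = re.compile(
    r"^;\s*MATRIX-(RUN|GEN|EXCLUDE|INCLUDE)(?:-([A-Za-z0-9_]+))?:\s*(.*)$"
)


def parse_directives(text):
    """Collect MATRIX-* directives from the text of a source test file."""
    runs, gens, excludes, includes = {}, {}, [], []
    for line in text.splitlines():
        m = DIRECTIVE_RE.match(line.strip())
        if not m:
            continue
        kind, ident, payload = m.groups()
        if kind == "RUN":
            # Plain MATRIX-RUN lines (no identifier) are grouped under None.
            runs.setdefault(ident, []).append(payload)
        elif kind == "GEN":
            gens[ident] = payload
        else:
            # "RUN-X86 + GEN-X86_F80" -> {"RUN-X86", "GEN-X86_F80"}
            terms = frozenset(t.strip() for t in payload.split("+"))
            (excludes if kind == "EXCLUDE" else includes).append(terms)
    return runs, gens, excludes, includes


def expand(runs, gens, excludes, includes):
    """Return the set of (run_ident, gen_ident) pairs that should be generated.

    Assumed rule: a pair is dropped if all terms of some EXCLUDE apply to it,
    unless all terms of some INCLUDE apply to it as well.
    """
    # MATRIX-GEN is optional, so fall back to a single unnamed variant.
    gen_idents = list(gens) or [None]
    selected = set()
    for run, gen in product(runs, gen_idents):
        labels = set()
        if run is not None:
            labels.add(f"RUN-{run}")
        if gen is not None:
            labels.add(f"GEN-{gen}")
        excluded = any(terms <= labels for terms in excludes)
        included = any(terms <= labels for terms in includes)
        if included or not excluded:
            selected.add((run, gen))
    return selected
```

Under this reading, the x86_fp80 variant would only be paired with the RUN-X86 lines and the ppc_fp128 variant only with RUN-PPC, while every other float type is paired with every RUN group.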
A new command, lit generate Generic/float-artihmetic.ll, generates the following files for the above example:
# Assuming lit can match triple/target args to select
# the correct directory
llvm/test/CodeGen/AArch64/xgen/float-artihmetic.f16.ll
llvm/test/CodeGen/AArch64/xgen/float-artihmetic.f32.ll
llvm/test/CodeGen/AArch64/xgen/float-artihmetic.f64.ll
llvm/test/CodeGen/AArch64/xgen/float-artihmetic.f128.ll
llvm/test/CodeGen/PowerPC/xgen/float-artihmetic.f16.ll
# ...
llvm/test/CodeGen/X86/xgen/float-artihmetic.f16.ll
llvm/test/CodeGen/X86/xgen/float-artihmetic.f32.ll
llvm/test/CodeGen/X86/xgen/float-artihmetic.f64.ll
llvm/test/CodeGen/X86/xgen/float-artihmetic.f128.ll
llvm/test/CodeGen/X86/xgen/float-artihmetic.x86_f80.ll
Each similar to:
; NOTE: Test file autogenerated by llvm-lit; do not edit non-comment lines
; Source: llvm/test/CodeGen/Generic/float-artihmetic.ll.
; RUN: llc %s -o- -mtriple=x86_64-linux | FileCheck --check-prefixes=ALL,LINUX
; RUN: llc %s -o- -mtriple=x86_64-windows | FileCheck --check-prefixes=ALL,WIN
define float @fadd(float %a, float %b) {
%res = fadd float %a, %b
ret float %res
}
; fsub, fmul, fdiv ...
At this point, the file can get either handwritten or autogenerated FileCheck annotations.
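For illustration, producing one derived file could look roughly like the Python sketch below. It is only a sketch under assumptions: `ARCH_TO_DIR`, `backend_dir`, `derived_path` and `render` are hypothetical names, the triple-to-directory table stands in for whatever lit would actually use to pick the backend directory, and a plain string replacement stands in for running the `sed` command from MATRIX-GEN.

```python
import re
from pathlib import PurePosixPath

# Hypothetical table from the arch part of -mtriple to a backend test directory.
ARCH_TO_DIR = {"aarch64": "AArch64", "x86_64": "X86", "powerpc64": "PowerPC"}


def backend_dir(run_line):
    """Pick the backend directory from the -mtriple=<arch>-... argument."""
    m = re.search(r"-mtriple=([A-Za-z0-9_]+)", run_line)
    if m is None:
        raise ValueError(f"no -mtriple in RUN line: {run_line!r}")
    return ARCH_TO_DIR[m.group(1)]


def derived_path(source_path, run_line, gen_ident):
    """Map Generic/<name>.ll to <Backend>/xgen/<name>.<ident>.ll."""
    src = PurePosixPath(source_path)
    codegen_root = src.parent.parent  # the directory that contains Generic/
    name = f"{src.stem}.{gen_ident.lower()}{src.suffix}"
    return str(codegen_root / backend_dir(run_line) / "xgen" / name)


def render(source_path, source_text, run_lines, placeholder, replacement):
    """Build the text of one derived test file.

    MATRIX-* lines are dropped, the placeholder type is substituted (standing
    in for the sed step), and a header with plain RUN lines is prepended.
    """
    body = [
        line.replace(placeholder, replacement)
        for line in source_text.splitlines()
        if not line.lstrip().startswith("; MATRIX-")
    ]
    header = [
        "; NOTE: Test file autogenerated by llvm-lit; do not edit non-comment lines",
        f"; Source: {source_path}",
    ]
    header += [f"; RUN: {cmd}" for cmd in run_lines]
    return "\n".join(header + body) + "\n"
```

This version only strips the MATRIX-* lines; other explanatory comments in the source file would be carried over into every derived file unless they were filtered as well.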
Keeping the files in sync is enforced in two places:
- When lit is run on the source at llvm/test/CodeGen/Generic/float-artihmetic.ll, it runs the preprocessing steps. The output gets checked against the xgen/ files (ignoring non-RUN comments) to make sure they match up.
- When lit is run on an xgen/ file, it asserts that there is a matching file in Generic/.
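The two checks above might be sketched in Python as follows. This is not an existing lit feature. `normalize`, `in_sync` and `source_of` are hypothetical helpers, and "ignoring non-RUN comments" is interpreted here as dropping blank lines and every whole-line `;` comment except RUN lines before comparing.

```python
import re

RUN_RE = re.compile(r"^;\s*RUN:")
SOURCE_RE = re.compile(r"^;\s*Source:\s*(\S+)")


def normalize(text):
    """Keep only RUN lines and non-comment lines, with whitespace trimmed."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Whole-line comments (CHECK lines, notes) are ignored unless RUN.
        if stripped.startswith(";") and not RUN_RE.match(stripped):
            continue
        kept.append(stripped)
    return kept


def in_sync(generated_text, checked_in_text):
    """True if a freshly generated file matches the checked-in xgen/ file."""
    return normalize(generated_text) == normalize(checked_in_text)


def source_of(derived_text):
    """Return the path from the '; Source:' header, or None if it is missing."""
    for line in derived_text.splitlines():
        m = SOURCE_RE.match(line.strip())
        if m:
            # Tolerate a trailing period after the path in the header.
            return m.group(1).rstrip(".")
    return None
```

With this, FileCheck annotations can be added to or regenerated in an xgen/ file without tripping the check, while any change to the IR or to a RUN line is reported as out of sync. The second check would then amount to confirming that the path returned by `source_of` exists under Generic/.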
A few advantages:
- Enabling a test for a backend is trivial: add MATRIX-RUN and update the generated file.
- Similar classes of behavior are sure to get the same test cases, with nothing lost to copy+paste errors.
- The matrix file without FileCheck comments gives a nice overview of the test cases. Some vector tests have thousands of lines to scroll through and it’s easy to lose track of the actual input IR.
- The exact test input after preprocessing is checked in. This is a concern @arsenm brought up about using sed in tests currently.
I have to disclose that unfortunately I would not be able to work on this, just seemed like an idea worth bringing up.
Anyway, does anybody have any thoughts?