Change codegen of LLVM intrinsics to be name-based, and add llvm linkage support for bf16(xN), i1xN and x86amx by sayantn · Pull Request #140763 · rust-lang/rust

This PR changes how LLVM intrinsics are codegen'd.

Explanation of the changes

Current procedure

This is the same for all functions; LLVM intrinsics are not treated specially.

#[link_name = "llvm.sqrt.f32"]
fn sqrtf32(a: f32) -> f32;
will simply have the LLVM type f32 (f32), derived from the Rust signature.

Pros

Cons

#[link_name = "llvm.sqrt.f32"]
fn sqrtf32(a: i32) -> f32;
This declaration is accepted, but the generated LLVM IR is invalid because it has the wrong signature for the intrinsic (Godbolt; adding -Zverify-llvm-ir makes compilation fail). I would expect this code to not compile at all instead of generating invalid IR.
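A minimal sketch of the kind of check that would reject such a declaration, assuming a lookup keyed by the full intrinsic name. `Ty`, `Signature`, `intrinsic_signature` and `check_declaration` are hypothetical names for illustration only, not rustc or LLVM APIs, and the single table entry stands in for a real lookup.

```rust
// Hypothetical model of the LLVM types involved; not rustc's representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    I32,
    F32,
}

// The signature an intrinsic name implies (illustrative struct).
#[derive(Debug, PartialEq, Eq)]
struct Signature {
    params: &'static [Ty],
    ret: Ty,
}

// Stand-in for a name -> signature lookup. Only one entry, for the example.
fn intrinsic_signature(name: &str) -> Option<Signature> {
    match name {
        "llvm.sqrt.f32" => Some(Signature { params: &[Ty::F32], ret: Ty::F32 }),
        _ => None,
    }
}

// Compare the declared (Rust-side) signature with the one implied by the name.
fn check_declaration(name: &str, params: &[Ty], ret: Ty) -> Result<(), String> {
    let expected = intrinsic_signature(name)
        .ok_or_else(|| format!("unknown intrinsic `{}`", name))?;
    if expected.params == params && expected.ret == ret {
        Ok(())
    } else {
        Err(format!(
            "signature mismatch for `{}`: expected {:?} -> {:?}, found {:?} -> {:?}",
            name, expected.params, expected.ret, params, ret
        ))
    }
}
```

With this model, `check_declaration("llvm.sqrt.f32", &[Ty::I32], Ty::F32)` returns an `Err`, while the `f32 -> f32` declaration is accepted.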

What this PR does

Note

This PR only focuses on non-overloaded intrinsics; overloaded intrinsics can be handled in a future PR.

Regardless, the functionality described below works for all intrinsics.

Pros

Note

I don't intend for these bypasses to be permanent (at least the bf16 and i1 ones; the x86amx bypass seems inevitable). A better approach would be to introduce a bf16 type in Rust, and to allow repr(simd) with bools to get Rust-native i1xNs. As mentioned, these are meant to be short-term "bypasses". They shouldn't cause any major breakage even if removed, as link_llvm_intrinsics is perma-unstable.

This PR adds bypasses for bf16 (via i16), bf16xN (via i16xN), i1xN (via iM, where M is the smallest power of 2 such that M >= N, unless N <= 4, in which case M = 8), and x86amx (via 8192-bit vectors). This will unblock AVX512-VP2INTERSECT, AMX and a lot of bf16 intrinsics in stdarch. This PR also automatically destructures structs if the types don't exactly match (this is required for us to start emitting hard errors on mismatches).
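The width rule for the i1xN bypass can be written out as a small function. This is only a sketch of the rule as stated above; `mask_int_bits` is a hypothetical helper name and is not taken from the PR's code.

```rust
/// Width in bits (M) of the integer that stands in for an `i1xN` mask:
/// the smallest power of two >= N, with a floor of 8 for N <= 4.
/// Hypothetical helper for illustration only.
fn mask_int_bits(lanes: u32) -> u32 {
    assert!(lanes > 0, "a vector needs at least one lane");
    if lanes <= 4 {
        8
    } else {
        lanes.next_power_of_two()
    }
}
```

For example, a 2-lane or 4-lane mask maps to an 8-bit integer, and a 16-lane mask maps to a 16-bit integer.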

Cons

Possible ways to extend this to overloaded intrinsics (future)

Parse the mangled intrinsic name to get the type parameters

LLVM has a stable mangling of intrinsic names with type parameters (in LLVMIntrinsicCopyOverloadedName2), so we can parse the name to get the type parameters, and then just do the same thing.
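As a rough illustration of what parsing the name could look like, here is a sketch that handles only a small subset of type suffixes (iN, f16, bf16, f32, f64 and fixed-width vectors vN<elem>). `TypeParam`, `parse_type` and `type_params` are hypothetical names; the base name is assumed to be known already, and other suffix forms that LLVM produces are not handled.

```rust
// Hypothetical, simplified model of a type parameter in a mangled name.
#[derive(Debug, Clone, PartialEq, Eq)]
enum TypeParam {
    Int(u32),                    // iN
    Half,                        // f16
    BFloat,                      // bf16
    Float,                       // f32
    Double,                      // f64
    Vector(u32, Box<TypeParam>), // vN<elem>
}

// Parse one suffix component such as "bf16", "i32" or "v4f32".
fn parse_type(s: &str) -> Option<TypeParam> {
    match s {
        "f16" => return Some(TypeParam::Half),
        "bf16" => return Some(TypeParam::BFloat),
        "f32" => return Some(TypeParam::Float),
        "f64" => return Some(TypeParam::Double),
        _ => {}
    }
    if let Some(bits) = s.strip_prefix('i') {
        return bits.parse::<u32>().ok().map(TypeParam::Int);
    }
    if let Some(rest) = s.strip_prefix('v') {
        // Leading ASCII digits give the lane count; the remainder is the element type.
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        let lanes = rest[..digits].parse::<u32>().ok()?;
        let elem = parse_type(&rest[digits..])?;
        return Some(TypeParam::Vector(lanes, Box::new(elem)));
    }
    None
}

// Split the suffix after a known base name into type parameters.
fn type_params(full_name: &str, base_name: &str) -> Option<Vec<TypeParam>> {
    let suffix = full_name.strip_prefix(base_name)?;
    if suffix.is_empty() {
        return Some(Vec::new());
    }
    suffix.strip_prefix('.')?.split('.').map(parse_type).collect()
}
```

Under these assumptions, `type_params("llvm.is.constant.bf16", "llvm.is.constant")` yields a single `BFloat` parameter, and an unrecognized suffix yields `None`.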

Pros

Cons

Use the IITDescriptor table and the Rust function signature

We can use the base name to get the IITDescriptors of the corresponding intrinsic, and then manually implement the matching logic based on the Rust signature.

Pros

Cons

These two approaches might give different results for the same function. Consider:

#[link_name = "llvm.is.constant.bf16"] fn foo(a: u16) -> bool

The name-based approach will decide that the type parameter is bf16 and that the LLVM signature is i1 (bf16), and will inject some bitcasts at the call site.
The IITDescriptor-based approach will decide that the LLVM signature is i1 (i16), since the Rust u16 lowers to LLVM's signless i16. It will then see that the given name doesn't match the expected name (llvm.is.constant.i16) and error out.
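The disagreement can be modeled with a toy sketch. Everything here is hypothetical (`Ty`, `Outcome`, `name_based`, `descriptor_based`); it only models the single parameter of llvm.is.constant, treats the Rust u16 as LLVM's i16, and does not reflect how rustc or LLVM implement either strategy.

```rust
// Toy model of the two LLVM types involved in the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ty {
    I16,
    BF16,
}

impl Ty {
    fn mangled(self) -> &'static str {
        match self {
            Ty::I16 => "i16",
            Ty::BF16 => "bf16",
        }
    }

    fn from_mangled(s: &str) -> Option<Ty> {
        match s {
            "i16" => Some(Ty::I16),
            "bf16" => Some(Ty::BF16),
            _ => None,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    // Accept the declaration; `bitcast` says whether the call site needs a cast.
    Accept { llvm_param: Ty, bitcast: bool },
    // Reject because the link name differs from the one the signature implies.
    Reject { expected_name: String },
}

const BASE: &str = "llvm.is.constant";

// Name-based: the suffix decides the LLVM type; the Rust type is cast to fit.
fn name_based(link_name: &str, rust_param: Ty) -> Option<Outcome> {
    let suffix = link_name.strip_prefix(BASE)?.strip_prefix('.')?;
    let llvm_param = Ty::from_mangled(suffix)?;
    Some(Outcome::Accept { llvm_param, bitcast: llvm_param != rust_param })
}

// Descriptor-based: the Rust signature decides the type; the name must agree.
fn descriptor_based(link_name: &str, rust_param: Ty) -> Outcome {
    let expected_name = format!("{}.{}", BASE, rust_param.mangled());
    if expected_name == link_name {
        Outcome::Accept { llvm_param: rust_param, bitcast: false }
    } else {
        Outcome::Reject { expected_name }
    }
}
```

For the link name llvm.is.constant.bf16 with an i16 parameter, `name_based` accepts with a bitcast, while `descriptor_based` rejects and reports llvm.is.constant.i16 as the expected name.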

Other things that this PR does

Reviews are welcome, as this is my first time actually contributing to rustc.

After CI is green, we would need a try build and a rustc-perf run.

@rustbot label T-compiler A-codegen A-LLVM
r? codegen