[llvm-dev] [RFC] The `implements` attribute, or how to swap functions statically but late

Johannes Doerfert via llvm-dev llvm-dev at lists.llvm.org
Fri Mar 19 17:10:36 PDT 2021


The basic idea is to provide a way to replace a callee statically inside the LLVM-core pipeline [0]. The initial use case was the missing translation of certain (math) intrinsics by the NVPTX backend. However, we believe other use cases would benefit from a generic model; one is described below, along with the benefits over the alternatives.

Use cases

1a) NVPTX + fast-math + math functions

The original use case was discussed here [1] and the problem is shown here [2]. In a nutshell, llvm.sin (and friends) are not supported by the backend and it will yell at you. This is unfortunate but could arguably be resolved in a lot of different ways (below). Note that there is not necessarily a 1:1 mapping for all intrinsics and we might need to write a wrapper which would probably be in clang/lib/Header with other CUDA wrapper code.

Considered solutions we did not pick:
a) Teach the backend explicitly about the mapping llvm.sin -> __nv_sin.
b) Teach clang not to emit the intrinsics in the first place.
c) Use an existing overload "hack", e.g., asm.

Let's discuss those first.
a) Works, but is very limited to this use case; the mapping does not live with the declarations and definitions, which makes maintenance less appealing.
b) Seems easy enough, but we lose all the benefits of intrinsics; more on that later.
c) Right now, IR isn't happy with defining intrinsics, so I think there is little we could do better than b) here, short of allowing intrinsic definitions (which is an option).

1b) Users can override the implementation of any intrinsic, regardless of whether the backend supports it. They still get the intrinsic benefits during compilation. This is not limited to NVPTX and math intrinsics; it also opens the door for cross-platform tooling, e.g., tool XYZ emits intrinsics our backend doesn't support, but we can map them to implementations for our architecture.

1c) You can prototype/test different lowerings for functions and intrinsics easily by linking in a bitcode file (which can be compiled from C*).

Function specifications and implementations

I would like to embed a function specification with the implementation, e.g.

void vec_add(double *A, double *B, int N) {
   for (int i = 0; i < N; ++i)
     A[i] += B[i];
}
__attribute__((implements("vec_add")))
void vec_add_impl(double *A, double *B, int N) {
   #pragma omp parallel for
   for (int i = 0; i < N; ++i)
     A[i] += B[i];
}
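
Since the `implements` attribute is only proposed in this RFC, the pairing above can be sketched with a placeholder so that it compiles with current compilers. In the sketch below, `IMPLEMENTS` is a hypothetical macro that expands to nothing, and `vec_add_matches_spec` is an illustrative helper (not part of the proposal) that runs the specification and the implementation on the same input and compares the results. Without OpenMP enabled the pragma is simply ignored.

```c
#include <assert.h>

/* Hypothetical placeholder: the `implements` attribute is only proposed in
   this RFC, so the macro expands to nothing and the code stays portable. */
#define IMPLEMENTS(name)

/* Specification: the simple form the optimizer can reason about. */
void vec_add(double *A, double *B, int N) {
  for (int i = 0; i < N; ++i)
    A[i] += B[i];
}

/* Implementation: same semantics, different (here: parallel) lowering. */
IMPLEMENTS("vec_add")
void vec_add_impl(double *A, double *B, int N) {
#pragma omp parallel for
  for (int i = 0; i < N; ++i)
    A[i] += B[i];
}

/* Illustrative helper: returns 1 if both versions agree on the given input,
   0 otherwise. N is limited to 64 so the scratch copies fit on the stack. */
int vec_add_matches_spec(const double *A, double *B, int N) {
  double a_spec[64], a_impl[64];
  assert(N >= 0 && N <= 64);
  for (int i = 0; i < N; ++i)
    a_spec[i] = a_impl[i] = A[i];
  vec_add(a_spec, B, N);
  vec_add_impl(a_impl, B, N);
  for (int i = 0; i < N; ++i)
    if (a_spec[i] != a_impl[i])
      return 0;
  return 1;
}
```

A check like this only tests the inputs it is given; it does not prove the two functions equivalent.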

I think that would open up cool possibilities:
A) Replace only the first N specifications with implementations, which makes it easier to narrow down a faulty implementation while debugging.
B) Use the specification for the first part of the optimization pipeline so you can derive facts the implementation "hides". E.g., llvm.sin calls could be annotated as readnone while the __nv_sin call contains inline assembly and acts as an optimization barrier. Or, in the above example, it is obvious from the "specification" that B is readonly, but the implementation could arbitrarily hide this, e.g., the implementation might be a runtime call.
C) Program verification, test synthesis, etc. would be possible on a new level in a defined way. By keeping implementation and specification close together and written in the same language we might get somewhere in this area.
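
Possibility A) can be illustrated with a small runtime analogy. The RFC proposes doing the replacement statically inside the LLVM pipeline; the sketch below only mimics the selection logic in plain C. All names (`impl_pair`, `select_callee`, `first_divergent_pair`, and the example functions) are hypothetical, and `square_impl` is deliberately wrong so there is something to find.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*unary_fn)(int);

/* One specification/implementation pair (illustrative type). */
struct impl_pair {
  unary_fn spec;
  unary_fn impl;
};

int twice_spec(int x) { return x + x; }
int twice_impl(int x) { return 2 * x; }
int square_spec(int x) { return x * x; }
int square_impl(int x) { return x * x + 1; } /* deliberately buggy */

/* Picks the callee for pair `index` when only the first `limit` pairs are
   replaced by their implementations. */
unary_fn select_callee(const struct impl_pair *pairs, size_t count,
                       size_t index, size_t limit) {
  assert(index < count);
  return index < limit ? pairs[index].impl : pairs[index].spec;
}

/* Raises the limit one pair at a time and returns the index of the first
   pair whose implementation disagrees with its specification on input x,
   or `count` if none disagrees. */
size_t first_divergent_pair(const struct impl_pair *pairs, size_t count,
                            int x) {
  for (size_t limit = 1; limit <= count; ++limit) {
    size_t last = limit - 1; /* the pair newly replaced at this limit */
    if (select_callee(pairs, count, last, limit)(x) !=
        select_callee(pairs, count, last, 0)(x))
      return last;
  }
  return count;
}
```

With the pairs ordered as `{twice, square}`, raising the limit from 1 to 2 is the step that changes the result, which points at the second pair.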

Thoughts

I initially thought about the reverse attribute, so vec_add in the above example would have __attribute__((specifies(vec_add_impl))) for the same reasons (A-C). The problem is that I'm not sure how to attach anything to an intrinsic. Some __asm__ trickery might help, but it's unclear that this is better than the implements version. Maybe we want both for convenience, with the implements version being able to handle intrinsics.

~ Johannes

[0] https://reviews.llvm.org/D98516
[1] https://lists.llvm.org/pipermail/llvm-dev/2021-March/149117.html
[2] https://godbolt.org/z/PxsEWs

-- ─────────────────── ∽ Johannes (he/his)
