When compiling CUDA with Clang, can I use GCC for the host code build and clang+LLVM for the device code build?

Hi,
When compiling CUDA with Clang, can I use GCC to build the host code and clang+LLVM to build the device code?
It seems --gcc-toolchain=gcc is not designed for this.
If this is not possible, I would like to modify Clang to support it. Which files should I start with?

jhuber6 March 14, 2025, 1:15pm 2

What’s your use case? The host and device compilation steps cannot be separated that easily. It’s important that things like mangled names match up between the two sides, especially when generating things like static device variables. The host compilation is also responsible for generating the actual runtime calls that let you launch the kernels through the CUDA driver, which requires knowing the names of those kernels, among other things.
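To make the point about kernel names concrete, here is a toy model of the bookkeeping involved. It is only a sketch: `ToyRuntime`, `register_function`, `lookup`, and `scale_stub` are hypothetical names made up for illustration, and the real registration entry points are internal to the CUDA runtime and are not modeled here. The idea is that the host object file associates the address of a host-side stub with the mangled name of the device kernel, so whoever compiles the host side has to know that name.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy stand-in for a runtime's kernel registration table (not a real CUDA API).
class ToyRuntime {
 public:
  // Would be called at program startup, once per kernel in the device image.
  void register_function(const void* host_stub, std::string device_name) {
    table_[host_stub] = std::move(device_name);
  }

  // A launch only has the host stub's address; the runtime maps it back to
  // the device-side symbol name. Returns nullptr if nothing was registered.
  const std::string* lookup(const void* host_stub) const {
    auto it = table_.find(host_stub);
    return it == table_.end() ? nullptr : &it->second;
  }

 private:
  std::map<const void*, std::string> table_;
};

// Hypothetical host-side stub standing in for
//   __global__ void scale(float*, int);
void scale_stub(float*, int) {}

// Key used for the table: the stub's address. Casting a function pointer to
// const void* is conditionally supported in C++, but works with GCC and Clang.
inline const void* stub_key() {
  return reinterpret_cast<const void*>(&scale_stub);
}
```

With the Itanium C++ ABI, `scale(float*, int)` mangles to `_Z5scalePfi`, so the host side would register `stub_key()` under that string. If the host compiler and the device compiler disagreed on the name, the lookup at launch time would have nothing to find.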

GCC, to my knowledge, is not a CUDA compiler. It cannot compile CUDA. What you’re looking for is probably something more similar to the OpenCL workflow where you compile an image separately and register it with some runtime calls manually.

Not speaking for @joojookoo, but I suspect the use case is to use Clang in a similar way to how nvcc compiles CUDA code. nvcc drives two distinct compilers, Nvidia’s own device compiler, and a third party host compiler. Supported host compilers include GCC, MSVC, and others; see supported host compiler documentation here. Host compilers are not required to be CUDA aware; nvcc preprocesses the CUDA source code, lowers it to (standard-ish) C++, and then invokes the host compiler to compile the preprocessed output.

Use of Clang as a compiler driver for CUDA with support for GCC as a host compiler would require Clang to do something similar; presumably following the same preprocessing plus lowering to standard-ish C++ steps. Since Clang is CUDA aware, support for generating preprocessed output with CUDA language constructs lowered to standard C++ is possible, but would probably entail quite a bit of work.

jhuber6 March 14, 2025, 6:29pm 4

Clang doesn’t work the same way as nvcc in this case; see "Compiling CUDA with clang" in the LLVM 21.0.0git documentation. nvcc basically preprocesses the CUDA source and then does much of the remaining handling in external steps. In clang all of that is contained inside the compiler, and both sides see (almost) the same AST.

It’s definitely not impossible, but you would need to do all the runtime registration yourself, and I doubt it’s something clang would want to support natively.
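For reference, a minimal sketch of what "doing it yourself" could look like on the host side, using the CUDA driver API from plain C++ that GCC can build. The assumptions here: the device code was compiled separately into a PTX or cubin file (for example with something along the lines of `clang++ -x cuda --cuda-device-only --cuda-gpu-arch=sm_70 -S kernel.cu -o kernel.ptx`), the kernel is declared `extern "C"` so its name is not mangled, and the function name `launch_from_image` is made up for this example. The `__has_include` guard is only there so the file still builds on machines without the CUDA toolkit.

```cpp
#include <cstdio>

#if __has_include(<cuda.h>)
#include <cuda.h>

// Loads a separately compiled device image and launches one kernel from it
// with a single thread. Returns 0 on success, nonzero on any driver error.
// Build with something like: g++ -std=c++17 host.cpp -lcuda
int launch_from_image(const char* image_path, const char* kernel_name,
                      void** args) {
  if (cuInit(0) != CUDA_SUCCESS) return 1;

  CUdevice dev;
  if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return 1;

  CUcontext ctx;
  if (cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS) return 1;

  int rc = 1;
  CUmodule mod;
  if (cuCtxSetCurrent(ctx) == CUDA_SUCCESS &&
      cuModuleLoad(&mod, image_path) == CUDA_SUCCESS) {
    CUfunction fn;
    // The name must match the device symbol exactly; this lookup is what the
    // compiler-generated registration normally takes care of.
    if (cuModuleGetFunction(&fn, mod, kernel_name) == CUDA_SUCCESS &&
        cuLaunchKernel(fn, 1, 1, 1,   // grid dimensions
                       1, 1, 1,       // block dimensions
                       0, nullptr,    // shared memory bytes, stream
                       args, nullptr) == CUDA_SUCCESS &&
        cuCtxSynchronize() == CUDA_SUCCESS) {
      rc = 0;
    }
    cuModuleUnload(mod);
  }
  cuDevicePrimaryCtxRelease(dev);
  return rc;
}

#else

// Fallback when the CUDA toolkit headers are not installed.
int launch_from_image(const char*, const char*, void**) {
  std::fprintf(stderr, "cuda.h not found; built without CUDA support\n");
  return -1;
}

#endif
```

Note what is lost compared to normal CUDA: no `<<<...>>>` launch syntax on the host, no shared declarations between host and device code, and kernel arguments have to be packed by hand into the `args` array.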