Clang nvlink Wrapper — Clang 21.0.0git documentation

Introduction

This tool works as a wrapper around the NVIDIA nvlink linker. The purpose of this wrapper is to provide an interface similar to the ld.lld linker while still relying on NVIDIA’s proprietary linker to produce the final output.

nvlink has a number of known quirks that make it difficult to use in a unified offloading setting. For example, it does not accept .o files, because input files must be named .cubin. Static archives do not work, so passing a .a file produces a linker error. nvlink also does not support link-time optimization and ignores many standard linker arguments. This tool works around these issues.
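To make the file-naming quirk concrete, here is a minimal sketch of staging object files under .cubin names by hand. The helper name stage_cubin_inputs and the staging directory are hypothetical, and this is not necessarily how the wrapper itself does it. It only addresses naming; it does not handle archives or LTO inputs.

```shell
#!/bin/sh
# Hypothetical helper (not part of the wrapper): copy each .o input to a
# file with the .cubin extension that nvlink expects, and print the new paths.
# Note: inputs from different directories that share a basename would collide.
stage_cubin_inputs() {
    outdir=$1
    shift
    mkdir -p "$outdir"
    for obj in "$@"; do
        base=$(basename "$obj" .o)          # foo.o -> foo
        cp "$obj" "$outdir/$base.cubin"     # foo -> <outdir>/foo.cubin
        printf '%s\n' "$outdir/$base.cubin"
    done
}
```

Calling stage_cubin_inputs staged a.o b.o would create staged/a.cubin and staged/b.cubin, which could then be passed to nvlink in place of the original names.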

Usage

This tool accepts the following options. Any argument that is not specific to the linker wrapper is forwarded to nvlink.

OVERVIEW: A utility that wraps around the NVIDIA 'nvlink' linker. This enables static linking and LTO handling for NVPTX targets.

USAGE: clang-nvlink-wrapper [options]

OPTIONS:
  --arch                             Specify the 'sm_' name of the target architecture.
  --cuda-path=                       Set the system CUDA path
  --dry-run                          Print generated commands without running.
  --feature                          Specify the '+ptx' feature to use for LTO.
  -g                                 Specify that this was a debug compile.
  -help-hidden                       Display all available options
  -help                              Display available options (--help-hidden for more)
  -L                                 Add to the library search path
  -l                                 Search for library
  -mllvm                             Arguments passed to LLVM, including Clang invocations, for which the '-mllvm' prefix is preserved. Use '-mllvm --help' for a list of options.
  -o                                 Path to file to write output
  --plugin-opt=jobs=                 Number of LTO codegen partitions
  --plugin-opt=lto-partitions=       Number of LTO codegen partitions
  --plugin-opt=O<O0, O1, O2, or O3>  Optimization level for LTO
  --plugin-opt=thinlto               Enable the thin-lto backend
  --plugin-opt=                      Arguments passed to LLVM, including Clang invocations, for which the '-mllvm' prefix is preserved. Use '-mllvm --help' for a list of options.
  --save-temps                       Save intermediate results
  --version                          Display the version number and exit
  -v                                 Print verbose information
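As an illustration of how these options combine, the sketch below assembles a direct wrapper invocation. The function name nvlink_wrapper_cmd is hypothetical, and sm_70, a.out, foo.o, ./lib and bar are placeholder values, not defaults. It only prints the command line, so it can be tried on a machine without CUDA installed.

```shell
#!/bin/sh
# Hypothetical helper: print a clang-nvlink-wrapper command line for a given
# architecture and output file. Remaining arguments (inputs, -L, -l) are
# appended unchanged.
nvlink_wrapper_cmd() {
    arch=$1
    output=$2
    shift 2
    # --dry-run asks the wrapper to print generated commands without running them.
    printf 'clang-nvlink-wrapper --arch %s --dry-run -o %s' "$arch" "$output"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

nvlink_wrapper_cmd sm_70 a.out foo.o -L./lib -lbar
# prints: clang-nvlink-wrapper --arch sm_70 --dry-run -o a.out foo.o -L./lib -lbar
```

Dropping --dry-run from the printed command would perform the link. Adding --save-temps should keep the intermediate results for inspection.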

Example

This tool is intended to be invoked when targeting the NVPTX toolchain directly as a cross-compiling target. This makes it possible to create standalone GPU executables with normal linking semantics, similar to a standard compilation.

clang --target=nvptx64-nvidia-cuda -march=native -flto=full input.c
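To see the link step the driver plans for such a build, one option is Clang's -### flag, which prints the sub-commands without running them. The following is a sketch under assumptions: sm_70 is a placeholder architecture used instead of -march=native so that no GPU is needed, and the output name input is arbitrary. On a toolchain that uses this wrapper, the printed link command is expected to invoke clang-nvlink-wrapper.

```shell
#!/bin/sh
# Write a trivial source file to compile.
cat > input.c <<'EOF'
int main(void) { return 0; }
EOF

# Assumption: sm_70 stands in for the real GPU architecture.
GPU_ARCH=${GPU_ARCH:-sm_70}

if command -v clang >/dev/null 2>&1; then
    # -### prints the driver's sub-commands, including the link step,
    # without executing them.
    clang --target=nvptx64-nvidia-cuda -march="$GPU_ARCH" -flto=full \
        input.c -o input -### \
        || echo "clang could not plan an NVPTX build on this machine" >&2
else
    echo "clang not found; skipping" >&2
fi
```

Removing -### runs the actual compile and link, which requires a working CUDA installation.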