CUDA Compilers

In general, check the crt/host_config.h file to find out which versions are supported. Sometimes it is possible to hack the requirements there to get some newer versions working, too :)

The Thrust version can be found in $CUDA_ROOT/include/thrust/version.h.
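As a rough sketch, the packed THRUST_VERSION number from that header can be split into its parts like this. The encoding (major * 100000 + minor * 100 + subminor) follows the macros in thrust/version.h; `ThrustVersion`, `decode_thrust_version` and `print_thrust_version` are made-up helper names, and the header is only included if it happens to be on the include path.

```cpp
#include <cstdio>

// Pull in the real header only when it is available (assumption: a CUDA
// toolkit include directory was passed with -I; otherwise this is skipped).
#if defined(__has_include)
#  if __has_include(<thrust/version.h>)
#    include <thrust/version.h>
#  endif
#endif

// Hypothetical helper type, not part of Thrust.
struct ThrustVersion {
    int major_version;
    int minor_version;
    int subminor_version;
};

// Split a THRUST_VERSION-style integer, e.g. 100907 -> 1.9.7.
constexpr ThrustVersion decode_thrust_version(int v) {
    return ThrustVersion{v / 100000, (v / 100) % 1000, v % 100};
}

// Print the version of the Thrust headers found at compile time, if any.
inline void print_thrust_version() {
#ifdef THRUST_VERSION
    constexpr ThrustVersion tv = decode_thrust_version(THRUST_VERSION);
    std::printf("Thrust %d.%d.%d\n",
                tv.major_version, tv.minor_version, tv.subminor_version);
#else
    std::puts("thrust/version.h not found on the include path");
#endif
}
```

Calling `print_thrust_version()` from a file compiled with `-I$CUDA_ROOT/include` should report the version that belongs in the thrust column of the table below.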

Download Archives: https://developer.nvidia.com/cuda-toolkit-archive

Release notes for CUDA Toolkit (CTK):

Version notes Nvidia HPC SDK:

Compatibility Guarantees

Quote:

nvcc

Latest official compiler requirements: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

CUDA version SM Arch g++ icpc pgc++ xlC MSVC clang++ Linux driver thrust note
1.0 1.0-1.1 ? ? ?
1.1 1.0-1.1 ? ? ?
2.0 1.0-1.1 ? ? ?
2.1 1.0-1.3 ? ? ?
2.3.1 1.0-1.3 ? ? ?
3.0 1.0-2.0 ? ? ?
3.1 1.0-2.0 ? ? ?
3.2 1.0-2.1 ? 11.1 ?
4.0 1.0-2.1 <=4.4 11.1 ?
4.1 1.0-2.1 <=4.5 11.1 ?
4.2 1.0-2.1 <=4.6 11.1 ?
5.0 1.0-3.? <=4.6 11.1 ? ? 1.5.3
5.5 1.0-3.? <=4.8 12.1 ? ? 1.7.0 C++11 on host side supported; ICC fixed to build 20110811
6.0 1.0-5.0 <=4.8 13.1 ? 331.62 1.7.1
6.5 1.1-5.X <=4.8 14.0 ? ? ? 1.7.2 experimental device-side C++11 support; up to and including this version, <thrust/sort.h> screws up __CUDA_ARCH__ (must be undefined on host); SM 1.1-1.3 deprecated (1.0 removed)
7.0.17 (RC) see below <=4.9 15.0 >=14.9 13.1.1 ? 346.29 1.8.0 first official PGI support, first xlC string found; powerpc64 with little endian supported
7.0.27 2.0-5.X <=4.9 15.0 >=14.9 13.1.1 2010-13 346.46 1.8.1 official C++11 support on device side
7.5 <=4.9 15.0 15.4 13.1 2010-13 3.5-3.6 352.41? 1.8.2 clang (host) on linux supported, __CUDACC_VER__ macros added
7.5.18 2.0-5.X <=4.9 15.0 15.4 13.1 2010-13 352.39 1.8.2
8.0.44 2.0-6.X <=5.3 15.0(.4)-16.0 16(.3)+ 13.1(.2) 2012-15 3.8-3.9 367.48 1.8.3-patch2 sm_60 (pascal) support added
8.0.61 2.0-6.X <=5.3 15.0(.4)-17.0 16(.3)+ 13.1(.2) 2012-15 3.8-3.9 375.26 1.8.3-patch2 nvcc 8 is incompatible with std::tuple in gcc 5.4+
9.0.69 (RC) 3.0-7.0 <=5.5 (<=6) 15.0(.4)-17.0 17 13.1(.2) 2012-17 3.8-3.9 ???.?? 1.9.0-patch4 device-side C++14; __CUDACC_VER__ deprecated for __CUDACC_VER_MAJOR/MINOR/BUILD__
9.0.103 (RC) 3.0-7.0 <=5.5 (<=6) 15.0(.4)-17.0 17 13.1(.2) 2012-17 3.8-3.9 384.59 1.9.0-patch4 same as above, __CUDACC_VER__ defined as #error rendering it fully broken
9.0.176 3.0-7.0 <=5.5 (<=6) (15.0-)17.0 17.1 13.1(.5) 2012-17 (3.8-)3.9 384.81 1.9.0-patch5 same as above
9.1.85 3.0-7.2 <=5.5 (<=6) (15.0-)17.0 17.X 13.1(.6) 2012-17 (3.8-)4.0 390.46 1.9.1-patch2 math_functions.hpp moved to crt/
9.1.85.1 cuBLAS 9.1.128: Volta GEMM kernels optimized
9.1.85.2 ptxas: fix address calculations using large immediate operands
9.1.85.3 cuBLAS: fixes to GEMM optimizations for convolutional sequence to sequence (seq2seq) models.
9.0-9.1 nvcc 9.0-9.1 is incompatible with std::tuple in gcc 6+
9.2.88 3.0-7.2 <=7.3.0 (<=7) (15.0-)17.0 17-18.X 13.1(.6),16.1 2012-17 (3.8-)5.0 396.26 1.9.2 CUTLASS 1.0 added; std::tuple fixed (prior GCC 6 issues)
9.2.148 396.37 1.9.2
10.0.130 3.0-7.5 <=7 (15.0-)18.0 17-18.X 13.1, 16.1 2013-17 (3.8-)6.0 410.48 1.9.3 CUDA Forward Compatible Upgrade
10.1.105 3.0-7.5 <=8 (15.0-)19.0 17-19.X 2013-19 (3.8-)7.0 418.39 1.9.4
10.1.168 (3.8-)8.0 418.67 10.1 "Update 1"
10.1.243 418.87 10.1 "Update 2"
10.2.89 3.0-7.5 <=8 (15.0-)19.0 18-19.X 13.1, 16.1 2015-19 (3.3-)8.X 440.33.01 1.9.7 sm_30,35,37,50 deprecated; nvcc: -allow-unsupported-compiler
11.0.1 (RC) NVCC:11.0.167 3.5-8.0 (5-)6-9.* (15.0-)19.1 18-20.1 13.1, 16.1 2015-19 3.2-9.0.0 450.36.06 1.9.9 macOS dropped; libs drop pre-C++11, deprecate pre-C++14 (GCC < 5, Clang < 6, and MSVC < 2017); Arm C/C++ 19.2 support
11.0.2-1 NVCC:11.0.194 (3.3/)6-9.0.0 450.51.05 nvcc: --Wext-lambda-captures-this
11.0.3 NVCC:11.0.221 ? ? ? ? ? ? ? 450.51.06 ? 11.0 "Update 1"; nvcc: --forward-unknown-to-host-compiler, --forward-unknown-to-host-linker flags
11.1.0 NVCC:11.1.74 3.5-8.6 (5-)6-10.0 (15.0-)19.1 18-20.1 13.1, 16.1 2017-19 (3.3/)6-10.X 455.23.05 1.9.10-1 Ubuntu@ppc64le deprecated; CUDA Enhanced Compatibility
11.1.1 NVCC:11.1.? ? ? ?
11.2.0 NVCC:11.2.67 <12 460.27.04 1.10.0
11.2.1 NVCC:11.2.142 460.32.03 ? "Update 1"
11.2.2 NVCC:11.2.152 460.32.03 ? "Update 2"
11.3.0 NVCC:11.3.58 6.0-10.X 465.19.01 ? cu++filt added, Python Driver/RT bindings, alloca()
11.4.0 NVCC:11.4.48 6.0-11.X <13 470.42.01 ? sm30,32 and Ubuntu 16.04 dropped, C++11 stdlib for math
11.4.1 NVCC:11.4.100 6.0-11.X 470.57.02 ? 11.4 "Update 1", fix g++ 10 issues with chrono headers of libstdc++; Ubuntu 16.04 dropped (x86)
11.4.2 NVCC:11.4.120 3.2-12.X 470.57.02 ? ...
11.5.0 NVCC:11.5.50 6.0-11.X 3.2-12.X 495.29.05 ? ...
11.5.1 NVCC:11.5.119
11.6.0 NVCC:11.6.55 6.0-11.X adds VS2022 3.2-13.X 510.39.01 ? adds -arch=native and PTX generation in nvlink (for LTO workflows with PTX)
11.6.1 NVCC:11.6.112 510.47.03 ?
11.6.2 NVCC:11.6.124 510.47.03 ?
11.7.0 NVCC:11.7.64 ? ? ? ? 515.43.04 ?
11.7.1 NVCC:11.7.99 515.65.01 ?
11.8.0 NVCC:11.8.89 6.0-11.2.1 520.61.05 ?
12.0.0 NVCC:12.0.76 4.0-9.0 6.0-12.1 (12.2.1) 2021.6 22.7 16.1.x -VS2022 17.4 -14.X 525.60.13 2.0.1 C++20 support, Hopper and Lovelace, JIT LTO (nvJitLink lib), NVVM IR 2.0, CUDA-MEMCHECK -> Compute Sanitizer, sm_35/37 dropped in all libs, 32-bit compilation support dropped
12.3.0 NVCC:12.0.76 545.23.06 2.2.0

SM: the range of supported SM architectures (compute capabilities).

pgc++: now NVHPC products, e.g., nvc/nvfortran/nvc++.

Note: empty cells generally mean "same as above" for readability.
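For scripting around the table, a few of the g++ limits can be put into a small lookup. This is only a sketch: the rows are copied from the g++ column above for releases where the upper bound is unambiguous (9.2 through 11.1), and `GccLimit`, `max_gcc_major` and `gcc_accepted` are made-up names. Anything not listed returns -1, so check crt/host_config.h for the real answer.

```cpp
// Hypothetical lookup of the newest g++ major version per CUDA release,
// taken from the g++ column of the table above (9.2 .. 11.1 only).
struct GccLimit {
    int cuda_major;
    int cuda_minor;
    int max_gcc_major;
};

constexpr GccLimit kGccLimits[] = {
    {9, 2, 7},    // 9.2:  <=7
    {10, 0, 7},   // 10.0: <=7
    {10, 1, 8},   // 10.1: <=8
    {10, 2, 8},   // 10.2: <=8
    {11, 0, 9},   // 11.0: 6-9.*
    {11, 1, 10},  // 11.1: 6-10.0
};

constexpr int kNumGccLimits = sizeof(kGccLimits) / sizeof(kGccLimits[0]);

// Returns the newest listed g++ major version, or -1 if the release
// is not in the sketch table.
constexpr int max_gcc_major(int cuda_major, int cuda_minor, int i = 0) {
    return i == kNumGccLimits
               ? -1
               : (kGccLimits[i].cuda_major == cuda_major &&
                  kGccLimits[i].cuda_minor == cuda_minor)
                     ? kGccLimits[i].max_gcc_major
                     : max_gcc_major(cuda_major, cuda_minor, i + 1);
}

// True only if the release is listed and the given g++ major is not too new.
constexpr bool gcc_accepted(int cuda_major, int cuda_minor, int gcc_major) {
    return max_gcc_major(cuda_major, cuda_minor) >= gcc_major;
}
```

The lookup only covers the upper bound; lower bounds such as the "(5-)6" entries for 11.x are not modelled here.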

macOS: As of 7.0, clang seems to be the only supported compiler on macOS (but no version check found). CUDA 10.1.243 adds support for Xcode 10.2. CUDA 11.0 dropped macOS support.

Compilers such as pgc++, icpc, and xlC are only supported on Linux, on x86 or little-endian POWER.

Dynamic parallelism was added with sm_35 and CUDA 5.0.

Newer CUDA releases have a per-release support matrix for compilers, which also lists supported kernel and glibc versions: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements
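Since `__CUDACC_VER__` is deprecated as of 9.0 (and an #error in 9.0.103), version checks in code are better written against `__CUDACC_VER_MAJOR__` and `__CUDACC_VER_MINOR__`. A minimal sketch, assuming those two macros are only defined when nvcc drives the compilation; the `EXAMPLE_NVCC_*` macros and the `version_at_least` and `nvcc_at_least` helpers are made-up names.

```cpp
// Fall back to 0.0 when the file is not compiled by nvcc, so the
// helpers below still compile with a plain host compiler.
#if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
#  define EXAMPLE_NVCC_MAJOR __CUDACC_VER_MAJOR__
#  define EXAMPLE_NVCC_MINOR __CUDACC_VER_MINOR__
#else
#  define EXAMPLE_NVCC_MAJOR 0
#  define EXAMPLE_NVCC_MINOR 0
#endif

// Generic "have >= want" comparison on (major, minor) pairs.
constexpr bool version_at_least(int have_major, int have_minor,
                                int want_major, int want_minor) {
    return have_major > want_major ||
           (have_major == want_major && have_minor >= want_minor);
}

// True if this translation unit is compiled by nvcc >= major.minor.
constexpr bool nvcc_at_least(int major, int minor) {
    return version_at_least(EXAMPLE_NVCC_MAJOR, EXAMPLE_NVCC_MINOR,
                            major, minor);
}
```

For example, `nvcc_at_least(9, 2)` could guard code that relies on the std::tuple fix noted for 9.2. Keep in mind that from 11.0.1 on the nvcc version and the toolkit version are no longer the same number.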

clang++ -x cuda

clang++ can compile CUDA C++ to ptx as well. Give it a whirl!
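If the same source has to build with both front ends, it helps to know which one is running. The sketch below follows the macro behaviour described in the LLVM CUDA docs linked further down: both compilers define `__CUDACC__`, nvcc additionally defines `__NVCC__`, and clang defines `__CUDA__` when it compiles CUDA itself. `CudaFrontend`, `cuda_frontend` and `cuda_frontend_name` are made-up names.

```cpp
// Hypothetical classification of whoever is compiling this file.
enum class CudaFrontend { None, Nvcc, Clang, Other };

constexpr CudaFrontend cuda_frontend() {
#if defined(__NVCC__)
    // Checked first: nvcc may use clang as its host compiler.
    return CudaFrontend::Nvcc;
#elif defined(__clang__) && defined(__CUDA__)
    // clang compiling CUDA on its own (clang++ -x cuda or a .cu file).
    return CudaFrontend::Clang;
#elif defined(__CUDACC__)
    // Some other CUDA-aware compiler.
    return CudaFrontend::Other;
#else
    // Plain C++ compilation, no CUDA involved.
    return CudaFrontend::None;
#endif
}

// Human-readable name, e.g. for a startup log line.
inline const char* cuda_frontend_name() {
    switch (cuda_frontend()) {
        case CudaFrontend::Nvcc:  return "nvcc";
        case CudaFrontend::Clang: return "clang";
        case CudaFrontend::Other: return "other CUDA compiler";
        default:                  return "host-only";
    }
}
```

`__CUDA_ARCH__` is a separate question: it is only defined during the device compilation pass, for both compilers.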

clang++   supported CUDA release   supported SMs
3.9-5.0   7.0-8.0                  2.0-(5.0)6.0
6.0       7.0-9.0                  (2.0)3.0-7.0
7.0       7.0-9.2                  (2.0)3.0-7.2
8.0       7.0-10.0                 (2.0)3.0-7.5
9.0       7.0-10.1                 (2.0)3.0-7.5
10.0      7.0-10.1                 (2.0)3.0-7.5
11.0      7.0-11.0                 (2.0)3.0-8.0
12.0      7.0-11.0                 (2.0)3.0-8.0
13.0      7.0-11.2                 (2.0)3.0-8.6
14.0      7.0-11.5                 (2.0)3.0-8.6
15.0      7.0-11.5                 (2.0)3.0-8.6
16.0      7.0-11.8                 (2.0)3.0-9.0
main      7.0-12.1                 (2.0)3.5-9.0

https://llvm.org/docs/CompileCudaWithLLVM.html

Device-Side C++ Standard Support

C++ core language features:

compiler         supported C++ standard    notes
nvcc <=6.0       c++03
nvcc 6.5         c++03, exp. c++11         undocumented
nvcc 7.0-8.0     c++03,11                  only c++11 switch
nvcc 9.0-10.2    c++03,11,14               10.2 adds libcu++ (atomics); open repository: https://github.com/NVIDIA/libcudacxx/releases
nvcc 11.0.167+   c++03,11,14,17            C++11 host compiler needed for math libs; ships C++11-compatible backport of the C++20 synchronization library; device LTO added; starting with CUDA Toolkit 11.0.1, nvcc and CUDA Toolkit versions are not equivalent anymore
nvcc 12.0+       c++03,11,14,17,20
clang 5+         c++03,11,14,17
clang 6+         c++03,11,14,17,2a
clang 10+        c++03,11,14,17,20
clang 13+        c++03,11,14,17,20,2b
clang trunk      c++03,11,14,17,20,2b      status
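Which of these standards a given build actually uses can be checked through the standard `__cplusplus` macro. A small sketch with a made-up helper `cxx_standard`; the threshold values are the ones fixed by the C++ standards. Note that MSVC reports 199711L unless `/Zc:__cplusplus` is passed.

```cpp
// Map a __cplusplus value to a short standard number:
// 3 (C++98/03), 11, 14, 17, or 20 (C++20 or newer drafts).
constexpr int cxx_standard(long v) {
    return v >= 202002L ? 20
         : v >= 201703L ? 17
         : v >= 201402L ? 14
         : v >= 201103L ? 11
         : 3;
}

// Standard selected for this translation unit (e.g. via -std=c++17).
constexpr int kActiveCxxStandard = cxx_standard(__cplusplus);

// Example guard: this file needs at least C++11.
static_assert(kActiveCxxStandard >= 11, "this file needs at least C++11");
```

The same check works in host and device code, since the standard level is set per compilation, not per side.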

CUDA-enabled C++ standard library libcu++, based on LLVM's libc++ (docs):

introduced components notes
CUDA 10.2 <atomic> (SM6.0+), <type_traits> introduction of libcu++
CUDA 11.0 atomic::wait/notify, , , <counting_semaphore>(SM7.0+), , , w/o function anticipated with GTC 2020 slides
CUDA 11.2 cuda::std::tuple,pair notes
CUDA 12.0 cuda::std::barrier
CUDA next cuda::std::complex, backports: chrono, type_traits notes
newer see the release notes and api docs all open source now

Incremental libcu++ release goals (GTC 2020):

NVC++

NVC++ is a unified C++ compiler and GPU-accelerated STL for the CUDA platform. It also seems to support OpenACC. NVC++ currently does not support the CUDA C++ language.

supported C++ standard notes
nvc++ 11.0 ...,c++17 initial release, ships C++11-compatible backport of the C++20 synchronization library

All GPU compilers are cheese.