1. Why CUDA Compatibility — CUDA Compatibility (original) (raw)

CUDA Compatibility

CUDA Compatibility describes the use of new CUDA toolkit components on systems with older base installations.

The NVIDIA® CUDA® Toolkit enables developers to build NVIDIA GPU accelerated compute applications for desktop computers, enterprise, and data centers to hyperscalers. It consists of the CUDA compiler toolchain including the CUDA runtime (cudart) and various CUDA libraries and tools. To build an application, a developer has to install only the CUDA Toolkit and necessary libraries required for linking.

In order to run a CUDA application, the system should have a CUDA enabled GPU and an NVIDIA driver that is compatible with the CUDA Toolkit that was used to build the application itself. If the application relies on dynamic linking for libraries, then the system should have the right version of such libraries as well.

Components of CUDA

Figure 1 Components of CUDA#

Every CUDA toolkit also ships with an NVIDIA driver package for convenience. This driver supports all the features introduced in that version of the CUDA Toolkit. Please check the toolkit and driver version mapping in the release notes. The driver package includes both the user mode CUDA driver (libcuda.so) and kernel mode components necessary to run the application.

Typically, upgrading a CUDA Toolkit involves upgrading both the toolkit and the driver to get up to date toolkit and driver capabilities.

CUDA Upgrade Path

Figure 2 CUDA Upgrade Path#

As you can see, when the NVIDIA driver is upgraded along with the CUDA Toolkit, the CUDA Driver and CUDA Runtime versions in the sample output of deviceQuery match perfectly:

$ ./deviceQuery ./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 4070 SUPER" CUDA Driver Version / Runtime Version 12.8 / 12.8 CUDA Capability Major/Minor version number: 8.9 [...] deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.8, CUDA Runtime Version = 12.8, NumDevs = 1 Result = PASS

This is not always required. CUDA Compatibility guarantees allow for upgrading only certain components:

Applications that directly rely only on the CUDA runtime can be deployed in the following scenarios:

2. Minor Version Compatibility#

2.1. CUDA 11 and Later Defaults to Minor Version Compatibility#

From CUDA 11 onwards, applications compiled with a CUDA Toolkit release from within a CUDA major release family can run, with limited feature-set, on systems having at least the minimum required driver version as indicated below. This minimum required driver can be different from the driver packaged with the CUDA Toolkit but should belong to the same major release.

Refer to the CUDA Toolkit Release Notes for the complete table.

* CUDA 11.0 was released with an earlier driver version, but by upgrading to Tesla Recommended Drivers 450.80.02 (Linux) / 452.39 (Windows) as indicated, minor version compatibility is possible across the CUDA 11.x family of toolkits.

While applications built against any of the older CUDA Toolkits always continued to function on newer drivers due to binary backward compatibility, before CUDA 11, applications built against newer CUDA Toolkit releases were not supported on older drivers without forward compatibility package.

If you are using a new CUDA 10.x minor release, then the minimum required driver version is the same as the driver that’s packaged as part of that toolkit release. Consequently, the minimum required driver version changed for every new CUDA Toolkit minor release until CUDA 11.1. Therefore, system administrators always have to upgrade drivers in order to support applications built against CUDA Toolkits from 10.x releases.

With minor version compatibility, upgrading to CUDA 11.1 is now possible on older drivers from within the same major release family such as 450.80.02 that was shipped with CUDA 11.0, as shown below:

$ nvidia-smi +-----------------------------------------------------------------------------+ | NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.0 | |-------------------------------+----------------------+----------------------+ [...]

$ ./deviceQuery ./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla T4" CUDA Driver Version / Runtime Version 11.0 / 11.1 CUDA Capability Major/Minor version number: 7.5 [...]

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.0, CUDA Runtime Version = 11.1, NumDevs = 1 Result = PASS

Minimum required driver version guidance can be found in the CUDA Toolkit Release Notes. Note that if the minimum required driver version is not installed in the system, applications will return an error as shown below.

$ ./deviceQuery ./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 3 -> initialization error Result = FAIL

2.2. Application Considerations for Minor Version Compatibility#

Developers and system administrators should note three important caveats when relying on minor version compatibility. If either of these caveats are limiting, then a new CUDA driver from the same minor version of the toolkit that the application was built with or later is required.

2.3. Deployment Considerations for Minor Version Compatibility#

Minor version compatibility has another benefit that offers flexibility in the use and deployment of libraries. Applications that use libraries that support minor version compatibility can be deployed on systems with a different version of the toolkit and libraries without recompiling the application for the difference in the library version. This holds true for both older and newer versions of the libraries provided they are all from the same major release family. Note that libraries themselves have interdependencies that should be considered. For example, each cuDNN version requires a certain version of cuBLAS.

NVRTC supports minor version compatibility from CUDA 11.3 onwards

Figure 3 NVRTC supports minor version compatibility from CUDA 11.3 onwards#

3. Forward Compatibility#

3.1. Forward Compatibility Support Across Major Toolkit Versions#

Increasingly, data centers and enterprises may not want to update the NVIDIA GPU Driver across major release versions due to the rigorous testing and validation that happens before any system level driver installations are done.

To support such scenarios, CUDA introduced a Forward Compatibility Upgrade path in CUDA 10.0.

Forward Compatibility Upgrade Path

Figure 4 Forward Compatibility Upgrade Path#

Warning

Forward Compatibility is applicable only for systems with:

It’s mainly intended to support applications built on newer CUDA Toolkits to run on systems installed with an older NVIDIA Linux GPU driver from different major release families. This new forward-compatible upgrade path requires the use of a special package called “CUDA Comaptibility Package”.

3.2. Installing the CUDA Forward Compatibility Package#

3.2.1. From Network Repositories or Local Installers#

The CUDA forward compatibility package is available in the local installers or the CUDA network repositories provided by NVIDIA as cuda-compat-<cuda_major_version>-<cuda_minor_version>.

Each NVIDIA driver is built with an accompanying CUDA supported release; and this package is generated from the NVIDIA driver that matches with the CUDA release for which the compatibility is requested. For example, the cuda-compat-12-8 package is built using the NVIDIA driver libraries version 570.

Install the package on the system using the package installer.

Ubuntu / Debian:

apt install cuda-compat-12-8

Red Hat Enterprise Linux / Rocky Linux / Oracle Linux / Amazon Linux / Fedora / KylinOS:

dnf install cuda-compat-12-8

SLES / OpenSUSE:

zypper install cuda-compat-12-8

Azure Linux:

tdnf install cuda-compat-12-8

The CUDA forward compatibility package will then be installed to the versioned toolkit directory. For example, for the CUDA forward compatibility package of 12.8, the GPU driver libraries of 570 will be installed in /usr/local/cuda-12.8/compat/.

The cuda-compat-<cuda_major_version>-<cuda_minor_version> package consists of the following files:

Note

This package only provides the libraries, and does not configure the system to find such libraries.

3.2.1.1. Example installation and configuration#

In this example, the user sets LD_LIBRARY_PATH to include the files installed by the cuda-compat-12-1 package on an Ubuntu system.

Install the package:

apt install -y cuda-compat-12-1

Selecting previously unselected package cuda-compat-12-1. (Reading database ... 339974 files and directories currently installed.) Preparing to unpack .../cuda-compat-12-0_530.30-1_amd64.deb ... Unpacking cuda-compat-12-1 (530.30-1) ... Setting up cuda-compat-12-1 (530.30-1) ... Processing triggers for libc-bin (2.31-0ubuntu9.2) ...

Check the files installed under /usr/local/cuda-12.1/compat:

$ ls -l /usr/local/cuda-12.1/compat total 145676 lrwxrwxrwx 1 root root 12 Jun 3 00:45 libcuda.so -> libcuda.so.1 lrwxrwxrwx 1 root root 17 Jun 3 00:45 libcuda.so.1 -> libcuda.so.530.30 -rw-r--r-- 1 root root 26255520 Jun 3 00:45 libcuda.so.530.30 lrwxrwxrwx 1 root root 25 Jun 3 00:45 libcudadebugger.so.1 -> libcudadebugger.so.530.30 -rw-r--r-- 1 root root 10938424 Jun 3 00:45 libcudadebugger.so.530.30 lrwxrwxrwx 1 root root 19 Jun 3 00:45 libnvidia-nvvm.so -> libnvidia-nvvm.so.4 lrwxrwxrwx 1 root root 24 Jun 3 00:45 libnvidia-nvvm.so.4 -> libnvidia-nvvm.so.530.30 -rw-r--r-- 1 root root 92017376 Jun 3 00:45 libnvidia-nvvm.so.530.30 lrwxrwxrwx 1 root root 34 Jun 3 00:45 libnvidia-ptxjitcompiler.so.1 -> libnvidia-ptxjitcompiler.so.530.30 -rw-r--r-- 1 root root 19951576 Jun 3 00:45 libnvidia-ptxjitcompiler.so.530.30

The user can set LD_LIBRARY_PATH to include the files installed before running the CUDA 12.1 application:

$ export LD_LIBRARY_PATH=/usr/local/cuda-12.1/compat $ samples/bin/x86_64/linux/release/deviceQuery samples/bin/x86_64/linux/release/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Tesla T4" CUDA Driver Version / Runtime Version 12.1 / 12.1 CUDA Capability Major/Minor version number: 9.0 [...]

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.1, CUDA Runtime Version = 12.1, NumDevs = 1 Result = PASS

3.2.2. Manually Installing from Runfile#

The CUDA forward compatibility package files can also be extracted from the appropriate datacenter driver ‘runfile’ installers (.run) available in NVIDIA driver downloads. To do this:

  1. Download the latest NVIDIA Data Center GPU driver, and extract the .run file using option -x.
  2. Copy the CUDA libraries, listed at the start of this section, into a user or root created directory.
  3. Follow your system’s guidelines for making sure that the system linker picks up the new libraries.

3.3. Deployment Considerations for Forward Compatibility#

3.3.1. Use the Right CUDA Forward Compatibility Package#

The CUDA forward compatibility package should be used only in the following situations when forward compatibility is required across major releases.

The CUDA forward compatibility package is named after the highest toolkit that it can support. If you are on the 535 driver but require 12.5 application support, please install the CUDA compatibility package for 12.5.

Note

When performing a full system upgrade, when choosing to install both the toolkit and the driver, remove any forward compatible packages present in the system.

For example, if you are upgrading the driver to 525.60.13 which is the minimum required driver version for the 12.x toolkits, then the CUDA compatibility package is not required in most cases. 11.x and 12.x applications will be supported due to backward compatibility and future 12.x applications will be supported due to minor-version compatibility.

But there are feature restrictions that may make this option less desirable for your scenario - for example: Applications requiring PTX JIT compilation support. Unlike the minor-version compatibility that is defined between CUDA runtime and CUDA driver, forward compatibility is defined between the kernel driver and the CUDA driver, and hence such restrictions do not apply. In order to circumvent the limitation, a forward compatibility package may be used in such scenarios as well.

Examples of how to read this table:

3.3.2. Feature Exceptions#

There are specific features in the CUDA driver that require kernel-mode support and will only work with a newer kernel mode driver. A few features depend on other user-mode components and are therefore also unsupported.

[1] This relies on CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR_SUPPORTED and CU_DEVICE_ATTRIBUTE_VIRTUAL_ADDRESS_MANAGEMENT_SUPPORTED, which should be queried if you intend to use the full range of this functionality.

[2] Supported on Red Hat Enterprise Linux operating system version 8.1 or higher.

3.3.3. Check for Compatibility Support#

In addition to the CUDA driver and certain compiler components, there are other drivers in the system installation stack (for example, OpenCL) that remain on the old version. The forward-compatible upgrade path is for CUDA only.

A well-written application should use following error codes to determine if a CUDA Forward Compatible Upgrade is supported. System administrators should be aware of these error codes to determine if there are errors in the deployment.

3.4. Deployment Model for Forward Compatibility#

There are two models of deployment for the CUDA compatibility package. We recommend the use of the ‘shared’ deployment mode.

4. Conclusion#

The CUDA driver maintains backward compatibility to continue support of applications built on older toolkits. Using a compatible minor driver version, applications build on CUDA Toolkit 11 and newer are supported on any driver from within the corresponding major release. Using the CUDA Forward Compatibility package, system administrators can run applications built using a newer toolkit even when an older driver that does not satisfy the minimum required driver version is installed on the system. This forward compatibility allows the CUDA deployments in data centers and enterprises to benefit from the faster release cadence and the latest features and performance of CUDA Toolkit.

CUDA compatibility helps users by:

5. Frequently Asked Questions#

This section includes some FAQs related to CUDA compatibility.

6. Notices#

This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality.

NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice.

Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete.

NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.

NVIDIA products are not designed, authorized, or warranted to be suitable for use in medical, military, aircraft, space, or life support equipment, nor in applications where failure or malfunction of the NVIDIA product can reasonably be expected to result in personal injury, death, or property or environmental damage. NVIDIA accepts no liability for inclusion and/or use of NVIDIA products in such equipment or applications and therefore such inclusion and/or use is at customer’s own risk.

NVIDIA makes no representation or warranty that products based on this document will be suitable for any specified use. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to evaluate and determine the applicability of any information contained in this document, ensure the product is suitable and fit for the application planned by customer, and perform the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this document. NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.

No license, either expressed or implied, is granted under any NVIDIA patent right, copyright, or other NVIDIA intellectual property right under this document. Information published by NVIDIA regarding third-party products or services does not constitute a license from NVIDIA to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual property rights of the third party, or a license from NVIDIA under the patents or other intellectual property rights of NVIDIA.

Reproduction of information in this document is permissible only if approved in advance by NVIDIA in writing, reproduced without alteration and in full compliance with all applicable export laws and regulations, and accompanied by all associated conditions, limitations, and notices.

THIS DOCUMENT AND ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF ANY USE OF THIS DOCUMENT, EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the products described herein shall be limited in accordance with the Terms of Sale for the product.

6.1. Trademarks#

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.