1. Why CUDA Compatibility — CUDA Compatibility r555 documentation (original) (raw)

CUDA Compatibility

CUDA Compatibility describes the use of new CUDA toolkit components on systems with older base installations.

The NVIDIA® CUDA® Toolkit enables developers to build NVIDIA GPU accelerated compute applications for desktop computers, enterprise, and data centers to hyperscalers. It consists of the CUDA compiler toolchain including the CUDA runtime (cudart) and various CUDA libraries and tools. To build an application, a developer has to install only the CUDA Toolkit and necessary libraries required for linking.

In order to run a CUDA application, the system should have a CUDA enabled GPU and an NVIDIA display driver that is compatible with the CUDA Toolkit that was used to build the application itself. If the application relies on dynamic linking for libraries, then the system should have the right version of such libraries as well.

Components of CUDA

Figure 1 Components of CUDA

Every CUDA toolkit also ships with an NVIDIA display driver package for convenience. This driver supports all the features introduced in that version of the CUDA Toolkit. Please check the toolkit and driver version mapping in the release notes. The driver package includes both the user mode CUDA driver (libcuda.so) and kernel mode components necessary to run the application.

Typically, upgrading a CUDA Toolkit involves upgrading both the toolkit and the driver to get the bleeding edge toolkit and driver capabilities.

CUDA Upgrade Path

Figure 2 CUDA Upgrade Path

But this is not always required. CUDA Compatibility guarantees allow for upgrading only certain components and that will be the focus of the rest of this document. We will see how the upgrade to a new CUDA Toolkit can be simplified to not always require a full system upgrade.

2. Minor Version Compatibility

2.1. CUDA 11 and Later Defaults to Minor Version Compatibility

From CUDA 11 onwards, applications compiled with a CUDA Toolkit release from within a CUDA major release family can run, with limited feature-set, on systems having at least the minimum required driver version as indicated below. This minimum required driver can be different from the driver packaged with the CUDA Toolkit but should belong to the same major release.

Refer to the CUDA Toolkit Release Notes for the complete table.

Table 1 Example CUDA Toolkit 11.x and 12.x Minimum Required Driver Versions (Refer to CUDA Release Notes)

CUDA Toolkit Linux x86_64 Minimum Required Driver Version Windows Minimum Required Driver Version
CUDA 12.x >=525.60.13 >=527.41
CUDA 11.x >= 450.80.02* >=452.39*

While applications built against any of the older CUDA Toolkits always continued to function on newer drivers due to binary backward compatibility, before CUDA 11, applications built against newer CUDA Toolkit releases were not supported on older drivers without forward compatibility package.

If you are using a new CUDA 10.x minor release, then the minimum required driver version is the same as the driver that’s packaged as part of that toolkit release. Consequently, the minimum required driver version changed for every new CUDA Toolkit minor release until CUDA 11.1. Therefore, system administrators always have to upgrade drivers in order to support applications built against CUDA Toolkits from 10.x releases.

Table 2 CUDA Toolkit 10.x Minimum Required Driver Versions

CUDA Toolkit Linux x86_64 Minimum Required Driver Version Windows Minimum Required Driver Version
CUDA 10.2 >= 440.33 >=441.22
CUDA 10.1 >= 418.39 >=418.96
CUDA 10.0 >= 410.48 >=411.31

With minor version compatibility, upgrading to CUDA 11.1 is now possible on older drivers from within the same major release family such as 450.80.02 that was shipped with CUDA 11.0, as shown below:

Minimum required driver version guidance can be found in the CUDA Toolkit Release Notes. Note that if the minimum required driver version is not installed in the system, applications will return an error as shown below.

2.2. Application Considerations for Minor Version Compatibility

Developers and system admins should note two important caveats when relying on minor version compatibility. If either of these caveats are limiting, then a new CUDA driver from the same minor version of the toolkit that the application was built with or later is required.

2.3. Deployment Considerations for Minor Version Compatibility

As described, applications that directly rely only on the CUDA runtime can be deployed in the following two scenarios:

CUDA driver that’s installed on the system is newer than the runtime. CUDA runtime is newer than the CUDA driver on the system but they are from the same major release of CUDA Toolkit.

In scenario 2, system admins should be aware of the aforementioned limitations and should be able to tell why an application may be failing if they run into any issues.

Minor version compatibility has another benefit that offers flexibility in the use and deployment of libraries. Applications that use libraries that support minor version compatibility can be deployed on systems with a different version of the toolkit and libraries without recompiling the application for the difference in the library version. This holds true for both older and newer versions of the libraries provided they are all from the same major release family. Note that libraries themselves have interdependencies that should be considered. For example, each cuDNN version requires a certain version of cuBLAS.

NVRTC supports minor version compatibility from CUDA 11.3 onwards

Figure 3 NVRTC supports minor version compatibility from CUDA 11.3 onwards

However, if an application is unable to leverage the minor version compatibility due to any of the aforementioned reasons, then the Forward Compatibility model can be used as an alternative even though Forward Compatibility is mainly intended for compatibility across major toolkit versions.

3. Forward Compatibility

3.1. Forward Compatibility Support Across Major Toolkit Versions

Increasingly, data centers and enterprises may not want to update the NVIDIA GPU Driver across major release versions due to the rigorous testing and validation that happens before any system level driver installations are done.

To support such scenarios, CUDA introduced a Forward Compatibility Upgrade path in CUDA 10.0.

Forward Compatibility Upgrade Path

Figure 4 Forward Compatibility Upgrade Path

Forward Compatibility is applicable only for systems with NVIDIA Data Center GPUs or select NGC Server Ready SKUs of RTX cards. It’s mainly intended to support applications built on newer CUDA Toolkits to run on systems installed with an older NVIDIA Linux GPU driver from different major release families. This new forward-compatible upgrade path requires the use of a special package called “CUDA compat package”.

3.2. Installing the Forward Compatibility Package

3.2.1. From Network Repositories or Local Installers

The CUDA compat package is available in the local installers or the CUDA network repositories provided by NVIDIA as cuda-compat-12.4.

Install the package on the system using the package installer.

On Ubuntu, for example:

The compat package will then be installed to the versioned toolkit location typically found in the toolkit directory. For example, for 12.5 it will be found in /usr/local/cuda-12.5/.

The cuda-compat package consists of the following files:

These files should be kept together as the CUDA driver is dependent on the libnvidia-ptxjitcompiler.so.* of the same version.

Note

This package only provides the files, and does not configure the system.

Example:

CUDA Compatibility is installed and the application can now run successfully as shown below. In this example, the user sets LD_LIBRARY_PATH to include the files installed by the cuda-compat-12-1 package.

Check the files installed under /usr/local/cuda/compat:

The user can set LD_LIBRARY_PATH to include the files installed before running the CUDA 12.1 application:

3.2.2. Manually Installing from Runfile

The cuda-compat package files can also be extracted from the appropriate datacenter driver ‘runfile’ installers (.run) available in NVIDIA driver downloads. To do this:

  1. Download the latest NVIDIA Data Center GPU driver , and extract the .run file using option -x.
  2. Copy the four CUDA compatibility upgrade files, listed at the start of this section, into a user- or root-created directory.
  3. Follow your system’s guidelines for making sure that the system linker picks up the new libraries.

Note

Symlinks under /usr/local/cuda/compat need to be created manually when using the runfile installer.

3.3. Deployment Considerations for Forward Compatibility

3.3.1. Use the Right Compat Package

CUDA forward compat packages should be used only in the following situations when forward compatibility is required across major releases.

The CUDA compat package is named after the highest toolkit that it can support. If you are on the R470 driver but require 12.5 application support, please install the cuda-compat package for 12.5. But when performing a full system upgrade, when choosing to install both the toolkit and the driver, remove any forward compatible packages present in the system.

For example, if you are upgrading the driver to 525.60.13 which is the minimum required driver version for the 12.x toolkits, then the cuda-compat package is not required in most cases. 11.x and 12.x applications will be supported due to backward compatibility and future 12.x applications will be supported due to minor-version compatibility.

But there are feature restrictions that may make this option less desirable for your scenario - for example: Applications requiring PTX JIT compilation support. Unlike the minor-version compatibility that is defined between CUDA runtime and CUDA driver, forward compatibility is defined between the kernel driver and the CUDA driver, and hence such restrictions do not apply. In order to circumvent the limitation, a forward compatibility package may be used in such scenarios as well.

Table 3 CUDA Application Compatibility Support Matrix

NVIDIA Kernel Mode Driver - Production Branch
CUDA Forward Compatible Upgrade 470.57.02+ (CUDA 11.4) 530.30.02+ (CUDA 12.1) 535.54.03+ (CUDA 12.2) 545.23.06+ (CUDA 12.3) 550.54.14+ (CUDA 12.4) 555.42.02+ (CUDA 12.5) 560.28.03+ (CUDA 12.6)
12-6 C X C X C X Not required
12-5 C C C X
12-4 C C Not required X
12-3 C C X X
12-2 C Not required X X
12-1 C X X X
12-0 C X X X
11-8 C X X X
11-7 C X X X
11-6 C X X X
11-5 C X X X
11-4 Not required X X X

Examples of how to read this table:

3.3.2. Feature Exceptions

There are specific features in the CUDA driver that require kernel-mode support and will only work with a newer kernel mode driver. A few features depend on other user-mode components and are therefore also unsupported.

Table 4 Forward-Compatible Feature-Driver Support Matrix

CUDA Forward Compatible Upgrade CUDA - OpenGL/Vulkan Interop cuMemMap* set of functionalities
System Base Installation: 525 (>=.60.04) Driver
12-x No Yes [1]
System Base Installation: 450 (>=.80.02) Driver
11-x No Yes [1]

[1] This relies on CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR_SUPPORTED and CU_DEVICE_ATTRIBUTE_VIRTUAL_ADDRESS_MANAGEMENT_SUPPORTED, which should be queried if you intend to use the full range of this functionality.

[2] Supported on Red Hat Enterprise Linux operating system version 8.1 or higher.

3.3.3. Check for Compatibility Support

In addition to the CUDA driver and certain compiler components, there are other drivers in the system installation stack (for example, OpenCL) that remain on the old version. The forward-compatible upgrade path is for CUDA only.

A well-written application should use following error codes to determine if CUDA Forward Compatible Upgrade is supported. System administrators should be aware of these error codes to determine if there are errors in the deployment.

3.4. Deployment Model for Forward Compatibility

There are two models of deployment for the CUDA compat package. We recommend the use of the ‘shared’ deployment mode.

4. Conclusion

The CUDA driver maintains backward compatibility to continue support of applications built on older toolkits. Using a compatible minor driver version, applications build on CUDA Toolkit 11 and newer are supported on any driver from within the corresponding major release. Using the CUDA Forward Compatibility package, system administrators can run applications built using a newer toolkit even when an older driver that does not satisfy the minimum required driver version is installed on the system. This forward compatibility allows the CUDA deployments in data centers and enterprises to benefit from the faster release cadence and the latest features and performance of CUDA Toolkit.

CUDA compatibility helps users by:

5. Frequently Asked Questions

This section includes some FAQs related to CUDA compatibility.

6. Notices

This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality.

NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice.

Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete.

NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document.

NVIDIA products are not designed, authorized, or warranted to be suitable for use in medical, military, aircraft, space, or life support equipment, nor in applications where failure or malfunction of the NVIDIA product can reasonably be expected to result in personal injury, death, or property or environmental damage. NVIDIA accepts no liability for inclusion and/or use of NVIDIA products in such equipment or applications and therefore such inclusion and/or use is at customer’s own risk.

NVIDIA makes no representation or warranty that products based on this document will be suitable for any specified use. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to evaluate and determine the applicability of any information contained in this document, ensure the product is suitable and fit for the application planned by customer, and perform the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this document. NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.

No license, either expressed or implied, is granted under any NVIDIA patent right, copyright, or other NVIDIA intellectual property right under this document. Information published by NVIDIA regarding third-party products or services does not constitute a license from NVIDIA to use such products or services or a warranty or endorsement thereof. Use of such information may require a license from a third party under the patents or other intellectual property rights of the third party, or a license from NVIDIA under the patents or other intellectual property rights of NVIDIA.

Reproduction of information in this document is permissible only if approved in advance by NVIDIA in writing, reproduced without alteration and in full compliance with all applicable export laws and regulations, and accompanied by all associated conditions, limitations, and notices.

THIS DOCUMENT AND ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF ANY USE OF THIS DOCUMENT, EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the products described herein shall be limited in accordance with the Terms of Sale for the product.

6.1. Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.