What is Compute Capability? | GPU Glossary

Instructions in the Parallel Thread Execution (PTX) instruction set are compatible with only certain physical GPUs. The versioning system used to abstract away details of physical GPUs from the instruction set and compiler is called "Compute Capability".

Most compute capability version numbers have two components: a major version and a minor version. NVIDIA promises forward compatibility (old PTX code runs on new GPUs) across both major and minor versions, following the onion layer model.

With Hopper, NVIDIA introduced an additional version suffix, the a in 9.0a, which includes features that deviate from the onion model: their future compatibility is not guaranteed, even within major versions.

With Blackwell, NVIDIA introduced yet another version suffix, the f in 10.0f, which also deviates from the onion model and is closer to SemVer: compatibility is guaranteed across minor versions but not major versions.
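The three compatibility rules described above can be written down as a small predicate. The following is a minimal sketch, not an NVIDIA API: the names `ComputeCapability`, `parse_cc`, and `ptx_may_run_on` are hypothetical, and the `a` suffix is modeled conservatively as "exact version only", which is an assumption drawn from the statement that its future compatibility is not guaranteed.

```python
import re
from typing import NamedTuple


class ComputeCapability(NamedTuple):
    major: int
    minor: int
    suffix: str  # "" (baseline), "a", or "f"


def parse_cc(text: str) -> ComputeCapability:
    """Parse strings such as "8.6", "9.0a", or "10.0f"."""
    match = re.fullmatch(r"(\d+)\.(\d+)([af]?)", text)
    if match is None:
        raise ValueError(f"not a compute capability: {text!r}")
    return ComputeCapability(int(match[1]), int(match[2]), match[3])


def ptx_may_run_on(target: str, device: str) -> bool:
    """Return True if PTX built for `target` is expected to run on `device`.

    `device` is the plain major.minor capability of a physical GPU.
    """
    t = parse_cc(target)
    d = parse_cc(device)
    if d.suffix:
        raise ValueError("device capability should be given without a suffix")
    if t.suffix == "a":
        # Assumption: no guarantee beyond the exact version it was built for.
        return (d.major, d.minor) == (t.major, t.minor)
    if t.suffix == "f":
        # SemVer-like: same major version, same or newer minor version.
        return d.major == t.major and d.minor >= t.minor
    # Onion model: any device at the same or a newer version.
    return (d.major, d.minor) >= (t.major, t.minor)
```

For example, `ptx_may_run_on("8.0", "9.0")` is true under the onion model, while `ptx_may_run_on("10.0f", "12.0")` is false because the `f` suffix does not carry across major versions.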

Target compute capabilities for PTX compilation can be specified when invoking nvcc, the NVIDIA CUDA Compiler Driver. By default, the compiler will also generate optimized SASS for the matching Streaming Multiprocessor (SM) architecture. The documentation for nvcc refers to compute capability as a "virtual GPU architecture", in contrast to the "physical GPU architecture" expressed by the SM version.
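As an illustration of how a compute capability maps onto nvcc's virtual and physical architecture names, the sketch below builds a `-gencode` argument pair. The helper name `nvcc_gencode_args` is hypothetical. The `compute_XY` and `sm_XY` naming follows the nvcc documentation, but whether a given target is accepted depends on the installed CUDA Toolkit version.

```python
import re
from typing import List


def nvcc_gencode_args(cc: str) -> List[str]:
    """Build nvcc arguments targeting one compute capability.

    "9.0" maps to the virtual architecture compute_90 (PTX) and the
    physical architecture sm_90 (SASS); suffixes carry over, so "9.0a"
    maps to compute_90a and sm_90a.
    """
    match = re.fullmatch(r"(\d+)\.(\d)([af]?)", cc)
    if match is None:
        raise ValueError(f"not a compute capability: {cc!r}")
    arch = f"{match[1]}{match[2]}{match[3]}"
    return ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]


# Example command line (kernel.cu is a placeholder file name).
command = ["nvcc", *nvcc_gencode_args("9.0"), "-c", "kernel.cu"]
print(" ".join(command))
```

Returning the arguments as a list keeps them ready to pass to `subprocess.run` without shell quoting.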

The technical specifications for each compute capability version can be found in the Compute Capability section of the NVIDIA CUDA C Programming Guide.