GitHub - NVIDIA/NVPLSamples: NVIDIA Performance Libraries: Sample code
NVPL Samples
The NVIDIA Performance Libraries (NVPL) are a collection of high performance mathematical libraries optimized for the NVIDIA Grace Armv9.0-A Neoverse-V2 architecture.
These CPU-only libraries have no dependencies on CUDA or CTK, and are drop-in replacements for standard C and Fortran mathematical APIs, allowing HPC applications to achieve maximum performance on the Grace platform.
The provided sample codes show how to call and link to NVPL Libraries in Fortran, C, and C++ applications and libraries. Most examples use CMake, but are easily modified for use in custom build environments.
Installation
- NVPL Downloads
- Latest release: NVPL-25.1
Library Samples
Samples are compatible with the latest NVPL release. Compatibility with older releases is not guaranteed.
- NVPL BLAS Samples
- NVPL FFT Samples
- NVPL LAPACK Samples
- NVPL RAND Samples
- NVPL ScaLAPACK Samples
- NVPL Sparse Samples
- NVPL Tensor Samples
Support
Systems
- Architecture: aarch64-linux
- Platform: Arm SBSA
- CPUs Supported
  - NVIDIA Grace (Armv9.0-A Neoverse-V2)
  - AWS Graviton 4 (Armv9.0-A Neoverse-V2)
  - AWS Graviton 3/3e (Armv8.4-A Neoverse-V1)
  - AWS Graviton 2 (Armv8.2-A Neoverse-N1)
  - Ampere Altra (Armv8.2-A Neoverse-N1)
  - Any CPU with an Armv8.1-A or later architecture
- OS (Linux)
  - Ubuntu: 20.04, 22.04, 24.04, 24.10
  - Debian: 12
  - RHEL: RHEL8, RHEL9
  - Fedora: 39, 40, 41
  - SLES: SLES15 (15.6)
  - OpenSUSE/leap: 15.6
  - AmazonLinux: 2, 2023
  - Generally any Linux OS with support for aarch64
Compilers
- GCC-8 - GCC-14+
- Clang-14 - Clang-19+
- Clang for NVIDIA Grace: 16.x, 17.x, 18.x, 19.x
- NVIDIA HPC Compilers: 23.9 - 24.11
Languages
- C: All libraries
- C++: All libraries via C interfaces
- Fortran: Selected libraries
  - GFortran ABI
  - NVPL BLAS, LAPACK, and ScaLAPACK provide lp64 and ilp64 integer ABIs
  - NVPL FFT provides FFTW Fortran '77 and '03 compatible interfaces
  - See the individual library samples documentation for further details
OpenMP
All libraries support the following OpenMP runtime libraries. See the individual library documentation for details and for API extensions supporting nested parallelism.
- GCC: libgomp.so
- Clang: libomp.so
- NVHPC: libnvomp.so
MPI
NVPL provides standard BLACS interfaces for the following MPI distributions. See the NVPL ScaLAPACK Samples Documentation for details.
- MPICH: Runtime support for >=mpich-4.0
- OpenMPI-3.x
- OpenMPI-4.x
- OpenMPI-5.x
- NVIDIA HPC-X: Use openmpi4 BLACS interface
CMake Usage
NVPL provides CMake package config files for each component library.
Finding NVPL Packages
If NVPL was installed via the OS package manager under the /usr directory, the NVPL packages will already be on the default CMAKE_PREFIX_PATH. The nvpl_ROOT environment variable can be used to override the default search path and force finding nvpl under a specific prefix.
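As a rough sketch of the search-path override described above, a project could fall back to a fixed prefix when the user has not set nvpl_ROOT. The prefix /opt/nvpl, the helper variable NVPL_EXAMPLE_PREFIX, and the project name are hypothetical and used only for illustration. The sketch also assumes CMake 3.12 or newer, where find_package() honors the <PackageName>_ROOT variable.

```cmake
# Assumption: CMake >= 3.12, so that nvpl_ROOT is honored by find_package().
cmake_minimum_required(VERSION 3.12)
project(nvpl_prefix_example LANGUAGES C)

# Hypothetical install prefix, used only for illustration.
set(NVPL_EXAMPLE_PREFIX "/opt/nvpl")

# Respect a user-provided nvpl_ROOT (CMake variable or environment variable);
# otherwise fall back to the illustrative prefix above.
if(NOT DEFINED nvpl_ROOT AND NOT DEFINED ENV{nvpl_ROOT})
  set(nvpl_ROOT "${NVPL_EXAMPLE_PREFIX}")
endif()

# The package search now starts under nvpl_ROOT.
find_package(nvpl REQUIRED)
```

For a package-manager install under /usr, none of this should be needed, since the packages are already on the default search path.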
The find_package() command is used to find nvpl and any component libraries.
Each NVPL component library found will print a brief status message with important locations.
- Variable nvpl_FOUND will be true if nvpl is successfully found
- Variable nvpl_VERSION will contain the found version
- Pass the REQUIRED keyword to raise an error if the nvpl package is not found.
- Regardless of the COMPONENTS keyword, all available nvpl component libraries installed in the same prefix will be found.
- To raise an error if a particular component is not found, use REQUIRED COMPONENTS ...
- Set QUIET to avoid printing status messages, or reporting an error if nvpl is not found
- find_package(nvpl) can safely be called multiple times from different locations in a project.
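The keywords and variables listed above might be combined as in the following minimal sketch. The project name is hypothetical, and the component names blas and lapack are taken from the NVPL Targets table; adjust them to the libraries a project actually needs.

```cmake
cmake_minimum_required(VERSION 3.12)
project(nvpl_find_example LANGUAGES C)

# Optional lookup: QUIET suppresses status messages and the not-found error.
find_package(nvpl QUIET)

if(nvpl_FOUND)
  message(STATUS "Found nvpl version ${nvpl_VERSION}")
else()
  message(STATUS "nvpl not found; consider setting nvpl_ROOT to the install prefix")
endif()

# Strict lookup: configuration stops with an error if the package
# or one of the listed components cannot be found.
find_package(nvpl REQUIRED COMPONENTS blas lapack)
```

The sketch calls find_package(nvpl) twice, which the list above describes as safe.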
Linking to NVPL Packages
The NVPL component libraries provide Imported Interface Targets under the common nvpl:: namespace. To add all the necessary flags to compile and link against NVPL libraries, use the target_link_libraries() command:
target_link_libraries(my_target PUBLIC nvpl::<lib>_<opts>)
Here <lib> is the lowercase shorthand for the library and <opts> are defined by the library.
NVPL Targets
NVPL component and target names use an all-lowercase naming scheme. See the individual library documentation for details on available options.
Component | Targets | Options / Notes |
---|---|---|
blas | nvpl::blas_<int>_<thr> | <int>: lp64, ilp64; <thr>: seq, omp |
fft | nvpl::fftw | FFTW API interface |
lapack | nvpl::lapack_<int>_<thr> | <int>: lp64, ilp64; <thr>: seq, omp |
rand | nvpl::rand, nvpl::rand_mt | nvpl::rand: single-threaded; nvpl::rand_mt: multi-threaded (OpenMP) |
scalapack | nvpl::blacs_<int>_<mpi>, nvpl::scalapack_<int> | <int>: lp64, ilp64; <mpi>: mpich, openmpi3, openmpi4, openmpi5 |
sparse | nvpl::sparse | |
tensor | nvpl::tensor | |
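To make the naming pattern concrete, here is a minimal CMakeLists.txt sketch that links one target against BLAS and the FFTW interface. It assumes the BLAS target name is formed by substituting one integer ABI option (lp64) and one threading option (omp) from the table into the target pattern. The project name and the source file main.c are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.12)
project(nvpl_link_example LANGUAGES C)

# Component names as listed in the first column of the table.
find_package(nvpl REQUIRED COMPONENTS blas fft)

# main.c is a hypothetical source file standing in for application code.
add_executable(my_target main.c)

target_link_libraries(my_target
  PUBLIC
    nvpl::blas_lp64_omp   # assumed expansion: lp64 integers, OpenMP threading
    nvpl::fftw)           # FFTW API interface
```

Because these are imported interface targets, include directories and link flags should come along with the target, so no separate include_directories() call is expected to be necessary.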
NVPL Variables
Each nvpl component library also exports the following variables:
- nvpl_<comp>_VERSION: Version of component library
- nvpl_<comp>_INCLUDE_DIR: Full path to component headers directory
- nvpl_<comp>_LIBRARY_DIR: Full path to component libraries directory
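As an illustration, the exported variables could be printed at configure time to check which installation was picked up. This sketch assumes <comp> is replaced by the lowercase component name (blas here); the project name is hypothetical.

```cmake
cmake_minimum_required(VERSION 3.12)
project(nvpl_vars_example LANGUAGES C)

find_package(nvpl REQUIRED COMPONENTS blas)

# Assumption: <comp> expands to the lowercase component name, here "blas".
message(STATUS "NVPL BLAS version:     ${nvpl_blas_VERSION}")
message(STATUS "NVPL BLAS include dir: ${nvpl_blas_INCLUDE_DIR}")
message(STATUS "NVPL BLAS library dir: ${nvpl_blas_LIBRARY_DIR}")
```

These variables are mainly useful for diagnostics or for custom build steps; for ordinary compilation and linking, the nvpl:: targets are the more direct route.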
LICENSE
These sample codes are provided under the NVIDIA Software License for NVPL SDK.