Accelerated high-performance computing through efficient multi-process GPU resource sharing
Related papers
GPU Resource Sharing and Virtualization on High Performance Computing Systems
2011 International Conference on Parallel Processing, 2011
Modern Graphics Processing Units (GPUs) are widely used as application accelerators in the High Performance Computing (HPC) field due to their massive floating-point computational capabilities and highly data-parallel computing architecture. Contemporary high performance computers equipped with co-processors such as GPUs primarily execute parallel applications using the Single Program Multiple Data (SPMD) model, which requires balanced computing resources from both the microprocessors and the co-processors to ensure full system utilization. While the inclusion of GPUs in HPC systems provides more computing resources and significant performance improvements, the asymmetric distribution of GPUs relative to microprocessors can result in underutilization of the overall system computing resources. In this paper, we propose a GPU resource virtualization approach that allows underutilized microprocessors to share the GPUs. We analyze factors affecting parallel execution performance on GPUs and conduct a theoretical performance estimation based on the most recent GPU architectures as well as the SPMD model. We then present the implementation details of the virtualization infrastructure, followed by an experimental verification of the proposed concepts using an NVIDIA Fermi GPU computing node. The results demonstrate a considerable performance gain over traditional SPMD execution without virtualization. Furthermore, the proposed solution enables full utilization of the asymmetric system resources by sharing the GPUs among microprocessors, while incurring low overhead from the virtualization layer.
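The sharing pattern at the heart of this line of work can be sketched in a few lines of CUDA. The sketch below is an illustration only, not the paper's virtualization infrastructure; the process count, kernel, and buffer size are made-up assumptions. It shows several host processes, as in an SPMD job with more CPU cores than GPUs, each creating a context on the same physical device and submitting independent work.

// Minimal sketch: several host processes time-sharing one physical GPU.
// NOT the paper's virtualization layer; it only shows the multi-process
// sharing pattern that such a layer is designed to make efficient.
#include <cstdio>
#include <sys/wait.h>
#include <unistd.h>
#include <cuda_runtime.h>

__global__ void scale(float *v, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= a;
}

int main() {
    const int kProcs = 4;            // e.g., one process per CPU core sharing the GPU
    for (int p = 0; p < kProcs; ++p) {
        if (fork() == 0) {           // child: initialize CUDA only *after* fork
            const int n = 1 << 20;
            float *d;
            cudaSetDevice(0);        // every process maps to the same physical GPU
            cudaMalloc(&d, n * sizeof(float));
            scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);
            cudaDeviceSynchronize();
            cudaFree(d);
            printf("process %d done on shared GPU 0\n", p);
            return 0;
        }
    }
    while (wait(nullptr) > 0) {}     // parent waits for all sharers
    return 0;
}

Without an efficient sharing layer, the per-process contexts and serialized submissions in a pattern like this are exactly where the overheads the paper measures come from.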
Exploring Graphics Processing Unit (GPU) Resource Sharing Efficiency for High Performance Computing
Computers, 2013
The increasing incorporation of Graphics Processing Units (GPUs) as accelerators has been one of the forefront High Performance Computing (HPC) trends and provides unprecedented performance; however, the prevalent adoption of the Single-Program Multiple-Data (SPMD) programming model brings with it challenges of resource underutilization. In other words, under SPMD, every CPU needs GPU capability available to it. However, since CPUs generally outnumber GPUs, the asymmetric resource distribution gives rise to overall computing resource underutilization. In this paper, we propose to efficiently share the GPU under SPMD and formally define a series of GPU sharing scenarios. We provide a performance-modeling analysis for each sharing scenario with accurate experimental validation. On this modeling basis, we further conduct experimental studies to explore potential GPU sharing efficiency improvements from multiple perspectives. Further theoretical and experimental analyses of GPU sharing performance are also presented. Our results demonstrate not only the significant performance gain that efficient GPU sharing brings to SPMD programs, but also the further improved sharing efficiency achieved by optimization techniques grounded in our modeling.
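As a rough illustration of the kind of closed-form reasoning such modeling involves (a generic first-order sketch, not the paper's actual model): suppose m SPMD processes share one GPU and each offloads a data transfer of duration t_d followed by a kernel of duration t_k, with a per-offload sharing overhead t_o. If the offloads serialize completely, the shared turnaround time is roughly T_serial ≈ m·(t_d + t_k + t_o); if each process's transfer can overlap the other processes' kernel executions, it drops to roughly T_overlap ≈ t_d + m·(t_k + t_o). The ratio of the two bounds how much additional benefit transfer/compute overlap can add on top of plain time-sharing; the paper's models and experiments quantify effects of this kind for each defined sharing scenario.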
Towards efficient GPU sharing on multicore processors
Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems - PMBS '11, 2011
Scalable systems employing a mix of GPUs with CPUs are becoming increasingly prevalent in high-performance computing. The presence of such accelerators introduces significant challenges and complexities to both language developers and end users. This paper provides a close study of efficient coordination mechanisms to handle parallel requests from multiple hosts of control to a GPU under hybrid programming. Using a set of microbenchmarks and applications on a GPU cluster, we show that thread- and process-based context hosting have different tradeoffs. Experimental results on application benchmarks suggest that both thread-based context funneling and process-based context switching natively perform similarly on the latest Fermi GPUs, while manually guided context funneling is currently the best way to achieve optimal performance.
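A minimal sketch of manually guided context funneling, assuming C++11 threads and a single shared device (this illustrates the pattern, not the paper's benchmark code): producer threads enqueue GPU requests, and one dedicated thread that owns the device issues every kernel, so all work is serialized through a single context.

// Context-funneling sketch: many producers, one GPU-owning thread.
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

__global__ void add1(float *v, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] += 1.0f;
}

struct Request { float *buf; int n; };

std::queue<Request> q;
std::mutex m;
std::condition_variable cv;
bool done = false;

void funnel() {                        // the single thread that launches all kernels
    cudaSetDevice(0);
    for (;;) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [] { return done || !q.empty(); });
        if (q.empty() && done) break;
        Request r = q.front(); q.pop();
        lk.unlock();
        add1<<<(r.n + 255) / 256, 256>>>(r.buf, r.n);
        cudaDeviceSynchronize();       // work from all producers is serialized here
    }
}

int main() {
    const int n = 1 << 20;
    float *d;
    cudaSetDevice(0);
    cudaMalloc(&d, n * sizeof(float));
    std::thread gpu_owner(funnel);
    std::vector<std::thread> producers;
    for (int t = 0; t < 4; ++t)        // four hosts of control sharing one GPU
        producers.emplace_back([&] {
            std::lock_guard<std::mutex> lk(m);
            q.push({d, n});
            cv.notify_one();
        });
    for (auto &p : producers) p.join();
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
    gpu_owner.join();
    cudaFree(d);
    printf("all requests funneled through one GPU-owning thread\n");
    return 0;
}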
GPU-SM: shared memory multi-GPU programming
Proceedings of the 8th Workshop on General Purpose Processing using GPUs - GPGPU 2015, 2015
Discrete GPUs in modern multi-GPU systems can transparently access each other's memories through the PCIe interconnect. Future systems will improve this capability by including better GPU interconnects such as NVLink. However, remote memory access across GPUs has gone largely unnoticed among programmers, and multi-GPU systems are still programmed like distributed systems in which each GPU only accesses its own memory. This increases the complexity of the host code, as programmers need to explicitly communicate data across GPU memories. In this paper we present GPU-SM, a set of guidelines for programming multi-GPU systems like NUMA shared memory systems with minimal performance overheads. Using GPU-SM, data structures can be decomposed across several GPU memories, and data that resides on a different GPU is accessed remotely through the PCIe interconnect. The programmability benefits of the shared-memory model on GPUs are shown using finite difference and image filtering applications. We also present a detailed performance analysis of the PCIe interconnect and the impact of remote accesses on kernel performance. While PCIe imposes long latency and limited bandwidth compared to local GPU memory, we show that the highly multithreaded GPU execution model can help reduce these costs. Evaluation of finite difference and image filtering GPU-SM implementations shows close-to-linear speedups on a system with 4 GPUs, with much simpler code than the original implementations (e.g., a 40% SLOC reduction in the host code of finite difference).
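The remote-access mechanism GPU-SM builds on is exposed by the standard CUDA peer-access API. A minimal sketch, with error handling omitted and the buffer size chosen arbitrarily: a kernel launched on GPU 0 increments a buffer that physically resides on GPU 1, so every load and store crosses the interconnect.

// Peer access: GPU 0 dereferences memory allocated on GPU 1 over PCIe.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void inc_remote(int *remote, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) remote[i] += 1;               // load/store travels over the interconnect
}

int main() {
    int n = 1 << 20, can01 = 0;
    cudaDeviceCanAccessPeer(&can01, 0, 1);   // can GPU 0 reach GPU 1's memory?
    if (!can01) { printf("no peer access between GPU 0 and GPU 1\n"); return 0; }

    int *buf_on_gpu1;
    cudaSetDevice(1);
    cudaMalloc(&buf_on_gpu1, n * sizeof(int));
    cudaMemset(buf_on_gpu1, 0, n * sizeof(int));

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);        // map GPU 1's memory into GPU 0's address space
    inc_remote<<<(n + 255) / 256, 256>>>(buf_on_gpu1, n);
    cudaDeviceSynchronize();

    cudaSetDevice(1);
    cudaFree(buf_on_gpu1);
    return 0;
}

GPU-SM's contribution is the set of guidelines for decomposing data and tolerating the extra latency of such remote accesses, which this bare API call does nothing to hide.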
A complete and efficient CUDA-sharing solution for HPC clusters
Parallel Computing, 2014
In this paper we detail the key features, architectural design, and implementation of rCUDA, an advanced framework to enable remote and transparent GPGPU acceleration in HPC clusters. rCUDA allows decoupling GPUs from nodes, forming pools of shared accelerators, which brings enhanced flexibility to cluster configurations. This opens the door to configurations with fewer accelerators than nodes, and also permits a single node to exploit the whole set of GPUs installed in the cluster. In our proposal, CUDA applications can seamlessly interact with any GPU in the cluster, independently of its physical location. Thus, GPUs can be either distributed among compute nodes or concentrated in dedicated GPGPU servers, depending on the cluster administrator's policy. This proposal leads to savings not only in space but also in energy, acquisition, and maintenance costs. The performance evaluation in this paper, with a series of benchmarks and a production application, clearly demonstrates the viability of this proposal. Concretely, experiments with the matrix-matrix product reveal excellent performance compared with regular executions on the local GPU; on a much more complex application, the GPU-accelerated LAMMPS, we attain up to 11x speedup employing 8 remote accelerators from a single node with respect to a 12-core CPU-only execution. GPGPU service interaction in compute nodes, remote acceleration in dedicated GPGPU servers, and data transfer performance of similar GPU virtualization frameworks are also evaluated.
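rCUDA achieves its transparency by intercepting the CUDA API of an unmodified application and forwarding the calls to a GPU server elsewhere in the cluster. The sketch below is not rCUDA source code and performs no networking; it only illustrates the interception half of that idea with a generic LD_PRELOAD shim around cudaMalloc. The build and run commands are assumptions, and interception of this kind only works when the application links the CUDA runtime dynamically (e.g., nvcc -cudart shared).

// Generic API-interception sketch (NOT rCUDA code): a preloaded shared
// library that wraps cudaMalloc. A remote-GPU framework would, at the
// marked point, marshal the call and send it to the node hosting the GPU.
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <cuda_runtime_api.h>

cudaError_t cudaMalloc(void **devPtr, size_t size) {
    // Look up the "real" symbol in the next library on the search path.
    static cudaError_t (*real_cudaMalloc)(void **, size_t) = NULL;
    if (!real_cudaMalloc)
        real_cudaMalloc = (cudaError_t (*)(void **, size_t))dlsym(RTLD_NEXT, "cudaMalloc");

    fprintf(stderr, "[shim] cudaMalloc(%zu bytes) intercepted\n", size);
    // A remote-execution layer would forward the request here instead of
    // falling through to the local driver, as this sketch does.
    return real_cudaMalloc(devPtr, size);
}

// Build (assumed):  gcc -shared -fPIC shim.c -o libshim.so -ldl -I/usr/local/cuda/include
// Run   (assumed):  LD_PRELOAD=./libshim.so ./unmodified_cuda_app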
GPU virtualization for high performance general purpose computing on the ESX hypervisor
high performance computing symposium, 2014
Graphics Processing Units (GPUs) have become important components in high performance computing (HPC) systems for their massively parallel computing capability and energy efficiency. Virtualization technologies are increasingly applied to HPC to reduce administration costs and improve system utilization. However, virtualizing the GPU to support general purpose computing presents many challenges because of the complexity of this device. On VMware's ESX hypervisor, DirectPath I/O can provide virtual machines (VMs) high-performance access to physical GPUs. However, this technology does not allow multiplexing for sharing GPUs among VMs and is not compatible with vMotion, VMware's technology for transparently migrating VMs among hosts inside clusters. In this paper, we address these issues by implementing a solution that uses "remote API execution" and takes advantage of DirectPath I/O to enable general-purpose GPU computing on ESX. This solution, named vmCUDA, allows CUDA applications running concurrently in multiple VMs on ESX to share GPU(s). Our solution requires neither recompilation nor even editing of the source code of CUDA applications. Our performance evaluation has shown that vmCUDA introduces an overhead of 0.6%-3.5% for applications with moderate data sizes and 14%-20% for those with large data (e.g., 12.5 GB to 237.5 GB in our experiments).
Enabling GPU and Many-Core Systems in Heterogeneous HPC Environments using Memory Considerations
2010
Increasing the utilization of many-core systems has been one of the forefront topics in recent years. Although many-core architectures were merely theoretical models a few years ago, they have become an important part of the high performance computing market. The semiconductor industry has developed Graphics Processing Unit (GPU) systems that provide access to many cores (e.g., Larrabee, Fermi, or Tesla) and can be used for General Purpose (GP) computing.
Generalizing the utility of GPUs in large-scale heterogeneous computing systems
2012
Graphics processing units (GPUs) have been widely used as accelerators in large-scale heterogeneous computing systems. However, current programming models can only support the utilization of local GPUs. When using non-local GPUs, programmers need to explicitly call API functions for data communication across computing nodes. As such, programming GPUs in large-scale computing systems is more challenging than programming local GPUs, since local and remote GPUs have to be dealt with separately. In this work, we propose a virtual OpenCL (VOCL) framework to support the transparent virtualization of GPUs. This framework, based on the OpenCL programming model, exposes physical GPUs as decoupled virtual resources that can be transparently managed independently of the application execution. To reduce the virtualization overhead, we optimize GPU memory accesses and kernel launches. We also extend the VOCL framework to support live task migration across physical GPUs to achieve load balance and/or quick system maintenance. Our experimental results indicate that VOCL can greatly simplify the task of programming cluster-based GPUs at a reasonable virtualization cost.
Efficient GPGPU Computing with Cross-Core Resource Sharing and Core Reconfiguration
2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2017
GPUs are capable of running a variety of applications; however, their generic parallel architecture can lead to inefficient use of resources and reduced power efficiency due to algorithmic or architectural constraints. In this work, taking inspiration from CGRAs (coarse-grained reconfigurable architectures), we demonstrate resource sharing and re-distribution as a solution that can be leveraged by reconfiguring the GPU on a kernel-by-kernel basis. We explore four different schemes that trade the number of active SMs (streaming multiprocessors) for increased occupancy and local memory resources per SM, and demonstrate improved power and energy efficiency with limited impact on performance. Our most aggressive scheme, BigSM, saves up to 54% energy, and 26% on average.
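The occupancy-versus-resources trade-off that this paper attacks in hardware can be observed in software on a stock GPU with the CUDA occupancy API. The sketch below does not reconfigure SMs; it merely queries, for an arbitrary example kernel and block size, how the number of resident blocks per SM shrinks as the dynamic shared memory requested per block grows (the 48 KB limit and 8 KB step are assumptions).

// Query-only illustration of the occupancy / local-memory trade-off.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void stencil(float *v, int n) {
    extern __shared__ float tile[];          // dynamic shared memory per block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? v[i] : 0.0f;
    __syncthreads();
    if (i < n) v[i] = tile[threadIdx.x] * 0.5f;
}

int main() {
    const int blockSize = 256;
    // The kernel is never launched; we only ask how it would pack onto an SM.
    for (size_t shmem = 0; shmem <= 48 * 1024; shmem += 8 * 1024) {
        int blocksPerSM = 0;
        cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, stencil,
                                                      blockSize, shmem);
        printf("%5zu B shared mem/block -> %d resident blocks per SM\n",
               shmem, blocksPerSM);
    }
    return 0;
}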
GPU clusters for high-performance computing
Proceedings - IEEE International Conference on Cluster Computing, ICCC, 2009
Large-scale GPU clusters are gaining popularity in the scientific computing community. However, their deployment and production use are associated with a number of new challenges. In this paper, we present our efforts to address some of the challenges with building and running GPU clusters in HPC environments. We touch upon such issues as balanced cluster architecture, resource sharing in a cluster environment, programming models, and applications for GPU clusters.