Application of the CUDA technology to the solution of fluid dynamics problems

CUDA-based linear solvers for stable fluids

2010 International Conference on Information Science and Applications, ICISA 2010, 2010

In computer graphics, physically based fluid simulations (i.e., simulations that solve the equations governing fluid behaviour) are performed using, among other approaches, Stam's stable fluids method. This method requires the solution of two sparse linear systems, which can be solved with an iterative solver (e.g., Jacobi, Gauss-Seidel, or conjugate gradient). Focusing on real-time 3D applications, we present parallel GPU-based (CUDA) implementations of the Jacobi, Gauss-Seidel, and conjugate gradient solvers and analyze their performance.
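As a rough illustration of the kind of iteration this abstract refers to, here is a minimal serial sketch in Python of Jacobi relaxation for an implicit diffusion step on a 2D grid. It is not the authors' CUDA code: the names `jacobi_diffuse` and `residual`, the coefficient `a`, and the choice to hold boundary cells fixed are all assumptions made for illustration. Each interior cell is updated only from the previous iterate, which is why a GPU version could plausibly assign one thread per cell.

```python
def jacobi_diffuse(x0, a, iterations):
    """Approximately solve (1 + 4a) * x[i][j] - a * (sum of 4 neighbours) = x0[i][j].

    x0 is a list of rows; boundary cells are held at their x0 values (assumption).
    """
    n, m = len(x0), len(x0[0])
    x = [row[:] for row in x0]
    for _ in range(iterations):
        new = [row[:] for row in x]
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                # Jacobi: every neighbour value comes from the previous iterate x.
                neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                new[i][j] = (x0[i][j] + a * neighbours) / (1.0 + 4.0 * a)
        x = new
    return x


def residual(x, x0, a):
    """Largest absolute residual of the linear system over interior cells."""
    n, m = len(x), len(x[0])
    return max(
        abs((1.0 + 4.0 * a) * x[i][j]
            - a * (x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1])
            - x0[i][j])
        for i in range(1, n - 1)
        for j in range(1, m - 1)
    )


if __name__ == "__main__":
    grid = [[0.0] * 8 for _ in range(8)]
    grid[4][4] = 1.0  # a single blob of density
    result = jacobi_diffuse(grid, a=1.0, iterations=200)
    print("residual:", residual(result, grid, 1.0))
```

A Gauss-Seidel variant would write into `x` in place instead of into `new`, so later cells see already-updated neighbours. That usually converges in fewer sweeps but introduces ordering dependencies that make a parallel version less direct.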

Programming CUDA-Based GPUs to Simulate Two-Layer Shallow Water Flows

Lecture Notes in Computer Science, 2010

The two-layer shallow water system is used as a numerical model to simulate several phenomena related to geophysical flows, such as the steady exchange of two different water flows, as occurs in the Strait of Gibraltar, or tsunamis generated by underwater landslides. The numerical solution of this model on realistic domains demands great computing power, and modern Graphics Processing Units (GPUs) have proven to be powerful accelerators for this kind of computationally intensive simulation. This work describes an accelerated implementation of a first-order well-balanced finite volume scheme for 2D two-layer shallow water systems using GPUs that support the CUDA (Compute Unified Device Architecture) programming model and double-precision arithmetic. The implementation uses the CUDA framework to efficiently exploit the potential fine-grain data parallelism of the numerical algorithm. Two versions of the GPU solver are implemented and studied: one using both single and double precision, and another using only double precision. Numerical experiments show the efficiency of this CUDA solver on several GPUs, and a comparison with an efficient multicore CPU implementation of the solver is also reported.

Computational Fluid Dynamics and GPUs

2015

Computational Fluid Dynamics (CFD) simulations aim to reproduce fluid motion and behaviour as accurately as possible, in order to better understand natural phenomena under specified conditions. Ideally, computational models would cover different scales and geometric configurations, but classic CFD solvers most often require long computational times to satisfy the convergence criteria. With the advent of heterogeneous compute platforms (including Graphics Processing Units, GPUs), CFD algorithms can now be implemented to give results in near real time. This paper briefly reviews and demonstrates, in a general way, two methods able to harness the power of GPUs to speed up numerical simulations of fluid flows for industrial applications: the Highly Simplified Marker and Cell method (HSMAC) and the Lattice-Boltzmann Method (LBM), both implemented on GPUs using OpenCL. This paper describes general capabilities for compute and graphics, and method p...

Simulation of one-layer shallow water systems on multicore and CUDA architectures

The Journal of Supercomputing, 2011

The numerical solution of shallow water systems is useful for several applications related to geophysical flows, but the large size of the domains calls for powerful accelerators to obtain numerical results in reasonable time. This paper addresses how to speed up the numerical solution of a first-order well-balanced finite volume scheme for 2D one-layer shallow water systems by using modern Graphics Processing Units (GPUs) that support the NVIDIA CUDA programming model. An algorithm that exploits the potential data parallelism of this method is presented and implemented using the CUDA model in single and double floating-point precision. Numerical experiments show the high efficiency of this CUDA solver compared with a parallel CPU implementation of the solver and with a previously existing GPU solver based on a shading language.
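To make the idea of a data-parallel finite volume update concrete, here is a small serial sketch in Python of a first-order scheme for the 1D one-layer shallow water equations over a flat bottom. It deliberately swaps in a Rusanov (local Lax-Friedrichs) flux with periodic boundaries. It is not the well-balanced 2D scheme of the paper, and the names `flux`, `rusanov`, `step`, the constant `G`, and the CFL value are assumptions chosen for illustration.

```python
import math

G = 9.81  # gravitational acceleration (assumed value, m/s^2)


def flux(h, q):
    """Physical flux of the 1D shallow water equations for depth h and discharge q."""
    u = q / h
    return (q, q * u + 0.5 * G * h * h)


def rusanov(hl, ql, hr, qr):
    """Rusanov numerical flux between a left state and a right state."""
    fl, fr = flux(hl, ql), flux(hr, qr)
    s = max(abs(ql / hl) + math.sqrt(G * hl), abs(qr / hr) + math.sqrt(G * hr))
    return (0.5 * (fl[0] + fr[0]) - 0.5 * s * (hr - hl),
            0.5 * (fl[1] + fr[1]) - 0.5 * s * (qr - ql))


def step(h, q, dx, cfl=0.5):
    """Advance one time step on a periodic grid; returns (h_new, q_new, dt)."""
    n = len(h)
    smax = max(abs(q[i] / h[i]) + math.sqrt(G * h[i]) for i in range(n))
    dt = cfl * dx / smax
    # fluxes[i] is the flux through the interface between cell i-1 and cell i.
    fluxes = [rusanov(h[i - 1], q[i - 1], h[i], q[i]) for i in range(n)]
    h_new, q_new = [], []
    for i in range(n):
        # Each cell needs only its two interface fluxes, so cells update independently.
        f_left, f_right = fluxes[i], fluxes[(i + 1) % n]
        h_new.append(h[i] - dt / dx * (f_right[0] - f_left[0]))
        q_new.append(q[i] - dt / dx * (f_right[1] - f_left[1]))
    return h_new, q_new, dt


if __name__ == "__main__":
    h = [2.0 if 20 <= i < 30 else 1.0 for i in range(50)]  # a raised column of water
    q = [0.0] * 50
    for _ in range(20):
        h, q, _ = step(h, q, dx=1.0)
    print("total mass:", sum(h))
```

Both the flux computation per interface and the update per cell are independent across the grid, which is the kind of fine-grain parallelism a GPU implementation would map onto threads.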

Some examples of instant computations of fluid dynamics on GPU

2012

In this paper, we share our experience with using GPUs and GPGPU for two-dimensional fluid mechanics computations on fine grids and for three-dimensional transport problems. Choosing a method suited to the GPU architecture is critical to the performance gain. In our numerical experiments, we test, respectively, a Lattice Boltzmann (LBM) approach for the incompressible Navier-Stokes equations, a finite volume method of Flux Vector Splitting (FVS) type for the compressible Euler equations, and a Lagrangian particle approach for free kinetic transport.

GPU Progress and Directions in Applied CFD

2015

Current trends in high performance computing include the use of graphics processing units (GPUs) as massively parallel co-processors to CPUs in order to accelerate numerical operations common to computational fluid dynamics (CFD) solvers. This paper examines GPU characteristics for various CFD methods and provides examples of current implementations for commercial CFD software. To increase adoption of GPUs for commercial CFD, NVIDIA developed a linear solver library called AmgX, which offers an algebraic multigrid (AMG) solver and other features, with parallelization of all phases of AMG, including hierarchy construction and ILU factorization and solve. Examples relevant to CFD practice are investigated to demonstrate the use of AmgX in ANSYS Fluent for industry-scale applications. Hardware system configuration is also discussed, along with directions for CFD solver development.

Linear Solvers for Stable Fluids: GPU vs CPU

2012

Fluid simulation has been an active research field in computer graphics for the last 30 years. Stam’s stable fluids method, among others, is used to solve the equations that govern fluids. This method solves a sparse linear system during the diffusion and move steps, using relaxation methods (Jacobi, Gauss-Seidel, etc.), Conjugate Gradient (and its variants), or other methods not studied in this paper. We give a comparative performance analysis of a parallel GPU-based (CUDA) algorithm and a serial CPU-based algorithm, in both 2D and 3D, for the corresponding implementations of the Jacobi (J), Gauss-Seidel (GS), and Conjugate Gradient (CG) solvers.
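For readers unfamiliar with the Conjugate Gradient solver named in this abstract, the following is a minimal serial sketch in Python of the textbook (unpreconditioned) algorithm, applied matrix-free to a 1D diffusion-like operator. The function names `dot`, `apply_diffusion`, and `conjugate_gradient`, the coefficient `a`, and the zero values assumed outside the domain are illustrative assumptions and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def apply_diffusion(x, a):
    """Apply the symmetric positive definite operator (1 + 2a) x_i - a (x_{i-1} + x_{i+1})."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0       # zero outside the domain (assumption)
        right = x[i + 1] if i < n - 1 else 0.0
        out.append((1.0 + 2.0 * a) * x[i] - a * (left + right))
    return out


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A given only as a function.

    Returns (x, iterations_used).
    """
    x = [0.0] * len(b)
    r = b[:]            # residual b - A x for the initial guess x = 0
    p = r[:]            # first search direction
    rs_old = dot(r, r)
    for k in range(max_iter):
        if rs_old ** 0.5 < tol:
            return x, k
        ap = apply_a(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


if __name__ == "__main__":
    b = [0.0] * 16
    b[8] = 1.0
    x, iters = conjugate_gradient(lambda v: apply_diffusion(v, 1.0), b)
    print("iterations:", iters)
```

The work per iteration consists of one operator application, two inner products, and three vector updates. These are the operations a parallel implementation would need to distribute, with the inner products requiring a reduction across threads.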

An MPI-CUDA implementation of an improved Roe method for two-layer shallow water systems

Journal of Parallel and Distributed Computing, 2012

The numerical solution of two-layer shallow water systems is required to accurately simulate stratified fluids, which are ubiquitous in nature: they appear in atmospheric flows, ocean currents, oil spills, etc. Moreover, implementing the numerical schemes that solve these models in realistic scenarios demands a huge amount of computing power. In this paper, we tackle the acceleration of these simulations on triangular meshes by exploiting the combined power of several CUDA-enabled GPUs in a GPU cluster. For that purpose, we present an improvement of a path-conservative Roe-type finite volume scheme that is especially suitable for GPU implementation, and we develop a distributed implementation of this scheme that uses CUDA and MPI to exploit the potential of a GPU cluster. This implementation overlaps MPI communication with CPU-GPU memory transfers and GPU computation to increase efficiency. Several numerical experiments performed on a cluster of modern CUDA-enabled GPUs show the efficiency of the distributed solver.

Acceleration of iterative Navier-Stokes solvers on graphics processing units

International Journal of Computational Fluid Dynamics, 2013

We implemented the pressure-implicit with splitting of operators (PISO) and semi-implicit method for pressure-linked equations (SIMPLE) solvers of the Navier-Stokes equations on Fermi-class graphics processing units (GPUs) using the CUDA technology. We also introduced a new sparse matrix format optimized for performing elementary CFD operations, such as gradient or divergence discretization, on GPUs. We verified the validity of the implementation on several standard steady and unsteady problems. The computational efficiency of the GPU implementation was examined by comparing its double-precision run times with those of essentially the same algorithms implemented in OpenFOAM. The results show that a GPU (Tesla C2070) can outperform a server-class 6-core, 12-thread CPU (Intel Xeon X5670) by a factor of 4.2.