Experience in using SIMD and MIMD parallelism for computational fluid dynamics

Parallelization Strategies for Computational Fluid Dynamics Software: State of the Art Review

Computational fluid dynamics (CFD) is a rapidly growing field of fluid mechanics used to analyze fluid flow problems. The analysis is based on simulations carried out on computing machines. For complex configurations, the number of grid points is so large that the computational time required to obtain results is very high. Parallel computing is adopted to reduce the computational time of CFD by exploiting the available computing resources. Parallel computing tools such as OpenMP, MPI, CUDA, combinations of these, and a few others are used to parallelize CFD software. This article provides a comprehensive state-of-the-art review of important CFD areas and parallelization strategies for the related software. Issues related to the computational time complexity and parallelization of CFD software are highlighted. Benefits and drawbacks of using various parallel computing tools for parallelizing CFD software are summarized. Open areas of CFD where parallelization has not been widely attempted are identified, and parallel computing tools that could be useful for parallelizing CFD software are spotlighted. A few suggestions for future work in parallel computing of CFD software are also provided.

Performance Studies of the Parallelization of a CFD Solver on the Origin 2000

1998

Weapon designers have typically run large-scale, computationally intensive numerical simulations of missiles and projectiles on high-end supercomputing architectures. Recently, the comparable sustained performance-to-price ratio of scalar microprocessor-based architectures, relative to vector processors, has resulted in their purchase and utilization by the scientific community.

Experience with Massive Parallelism for CFD Applications at NASA Ames Research Center

Informatik aktuell, 1992

One of the main goals of the Applied Research Branch in the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Ames Research Center is the accelerated introduction of highly parallel supercomputers into a production-oriented computing center. This paper presents the objectives of the NAS project with respect to parallel computers. It further summarizes the experience gained with experimental parallel machines in the NAS Applied Research Branch. In particular, results with computational fluid dynamics applications on the Connection Machine CM-2 and the Intel iPSC/860 are reported. Results of computations on unstructured grids and of particle simulations are presented. In light of the experience gained at NASA, the future development of parallel computers for computational fluid dynamics applications is discussed.

Discussion of the NAS Parallel Benchmark for CFD

1994

The Numerical Aerodynamics Simulation (NAS) group at NASA Ames has developed a "pencil and paper" benchmark for Computational Fluid Dynamics (CFD) applications. A set of synthetic Partial Differential Equations (PDEs) and the solution methodology, embodying many salient features of a typical application code, are specified. In the benchmark specification, the derivation of the discretized equations and the solution algorithm are not considered.

Analysis and implementation of a parallelization strategy on a Navier–Stokes solver for shear flow simulations

Parallel Computing, 2001

A parallel computational solver for the unsteady incompressible three-dimensional Navier-Stokes equations, implemented for the numerical simulation of shear-flow cases, is presented. The computational algorithms include Fourier expansions in the streamwise and spanwise directions, second-order centered finite differences in the direction orthogonal to the solid walls, and a third-order Runge-Kutta procedure in time in which both convective and diffusive terms are treated explicitly; the fractional step method is used for time marching. Based on the numerical algorithms implemented within the computational solver, three different (MPI-based) parallelization strategies are devised. The three schemes are evaluated with particular attention to the impact of the communications on the whole computational procedure, and one of them is implemented. Computations are executed on two different parallel machines and results are reported in terms of parallel performance. Runs using different numbers of processors combined with different numbers of computational grid points are tested.
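As an illustration of the time-advancement structure this abstract describes, the following is a minimal single-rank sketch of a fractional-step update driven by an explicit three-stage Runge-Kutta scheme. The field type, the stage weights, and the operator functions (convection, diffusion, pressure_poisson, project) are placeholders introduced here for illustration and are not the authors' code.

```cpp
#include <cstddef>
#include <vector>

// Sketch of a fractional-step advancement with an explicit three-stage
// Runge-Kutta scheme: convective and diffusive terms treated explicitly,
// followed by a pressure correction at each stage.
// All operators below are trivial stubs, NOT the paper's actual code.
using Field = std::vector<double>;

Field convection(const Field& u)       { return Field(u.size(), 0.0); } // stub
Field diffusion(const Field& u)        { return Field(u.size(), 0.0); } // stub
Field pressure_poisson(const Field& u) { return Field(u.size(), 0.0); } // stub
void  project(Field& /*u*/, const Field& /*p*/) {}                      // stub

void rk3_fractional_step(Field& u, double dt) {
    // Illustrative stage weights; the paper's actual coefficients may differ.
    const double alpha[3] = {1.0 / 3.0, 1.0 / 2.0, 1.0};
    for (int stage = 0; stage < 3; ++stage) {
        const Field conv = convection(u);
        const Field diff = diffusion(u);
        // Provisional (non-solenoidal) velocity: both terms advanced explicitly.
        for (std::size_t i = 0; i < u.size(); ++i)
            u[i] += dt * alpha[stage] * (conv[i] + diff[i]);
        // Pressure step enforces incompressibility (fractional step method).
        const Field p = pressure_poisson(u);
        project(u, p);
    }
}
```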

Design of Large Scale Parallel Simulations: A Case Study

2000

We present an overview of the design of software packages called Particle Movers that have been developed to simulate the motion of particles in two- and three-dimensional domains. These simulations require the solution of the nonlinear Navier-Stokes equations for fluids coupled with Newton's equations for particle dynamics. Furthermore, realistic simulations are extremely computationally intensive and are feasible only with algorithms that can exploit parallelism effectively. We describe the computational structure of the simulation as well as the data objects required in these packages. We present a brief description of a particle mover code in this framework, concentrating on the following features: design modularity, portability, extensibility, and parallelism. Simulations on the SGI Origin2000 demonstrate very good speedup on a large number of processors.

Overview: The goal of our KDI effort is to develop high-performance, state-of-the-art software packages called Particle Movers that are capable of simulating the motion of thousands of particles in two dimensions and hundreds in three dimensions. Such large-scale simulations will then be used to elucidate the fundamental dynamics of particulate flows and solve problems of engineering interest. The development methodology must encompass all aspects of the problem, from computational modeling for simulations in Newtonian fluids governed by the Navier-Stokes equations, as well as in several popular models of viscoelastic fluids, to the incorporation of novel preconditioners and solvers for the nonlinear algebraic equations which ultimately result. The code must, on the one hand, be high-level, modular, and portable, while at the same time highly efficient and optimized for a target architecture. We present a design model for large-scale parallel CFD simulation, as well as the PM code developed for our Grand Challenge project which adheres to this model. It is a true distributed-memory implementation of the prototype set forth, and demonstrates very good scalability and speedup on the Origin2000 up to a maximum test problem size of over half a million unknowns. Its modular design is based upon the GVec package [14] for PETSc, which will also be discussed, as it forms the basis for all abstractions in the code. PM has been ported to the Sun SPARC and Intel Pentium running Solaris, the IBM SP2 running AIX, the Origin2000 running IRIX, and the Cray T3E running Unicos with no explicit code modification. The code has proved to be easily extensible.
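To make the modularity claim more concrete, here is a hypothetical C++ interface sketch of how a particle-mover package might separate the fluid solver from the particle integrator behind abstract interfaces so that back-ends (Newtonian versus viscoelastic fluid models, different preconditioners) can be swapped. The class names are illustrative assumptions and do not come from the paper or the GVec/PETSc code.

```cpp
#include <memory>

// Hypothetical decomposition: fluid solver and particle integrator live
// behind abstract interfaces; the coupled driver only sees the interfaces.
struct FluidSolver {
    virtual ~FluidSolver() = default;
    virtual void advance(double dt) = 0;   // one Navier-Stokes step
};

struct ParticleIntegrator {
    virtual ~ParticleIntegrator() = default;
    virtual void advance(double dt) = 0;   // integrate Newton's equations
};

class ParticleMover {
public:
    ParticleMover(std::unique_ptr<FluidSolver> f,
                  std::unique_ptr<ParticleIntegrator> p)
        : fluid_(std::move(f)), particles_(std::move(p)) {}

    // One coupled time step: fluid first, then particles driven by the
    // hydrodynamic forces (coupling details omitted in this sketch).
    void step(double dt) {
        fluid_->advance(dt);
        particles_->advance(dt);
    }

private:
    std::unique_ptr<FluidSolver> fluid_;
    std::unique_ptr<ParticleIntegrator> particles_;
};
```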

Comparing Performance of Parallelizing Frameworks for Grid-Based Fluid Simulation on the CPU

Proceedings of the 8th Annual ACM India Conference, 2015

In this paper we present a comparison study of two widely used parallelizing frameworks on the CPU, namely OpenMP and Intel Threading Building Blocks (TBB). The particular problem domain we apply them to is a grid-based fluid simulation solver. The standard Eulerian grid-based fluid solver discretizes the Navier-Stokes equations on a staggered but regular grid and computes fluid parameters such as velocity and pressure in each grid cell. We use OpenMP and TBB to parallelize this computation and study the behaviour of our implementation on each framework while working with different numbers of threads and CPU cores. We provide arguments in support of implementing a mixed solution strategy that uses both parallelizing frameworks together, thus improving performance over using either in isolation.
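As a minimal illustration of the two frameworks being compared, the sketch below parallelizes the same generic per-cell grid update once with an OpenMP pragma and once with tbb::parallel_for; update_cell is a hypothetical stand-in for one solver stage (e.g., advection or pressure relaxation), not the authors' implementation.

```cpp
#include <vector>
#include <tbb/parallel_for.h>
// Compile with OpenMP enabled (e.g. -fopenmp) and link against TBB (-ltbb).

// Hypothetical per-cell work item standing in for one solver stage.
inline void update_cell(std::vector<double>& grid, int i) {
    grid[i] *= 0.99; // placeholder computation
}

void step_openmp(std::vector<double>& grid) {
    const int n = static_cast<int>(grid.size());
    // OpenMP splits the iteration space statically across the team of threads.
#pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i)
        update_cell(grid, i);
}

void step_tbb(std::vector<double>& grid) {
    const int n = static_cast<int>(grid.size());
    // TBB's work-stealing scheduler decides the chunking at run time.
    tbb::parallel_for(0, n, [&](int i) { update_cell(grid, i); });
}
```

In a mixed strategy of the kind the paper argues for, different solver stages could be dispatched to whichever framework schedules them more efficiently.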

Finite difference simulations of the Navier-Stokes equations using parallel distributed computing

Proceedings of the 15th Symposium on Computer Architecture and High Performance Computing, 2003

This paper discusses the implementation of a numerical algorithm for simulating incompressible fluid flows based on the finite difference method and designed for parallel computing platforms with distributed memory, particularly clusters of workstations. The solution algorithm for the Navier-Stokes equations utilizes an explicit scheme for pressure and an implicit scheme for velocities, i.e., the velocity field at a new time step can be computed once the corresponding pressure is known. The parallel implementation is based on domain decomposition, where the original calculation domain is decomposed into several blocks, each of which is assigned to a separate processing node. All nodes then execute computations in parallel, each node on its associated sub-domain. The parallel computations include initialization, coefficient generation, linear solution on the sub-domain, and inter-node communication. The exchange of information across the sub-domains, or processors, is achieved using the message passing interface standard, MPI. The use of MPI ensures portability across different computing platforms, ranging from massively parallel machines to clusters of workstations. The execution time and speed-up are evaluated by comparing the performance obtained with different numbers of processors. The results indicate that the parallel code can significantly improve prediction capability and efficiency for large-scale simulations.
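The inter-node communication step described above can be sketched as a ghost-cell (halo) exchange. The example below assumes a simple one-dimensional decomposition with a single ghost layer per side, which is a simplification of the paper's block decomposition, and uses only standard MPI calls.

```cpp
#include <mpi.h>
#include <vector>

// Each rank owns a block of the domain plus one ghost cell on either side
// and exchanges boundary values with its left and right neighbours.
void exchange_halos(std::vector<double>& local, MPI_Comm comm) {
    int rank = 0, size = 1;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    const int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    const int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;
    const int n = static_cast<int>(local.size()); // includes the 2 ghost cells

    // Send the first interior value left, receive into the right ghost cell.
    MPI_Sendrecv(&local[1],     1, MPI_DOUBLE, left,  0,
                 &local[n - 1], 1, MPI_DOUBLE, right, 0,
                 comm, MPI_STATUS_IGNORE);
    // Send the last interior value right, receive into the left ghost cell.
    MPI_Sendrecv(&local[n - 2], 1, MPI_DOUBLE, right, 1,
                 &local[0],     1, MPI_DOUBLE, left,  1,
                 comm, MPI_STATUS_IGNORE);
}
```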

CFD Parallel Simulation Using Getfem++ and Mumps

Euro-Par 2010 - Parallel Processing, 2010

We consider the finite element environment Getfem++, a C++ library of generic finite element functionalities that allows for parallel distributed data manipulation and assembly. For the solution of the large sparse linear systems arising from the finite element assembly, we consider the multifrontal massively parallel solver package Mumps, which implements a parallel distributed LU factorization of large sparse matrices. In this work, we present the integration of the Mumps package into Getfem++, which provides a complete and generic parallel distributed chain from the finite element discretization to the solution of the PDE problems. We consider the parallel simulation of the transition to turbulence of a flow around a circular cylinder using the Navier-Stokes equations, where the nonlinear term is treated semi-implicitly and requires that some of the discretized differential operators be updated and reassembled at each time step. Preliminary parallel experiments using this new combination of Getfem++ and Mumps are presented.
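The time-marching structure described here, with per-step reassembly of the semi-implicitly treated operator followed by a distributed direct solve, can be sketched schematically as follows. All types and functions in this sketch are hypothetical placeholders introduced for illustration; they are not the Getfem++ or Mumps APIs.

```cpp
#include <vector>

// Schematic time loop: the semi-implicit convection operator depends on the
// current velocity field, so part of the system must be reassembled every
// step before the (distributed, multifrontal LU) solve.
struct SparseMatrix {};  // stand-in for a distributed sparse matrix

SparseMatrix assemble_semi_implicit_operator(const std::vector<double>&) { return {}; }            // stub
std::vector<double> build_rhs(const std::vector<double>& u, double /*dt*/) { return u; }            // stub
std::vector<double> distributed_lu_solve(const SparseMatrix&, const std::vector<double>& rhs) {     // stub
    return rhs;
}

void time_march(std::vector<double>& u, double dt, int n_steps) {
    for (int step = 0; step < n_steps; ++step) {
        // Reassembly required each step because the operator depends on u.
        const SparseMatrix A = assemble_semi_implicit_operator(u);
        const std::vector<double> rhs = build_rhs(u, dt);
        // In the paper, this solve is delegated to a parallel distributed
        // multifrontal LU factorization (Mumps).
        u = distributed_lu_solve(A, rhs);
    }
}
```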