A Parallel Framework for Unstructured Grid Solvers

MPI-Parallelization of a Structured Grid CFD Solver including an Integrated Octree Grid Generator

An existing Computational Fluid Dynamics (CFD) solver is parallelized by means of MPI. The solver includes a dynamic and adaptive grid generator for Cartesian Quadtree and Octree grids, which therefore also has to be parallelized. The grid generator produces grids that fulfill a specific set of rules, which must also be enforced in parallel. The assembly of the large sparse matrices resulting from the implicit discretization of the Navier-Stokes equations is done in parallel, as is the solving process. The parallel performance of both of these processes depends heavily on good load balancing in order to reach satisfactory speedup. Two versions of load balancing are demonstrated, one based on block swapping, and the other using the METIS or ParMETIS software packages for load balancing of graphs. Results are presented for load balancing and for the parallel speedup of solving the linear system of equations.

Running unstructured grid-based CFD solvers on modern graphics hardware

International Journal for Numerical Methods in Fluids, 2011

Techniques used to implement an unstructured grid solver on modern graphics hardware are described. The three-dimensional Euler equations for inviscid, compressible flow are considered. Effective memory bandwidth is improved by reducing total global memory access and overlapping redundant computation, as well as using an appropriate numbering scheme and data layout. The applicability of per-block shared memory is also considered. The performance of the solver is demonstrated on two benchmark cases: a missile and the NACA0012 wing. For a variety of mesh sizes, an average speed-up factor of roughly 9.5x is observed over the equivalent parallelized OpenMP-code running on a quad-core CPU, and roughly 33x over the equivalent code running in serial.
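As a rough consistency check on the two quoted averages, and assuming (this is not stated in the abstract) that both were taken over the same set of cases, their ratio hints at how well the OpenMP baseline itself scaled on the quad-core CPU. The symbols $S_{\text{OpenMP}}$ and $E_{\text{OpenMP}}$ are introduced here for illustration only.

```latex
\[
S_{\text{OpenMP}} \approx \frac{33}{9.5} \approx 3.5,
\qquad
E_{\text{OpenMP}} = \frac{S_{\text{OpenMP}}}{4} \approx 0.87
\]
```

Because a ratio of averages is not the same as an average of ratios, this should be read as an order-of-magnitude estimate rather than a reported result.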

Efficient parallelization of an unstructured grid solver: A memory-centric approach

the Proceedings of the International Conference …, 1999

For an unstructured grid computational fluid dynamics computation typical of many large-scale partial differential equations requiring implicit treatment, we describe coding practices that lead to high implementation efficiency for standard computational and communication kernels, in both uniprocessor and parallel senses. Moreover, a family of Newton-like preconditioned Krylov algorithms whose convergence rate degrades only slightly with increasing parallel granularity, relying primarily on sparse Jacobian-vector multiplications, can be expressed in terms of these kernels. A combination of the three (uniprocessor performance, parallel scalability, and algorithmic scalability) is required for overall high performance on the largest scale problems that a given generation of parallel platforms supports.

Parallel implicit matrix-free CFD solver using AMR grids

Journal of Physics: Conference Series, 2018

A novel parallel algorithm for the LU-SGS method with adaptive mesh refinement (AMR) is proposed. Domain decomposition and dynamic load balancing algorithms for spatial discretizations with AMR are described. For improving execution efficiency on targeted GPU-accelerated systems, corresponding coarsening/refining and memory defragmentation parallel algorithms are developed.

A MIMD implementation of a parallel Euler solver for unstructured grids

The Journal of Supercomputing, 1992

A mesh-vertex finite volume scheme for solving the Euler equations on triangular unstructured meshes is implemented on an MIMD (multiple instruction multiple data stream) parallel computer. Various partitioning strategies for distributing the workload onto the processors are discussed. Issues pertaining to the communication costs are also addressed. Finally, the performance of this unstructured computation on the Intel iPSC/860 is compared with that of one processor of a Cray Y-MP, and with an earlier implementation.

A Framework for Parallel Unstructured Grid Generation for Practical Aerodynamic Simulations

47th AIAA Aerospace Sciences Meeting including The New Horizons Forum and Aerospace Exposition, 2009

A framework for parallel unstructured grid generation targeting both shared memory multi-processors and distributed memory architectures is presented. The two fundamental building blocks of the framework are: (1) the Advancing-Partition (AP) method used for domain decomposition and (2) the Advancing Front (AF) method used for mesh generation. Starting from the surface mesh of the computational domain, the AP method is applied recursively to generate a set of sub-domains. Next, the sub-domains are meshed in parallel using the AF method. The recursive nature of domain decomposition naturally maps to a divide-and-conquer algorithm which exhibits inherent parallelism. For the parallel implementation, the Master/Worker pattern is employed to dynamically balance the varying workloads of each task on the set of available CPUs. Performance results for this approach are presented and discussed in detail, along with future work and improvements.
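The decompose-then-mesh flow described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: a sub-domain is reduced to a plain integer "work size", `decompose` is a hypothetical stand-in for the recursive AP step, `mesh_subdomain` is a hypothetical stand-in for the AF step, and Python's standard thread pool plays the role of the master handing tasks to idle workers.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(size, max_size):
    """Recursively bisect a domain of `size` work units.

    Hypothetical stand-in for the recursive partitioning step:
    splitting stops once a piece is no larger than `max_size`.
    """
    if size <= max_size:
        return [size]
    half = size // 2
    return decompose(half, max_size) + decompose(size - half, max_size)


def mesh_subdomain(size):
    """Hypothetical placeholder for meshing one sub-domain.

    Returns a made-up cell count proportional to the work size.
    """
    return size * 10


def run(total_size, max_size, workers=4):
    """Decompose the domain, then let a pool of workers mesh the pieces."""
    subdomains = decompose(total_size, max_size)
    # The executor keeps a task queue; each idle worker takes the next
    # sub-domain, which is the dynamic balancing idea of Master/Worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cells = list(pool.map(mesh_subdomain, subdomains))
    return subdomains, cells


subdomains, cells = run(total_size=100, max_size=30)
```

A real mesher would use processes or MPI ranks rather than threads and would need to handle the interfaces between sub-domains, which this sketch ignores.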

An Object-Oriented Parallel Finite-Volume CFD Code

Volume 6: Turbomachinery, Parts A, B, and C, 2008

This paper concerns the parallelization and optimization of an in-house three-dimensional unstructured finite-volume computational fluid dynamics (CFD) code. It aims to highlight the use of programming techniques in order to speed up computation and minimize memory usage. The motivation for developing an in-house solver is that commercial codes are general and sometimes simulations are not in agreement with actual phenomena. Moreover, in-house models can be developed and easily integrated into the solver. The original code was initially written in Fortran 77, though the most recently added subroutines include Fortran 90 features. Due to language restrictions and the initial project objectives, issues such as memory usage minimization were not considered. The new code uses an object-oriented paradigm aiming to enhance code reuse and increase efficiency during application development. The parallel code is fully written in Fortran 90 using MPI and hence portable to different architectures. Numerical experiments of typical 3D cases, such as a flat plate with uniform incoming flow and a converging-diverging supersonic nozzle, were carried out, showing good parallel efficiency. The serial version of the ported code has shown a considerable reduction in execution time compared to the original code. Converged solutions agree with the solution of the original code.

Design and implementation of a parallel unstructured Euler solver using software primitives

AIAA Journal, 1994

This paper is concerned with the implementation of a three-dimensional unstructured-grid Euler solver on massively parallel distributed-memory computer architectures. The goal is to minimize solution time by achieving high computational rates with a numerically efficient algorithm. An unstructured multigrid algorithm with an edge-based data structure has been adopted, and a number of optimizations have been devised and implemented in order to accelerate the parallel computational rates. The implementation is carried out by creating a set of software tools, which provide an interface between the parallelization issues and the sequential code, while providing a basis for future automatic run-time compilation support. Large practical unstructured grid problems are solved on the Intel iPSC/860 hypercube and Intel Touchstone Delta machine. The quantitative effects of the various optimizations are demonstrated, and we show that the combined effect of these optimizations leads to roughly a factor of three performance improvement. The overall solution efficiency is compared with that obtained on the CRAY Y-MP vector supercomputer.
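To make the idea of an edge-based data structure concrete, the sketch below loops over edges and scatters one flux value per edge to its two end vertices. It is a minimal illustration, not the solver described in the abstract: the function name `accumulate_edge_fluxes`, the scalar flux values, and the sign convention (added at the first vertex, subtracted at the second) are all assumptions made for the example.

```python
def accumulate_edge_fluxes(num_vertices, edges, edge_flux):
    """Scatter one scalar flux per edge to the edge's two end vertices.

    `edges` is a list of (i, j) vertex index pairs and `edge_flux` holds
    one value per edge. The sign convention is assumed for illustration:
    the flux is added at vertex i and subtracted at vertex j, so the sum
    of all residuals is zero (what leaves one vertex enters another).
    """
    residual = [0.0] * num_vertices
    for (i, j), flux in zip(edges, edge_flux):
        residual[i] += flux
        residual[j] -= flux
    return residual


# A single triangle: three vertices connected by three edges.
edges = [(0, 1), (1, 2), (2, 0)]
edge_flux = [1.0, 2.0, 3.0]
residual = accumulate_edge_fluxes(3, edges, edge_flux)
```

In a distributed-memory setting, edges that cross a partition boundary are where communication is needed, since the two end vertices then live on different processors.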

Related papers