Flow Simulation With an Adaptive Finite Element Method on Massively Parallel Systems
Related papers
Flow simulation with FEM on massively parallel systems
Computational Fluid Dynamics on Parallel Systems, 1995
An explicit finite element scheme based on a two-step Taylor-Galerkin algorithm allows the solution of the Euler and Navier-Stokes equations for a wide variety of flow problems. To obtain useful results for realistic problems, one has to use grids with an extremely high density to get a good resolution of the interesting parts of a given flow. Since these details are often limited to small regions of the calculation domain, it is efficient to use unstructured grids to reduce the number of elements and grid points. As such calculations are very time-consuming and inherently parallel, the use of multiprocessor systems for this task seems to be a very natural idea. A common approach for parallelization is the division of a given grid, where the problem is the increasing complexity of this task for growing processor numbers. Here we present some general ideas for this kind of parallelization and details of a Parix implementation for Transputer networks. Results for up to 1024 processors show the general suitability of our approach for massively parallel systems.
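The two-step structure mentioned in the abstract can be illustrated on a much simpler problem. The sketch below is not the authors' solver: it assumes 1D linear advection on a periodic, uniform grid with linear elements and a lumped mass matrix, in which case the two-step scheme reduces to the Richtmyer two-step Lax-Wendroff update. The function names and the `cfl` parameter are hypothetical.

```python
import math

def taylor_galerkin_step(u, cfl):
    """Advance u_t + a*u_x = 0 by one time step on a periodic, uniform 1D grid.

    cfl = a*dt/dx. Assumption of this sketch: linear elements with a lumped
    mass matrix, so the update equals the Richtmyer two-step Lax-Wendroff form.
    """
    n = len(u)
    # Step 1: one half-step value per element (element e joins nodes e and e+1).
    u_half = [
        0.5 * (u[e] + u[(e + 1) % n]) - 0.5 * cfl * (u[(e + 1) % n] - u[e])
        for e in range(n)
    ]
    # Step 2: nodal update from the half-step fluxes of the two adjacent elements.
    # For i = 0, u_half[-1] is the element that wraps around the periodic boundary.
    return [u[i] - cfl * (u_half[i] - u_half[i - 1]) for i in range(n)]

def advect(u0, cfl, steps):
    """Apply the two-step update repeatedly and return the final nodal values."""
    u = list(u0)
    for _ in range(steps):
        u = taylor_galerkin_step(u, cfl)
    return u

if __name__ == "__main__":
    n = 64
    u0 = [math.exp(-((i - 16) / 4.0) ** 2) for i in range(n)]
    u = advect(u0, cfl=0.5, steps=40)
    print("mass before:", sum(u0), "mass after:", sum(u))
```

Because the nodal update is a difference of element fluxes, the sum of the nodal values is conserved on a periodic grid, which makes a convenient sanity check.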
Parallel and Adaptive Finite Element Techniques for Flow Simulation
NOTES ON …, 2004
We present the software library MG which provides an interface for the parallel implementation of adaptive flow solvers using unstructured grids. The excellent scalability on distributed memory architectures is demonstrated. Current applications include finite element solvers for incompressible as well as compressible flows.
32nd Aerospace Sciences Meeting and Exhibit
A breakthrough in computer performance is possible, as has been demonstrated in recent years [1], by using massively parallel machines [2]. Massively parallel computers use a very large number of processors operating simultaneously in either SIMD (Single Instruction Multiple Data) mode, MIMD (Multiple Instruction Multiple Data) mode, or a combination of the two. The availability of massively parallel machines in the market created a need for software capable of taking advantage of the new technology. Several problems arise when an efficient implementation on a massively parallel machine is sought. The most time-consuming part of a massively parallel computation is the interprocessor communication rather than floating point operations [3]. Effort must thus be directed toward more efficient communication. An efficient treatment of boundary conditions is mandatory in massively parallel applications. Typically only a small portion of the …
Distributed parallel processing applied to an implicit multigrid Euler/Navier-Stokes algorithm
31st Aerospace Sciences Meeting, 1993
An implicit multigrid algorithm for the solution of the Euler and Navier-Stokes equations has been implemented within the framework of multiple block-structured grids in which the physical domain is spatially decomposed into several blocks and the solution is advanced in parallel on each block. Utilities have been developed to implement such a scheme in a distributed computing environment. The multi-block algorithm is designed so that the explicit residual calculation is identical to that of the single-block scheme, and therefore converged solutions for both schemes must be the same. To accelerate convergence, synchronous and asynchronous multigrid strategies are implemented. Significant speedups have been achieved in a multiple processor environment, while convergence rates similar to those of the single-block scheme are observed. With the recent advances in computer architecture, specifically the availability of low-cost high-speed computer workstations, the development of multiple … processor IBM ES/3090 600J supercomputer … Author footnotes: Research Scientist, Member AIAA; Professor, Sibley School of Mechanical and Aerospace Engineering, Associate Fellow AIAA. Copyright American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
CFD with adaptive FEM on massively parallel systems
Notes on Numerical Fluid Mechanics (NNFM), 1996
An explicit finite element scheme based on a two-step Taylor-Galerkin algorithm allows the solution of the Euler and Navier-Stokes equations for a wide variety of flow problems. To obtain useful results for realistic problems, one has to use grids with an extremely high density to obtain a good resolution of the interesting parts of a given flow. Since these details are often limited to small regions of the calculation domain, it is efficient to use unstructured grids to reduce the number of elements and grid points. As such calculations are very time-consuming and inherently parallel, the use of multiprocessor systems for this task seems to be a very natural idea. A common approach for parallelization is the division of a given grid, where the problem is the increasing complexity of this task for growing processor numbers. Some general ideas for this kind of parallelization and details of a Parix implementation for Transputer networks are presented. To improve the quality of the calculated solutions, an adaptive grid refinement procedure was included. This extension leads to the need for dynamic load balancing in the parallel version. An effective strategy for this task is presented, and results for up to 1024 processors show the general suitability of this approach for massively parallel systems.
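The abstract does not say which load balancing strategy the authors use. As a rough illustration of why adaptive refinement creates the need for one, the sketch below applies a simple neighbour-to-neighbour diffusion scheme to element counts on a chain of processors. The chain topology, the function name `diffuse_load`, and the example numbers are all assumptions made for this sketch.

```python
def diffuse_load(loads):
    """Balance integer element counts along a chain of processors.

    In each sweep, every neighbouring pair moves half of its load difference
    (truncated toward zero) from the heavier to the lighter processor.
    Sweeps repeat until no pair differs by more than one element.
    """
    loads = list(loads)
    moved = True
    while moved:
        moved = False
        for p in range(len(loads) - 1):
            diff = loads[p] - loads[p + 1]
            transfer = int(diff / 2)  # truncates toward zero for either sign
            if transfer != 0:
                loads[p] -= transfer
                loads[p + 1] += transfer
                moved = True
    return loads

if __name__ == "__main__":
    # Hypothetical situation: refinement quadrupled the elements on processor 0.
    before = [400, 100, 100, 100]
    after = diffuse_load(before)
    print("before:", before, "after:", after, "total:", sum(after))
```

Only neighbouring processors exchange elements here, which keeps communication local; the price is that several sweeps are needed before a load peak has spread across the whole chain.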
Parallel Computing, 2001
A parallel computational solver for the unsteady incompressible three-dimensional Navier-Stokes equations implemented for the numerical simulation of shear-flow cases is presented. The computational algorithms include Fourier expansions in the streamwise and spanwise directions, second-order centered finite differences in the direction orthogonal to the solid walls, third-order Runge-Kutta procedure in time in which both convective and diffusive terms are treated explicitly; the fractional step method is used for time marching. Based on the numerical algorithms implemented within the computational solver, three different (MPI based) parallelization strategies are devised. The three schemes are evaluated with particular attention to the impact of the communications onto the whole computational procedure, and one of them is implemented. Computations are executed on two different parallel machines and results are shown in terms of parallel performance. Processes using different number of processors combined with different number of computational grid points are tested.
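The abstract does not state which third-order Runge-Kutta variant the solver uses; flow solvers often prefer low-storage forms. As a generic illustration of an explicit third-order step, the sketch below implements the classical Kutta scheme for a scalar ODE. The function names and the test problem y' = -y are assumptions of this sketch.

```python
import math

def rk3_step(f, t, y, h):
    """One step of Kutta's classical third-order Runge-Kutta method."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + h, y - h * k1 + 2.0 * h * k2)
    return y + h * (k1 + 4.0 * k2 + k3) / 6.0

def integrate(f, y0, t_end, steps):
    """Integrate y' = f(t, y) from t = 0 to t_end with a fixed step size."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y = rk3_step(f, t, y, h)
        t += h
    return y

if __name__ == "__main__":
    decay = lambda t, y: -y  # exact solution: exp(-t)
    for steps in (50, 100):
        err = abs(integrate(decay, 1.0, 1.0, steps) - math.exp(-1.0))
        print(steps, "steps, error:", err)
```

Halving the step size should reduce the error by roughly a factor of eight, which is the signature of a third-order method.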
A space-time parallel algorithm with adaptive mesh refinement for computational fluid dynamics
Computing and Visualization in Science, 2020
This paper describes a space-time parallel algorithm with space-time adaptive mesh refinement (AMR). AMR with subcycling is added to multigrid reduction-in-time (MGRIT) in order to provide solution-efficient adaptive grids with a reduction in work performed on coarser grids. This algorithm is achieved by integrating two software libraries: XBraid (Parallel time integration with multigrid. https://computation.llnl.gov/projects/parallel-timeintegration-multigrid) and Chombo (Chombo software package for AMR applications-design document, 2014). The former is a parallel time integration library using multigrid and the latter is a massively parallel structured AMR library. Employing this adaptive space-time parallel algorithm is Chord (Comput Fluids 123:202-217, 2015), a computational fluid dynamics (CFD) application code for solving compressible fluid dynamics problems. For the same solution accuracy, speedups are demonstrated from the use of space-time parallelization over the time-sequential integration on Couette flow and Stokes' second problem. On a transient Couette flow case, at least a 1.5× speedup is achieved, and with a time periodic problem, a speedup of up to 13.7× over the time-sequential case is obtained. In both cases, the speedup is achieved by adding processors and exploring additional parallelization in time. The numerical experiments show the algorithm is promising for CFD applications that can take advantage of the time parallelism. Future work will focus on improving the parallel performance and providing more tests with complex fluid dynamics to demonstrate the full potential of the algorithm. Keywords: Time-parallel • Mesh parallel-in-time • Adaptivity • Multigrid • MGRIT • High-order CFD • Finite-volume. Communicated by Robert Speck.
Finite difference simulations of the Navier-Stokes equations using parallel distributed computing
Proceedings. 15th Symposium on Computer Architecture and High Performance Computing, 2003
This paper discusses the implementation of a numerical algorithm for simulating incompressible fluid flows based on the finite difference method and designed for parallel computing platforms with distributed memory, particularly for clusters of workstations. The solution algorithm for the Navier-Stokes equations utilizes an explicit scheme for pressure and an implicit scheme for velocities, i.e., the velocity field at a new time step can be computed once the corresponding pressure is known. The parallel implementation is based on domain decomposition, where the original calculation domain is decomposed into several blocks, each of which is assigned to a separate processing node. All nodes then execute computations in parallel, each node on its associated sub-domain. The parallel computations include initialization, coefficient generation, linear solution on the sub-domain, and inter-node communication. The exchange of information across the sub-domains, or processors, is achieved using the message passing interface standard, MPI. The use of MPI ensures portability across different computing platforms ranging from massively parallel machines to clusters of workstations. The execution time and speed-up are evaluated by comparing the performance of different numbers of processors. The results indicate that the parallel code can significantly improve prediction capability and efficiency for large-scale simulations.
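The block decomposition with inter-node exchange described in this abstract can be sketched without any MPI dependency. The example below is an illustration only, not the paper's code: it splits a 1D array into blocks with one halo value per side, replaces the MPI send/receive with an in-process copy, and uses a Jacobi smoothing sweep as a stand-in for the real solver. All function names are hypothetical.

```python
def jacobi_sweep(u):
    """One Jacobi smoothing sweep on a 1D list; the two end values stay fixed."""
    inner = [0.5 * (u[i - 1] + u[i + 1]) for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]

def split_with_halos(u, nblocks):
    """Split the interior points into contiguous blocks, each with one halo value per side."""
    base, extra = divmod(len(u) - 2, nblocks)
    blocks, start = [], 1
    for b in range(nblocks):
        size = base + (1 if b < extra else 0)
        blocks.append(u[start - 1 : start + size + 1])
        start += size
    return blocks

def exchange_halos(blocks):
    """Copy edge values between neighbouring blocks (stands in for MPI send/receive)."""
    for left, right in zip(blocks, blocks[1:]):
        left[-1] = right[1]   # first owned value of the right neighbour
        right[0] = left[-2]   # last owned value of the left neighbour

def decomposed_sweeps(u, nblocks, sweeps):
    """Run Jacobi sweeps block by block, refreshing halos before every sweep."""
    blocks = split_with_halos(u, nblocks)
    for _ in range(sweeps):
        exchange_halos(blocks)
        blocks = [jacobi_sweep(b) for b in blocks]
    result = [u[0]]
    for b in blocks:
        result.extend(b[1:-1])  # gather owned values only, dropping halos
    result.append(u[-1])
    return result

if __name__ == "__main__":
    u = [float(i * i % 7) for i in range(18)]
    serial = u
    for _ in range(10):
        serial = jacobi_sweep(serial)
    print(decomposed_sweeps(u, 4, 10) == serial)
```

Each block performs the same arithmetic on the same operands as the serial sweep, so the decomposed result should match the serial one exactly. In a real distributed code the halo copy would be the only step that requires communication.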
SIAM Journal on Scientific Computing, 2012
In this paper we describe a general adaptive finite element framework for unstructured tetrahedral meshes without hanging nodes suitable for large scale parallel computations. Our framework is designed to scale linearly to several thousands of processors, using fully distributed and efficient algorithms. The key components of our implementation, local mesh refinement and load balancing algorithms, are described in detail. Finally, we present a theoretical and experimental performance study of our framework, used in a large scale computational fluid dynamics computation, and we compare scaling and complexity of different algorithms on different massively parallel architectures.