A survey of parallel nonlinear dynamic analysis methodologies
Related papers
Advances in Engineering Software, 2006
This paper presents parallel computational strategies for implementing explicit nonlinear finite element analysis code on distributed memory parallel computers to solve large-scale problems in structural dynamics. Implementation on both homogeneous and heterogeneous parallel processing environments is considered in detail. Implementation of an explicit nonlinear finite element dynamic analysis code on homogeneous systems is discussed first and is then extended to heterogeneous systems. Domain decomposition with explicit message passing is preferred for the parallel implementation. The message passing in the parallel algorithm is based on MPI (Message Passing Interface) libraries. Implementation aspects of overlapped and non-overlapped domain decomposition techniques, Dynamic Task Allocation (DTA), and clustering techniques for DTA are presented along with their relative merits. The interprocessor communications are overlapped with computations to improve the performance of the domain decomposition based explicit dynamic analysis finite element code.
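To make the non-overlapped domain decomposition idea concrete, here is a minimal serial sketch, not the paper's code. It assumes a 1D chain of linear springs with lumped nodal masses, and a plain sum over subdomain contributions stands in for the MPI exchange at shared interface nodes. All function names are hypothetical.

```python
def internal_forces(u, elements, k):
    """Sum spring forces at the nodes for a list of two-node elements."""
    f = [0.0] * len(u)
    for a, b in elements:
        stretch = u[b] - u[a]
        f[a] += k * stretch   # spring pulls node a toward node b
        f[b] -= k * stretch   # equal and opposite force on node b
    return f


def explicit_step(u, v, f, m, dt):
    """Central-difference (leapfrog) update with a lumped nodal mass m."""
    v_new = [vi + dt * fi / m for vi, fi in zip(v, f)]
    u_new = [ui + dt * vi for ui, vi in zip(u, v_new)]
    return u_new, v_new


def decomposed_forces(u, subdomains, k):
    """Each subdomain assembles forces from its own elements only.

    The final sum is a stand-in (assumption) for the message exchange
    that adds up contributions at nodes shared between subdomains.
    """
    partial = [internal_forces(u, elems, k) for elems in subdomains]
    return [sum(p[i] for p in partial) for i in range(len(u))]


def simulate(u0, force_fn, m, dt, n_steps):
    """Advance the chain from rest for n_steps explicit time steps."""
    u, v = list(u0), [0.0] * len(u0)
    for _ in range(n_steps):
        u, v = explicit_step(u, v, force_fn(u), m, dt)
    return u
```

Splitting the element list into two groups that share one node and passing `decomposed_forces` to `simulate` should reproduce the single-domain result, because only the force contributions at the shared node need to be combined at each step.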
1992
This research is directed toward the numerical analysis of large, three dimensional, nonlinear dynamic problems in structural and solid mechanics. Such problems include those exhibiting large deformations, displacements, or rotations, those requiring finite strain plasticity material models that couple geometric and material nonlinearities, and those demanding detailed geometric modeling. A finite element code was developed, designed around the 3D isoparametric family of elements, and using a Total Lagrangian formulation and implicit integration of the global equations of motion. The research was conducted using the Alliant FX/8 and Convex C240 supercomputers. The research focuses on four main areas: Development of element computation algorithms that exploit the inherent opportunities for concurrency and vectorization present in the finite element method; Comparison of the preconditioned conjugate gradient method to a representative direct solver; Investigation of various nonlinea...
Sadhana, 2004
The work reported in this paper is motivated by the need to develop portable parallel processing algorithms and codes which can run on a variety of hardware platforms without any modifications. The prime aim of the research reported here is to test the portability of the parallel algorithms and to compare the efficiencies of three parallel algorithms developed for implicit time integration. The standard Message Passing Interface (MPI) is used to develop parallel algorithms for computing the nonlinear dynamic response of large structures employing an implicit time-marching scheme. The parallel algorithms presented in this paper are developed under the broad framework of the non-overlapped domain decomposition technique. Numerical studies indicate that the parallel algorithm devised employing the conventional form of the Newmark time integration algorithm is faster than the predictor-corrector form. It is also accurate and highly adaptive to fine grain computations. The group implicit algorithm is found to be far superior in performance to the other two parallel algorithms. This algorithm is better suited for large problems in a coarse grain environment, as the resulting submeshes will be large and thus permit larger time steps without losing accuracy.
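For readers unfamiliar with the conventional (displacement) form of the Newmark scheme mentioned above, a minimal sketch follows. It is an illustration only: it assumes a linear single-degree-of-freedom system with no external load, runs serially, and uses the average-acceleration parameters (beta = 1/4, gamma = 1/2). The function name is hypothetical and this is not one of the paper's parallel algorithms.

```python
import math


def newmark_sdof(m, c, k, u0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Integrate m*a + c*v + k*u = 0 with the Newmark method.

    Returns the displacement history, including the initial value.
    """
    u, v = u0, v0
    a = (-c * v0 - k * u0) / m          # initial acceleration from equilibrium
    # Effective stiffness is constant for a linear system.
    k_eff = k + gamma / (beta * dt) * c + m / (beta * dt ** 2)
    history = [u]
    for _ in range(n_steps):
        # Effective load built from the known state at the current step.
        rhs = (m * (u / (beta * dt ** 2) + v / (beta * dt)
                    + (1.0 / (2.0 * beta) - 1.0) * a)
               + c * (gamma / (beta * dt) * u + (gamma / beta - 1.0) * v
                      + dt * (gamma / (2.0 * beta) - 1.0) * a))
        u_new = rhs / k_eff
        # Recover acceleration and velocity from the Newmark relations.
        a_new = ((u_new - u) / (beta * dt ** 2) - v / (beta * dt)
                 - (1.0 / (2.0 * beta) - 1.0) * a)
        v_new = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, v, a = u_new, v_new, a_new
        history.append(u)
    return history
```

For an undamped unit oscillator released from `u0 = 1`, the computed displacement should track `math.cos(t)` closely when the time step is small relative to the period.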
A parallel mixed time integration algorithm for nonlinear dynamic analysis
Advances in Engineering Software, 2002
This paper presents a parallel mixed time integration algorithm formulated by synthesising the implicit and explicit time integration techniques. The proposed algorithm is an extension of the mixed time integration algorithms [Comput. Meth. Appl. Mech. Engng 17/18 (1979) 259; Int. J. Numer. Meth. Engng 12 (1978) 1575] being successfully employed for solving media-structure interaction problems. The parallel algorithm for the nonlinear dynamic response of structures employing the mixed time integration technique has been devised within the broad framework of domain decomposition. Concurrency is introduced into this algorithm by integrating the interface nodes with an explicit time integration technique and then solving the local submeshes with an implicit algorithm. A flexible parallel data structure has been devised to implement the parallel mixed time integration algorithm. A parallel finite element code has been developed using the portable Message Passing Interface software development environment. Numerical studies have been conducted on PARAM-10000 (an Indian parallel supercomputer) to test the accuracy and the performance of the proposed algorithm. Numerical studies indicate that the proposed algorithm is highly adaptive for parallel processing.
Parallel computation of meshless methods for explicit dynamic analysis
International Journal for Numerical Methods in Engineering, 2000
A parallel computational implementation of modern meshless methods is presented for explicit dynamic analysis. The procedures are demonstrated by application of the Reproducing Kernel Particle Method (RKPM). Aspects of a coarse grain parallel paradigm are detailed for a Lagrangian formulation using model partitioning. Integration points are uniquely defined on separate processors and particle definitions are duplicated, as necessary, so that all support particles for each point are defined locally on the corresponding processor. Several partitioning schemes are considered and a reduced graph-based procedure is presented. Partitioning issues are discussed and procedures to accommodate essential boundary conditions in parallel are presented. Explicit MPI message passing statements are used for all communications among partitions on different processors. The effectiveness of the procedure is demonstrated by highly deformable inelastic example problems.
Figure 5. Shared and unshared nodes for the partitioned RKPM model in Figure 4.
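The partitioning rule described in this abstract (integration points owned by exactly one processor, support particles duplicated where needed) can be sketched in one dimension as follows. This is an assumption-laden illustration: it uses contiguous blocks rather than the graph-based procedure of the paper, a fixed support radius, and hypothetical function names.

```python
def partition_points(points, n_parts):
    """Split integration points into contiguous, near-equal blocks.

    Each point lands in exactly one block (one block per processor).
    """
    base, extra = divmod(len(points), n_parts)
    blocks, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        blocks.append(points[start:start + size])
        start += size
    return blocks


def local_particles(block, particles, radius):
    """Indices of all particles within `radius` of any point in the block.

    A particle near a partition boundary appears in more than one
    block's list, i.e. its definition is duplicated.
    """
    needed = set()
    for x in block:
        for j, xp in enumerate(particles):
            if abs(xp - x) <= radius:
                needed.add(j)
    return sorted(needed)


def shared_particles(local_sets):
    """Particles that are held by more than one partition."""
    seen, shared = set(), set()
    for indices in local_sets:
        shared |= seen & set(indices)
        seen |= set(indices)
    return sorted(shared)
```

With the duplicated particles stored locally, each partition can evaluate its own integration points without asking neighbours for particle data; only the state of the shared particles has to be communicated.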
Nonlinear finite element problems on parallel computers
Lecture Notes in Computer Science, 1994
VECFEM is a black-box solver for the solution of a large class of nonlinear functional equations by finite element methods. It uses very robust solution methods for the linear FEM problem to compute reliably the Newton-Raphson correction and the error indicator. Kernel algorithms are conjugate gradient methods (CG) for the solution of the linear system. In this paper we present the optimal data structures on parallel computers for the matrix-vector multiplication, which is the key operation in the CG iteration, the principles of the element distribution onto the processors and the mounting of the global matrix over all processors as transformation of optimal data structures. VECFEM is portably implemented for message passing systems. Two examples with unstructured and structured grids will show the efficiency of the data structures.
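The role of the matrix-vector product as the key operation in CG can be seen in a short sketch. The code below is not VECFEM's data structure; it assumes a simple row-block distribution in which each block of sparse rows stands in for the part of the matrix held by one processor, and all names are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def distributed_matvec(blocks, x):
    """Compute y = A x, one row block at a time.

    Each block is a list of sparse rows, and each row is a list of
    (column, value) pairs. The outer loop stands in for the processors.
    """
    y = []
    for block in blocks:
        for row in block:
            y.append(sum(val * x[col] for col, val in row))
    return y


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Unpreconditioned CG for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = matvec(p)               # the only place the matrix is used
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

Because CG touches the matrix only through `matvec`, the storage layout and the distribution over processors can be changed without altering the iteration itself.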
Linear and nonlinear finite element analysis on multiprocessor computer systems
Communications in Applied Numerical Methods, 1988
Several general purpose computer systems with multiple processors operating concurrently are currently being commercially produced. Most of the present generation of finite element software was not designed to take advantage of this new technology. The purpose of this paper is to present the advantages of a new architecture for finite element programs which will operate efficiently on computer systems with any number of processors. Also, the basic approach is effective for multiprocessor computers with local or shared memory. The new computer program architecture is based on an initial application of a simple algorithm which automatically subdivides the complete finite element domain into a number of subdomains equal to the number of available processors. The resulting data structure requires a minimum of communication between processors during the formation of basic element matrices, reduction of subdomain matrices and the postprocessing of element results. The assembly and solution of the global system of subdomains can be accomplished directly or iteratively, and new concurrent solution algorithms can be introduced. Several of the ideas presented here have been tested on various types of computers. The new program architecture indicates that it is possible to obtain speed-ups of over 90 percent of the maximum theoretical values if appropriate numerical methods are employed.
2009 3rd Southern Conference on Computational Modeling, 2009
This paper presents a parallel implementation of the implicitly restarted Lanczos method for the solution of large and sparse eigenproblems that occur in modal analysis of complex structures using the finite element method. The implicitly restarted technique improves convergence of the desired eigenvalues without the penalty of loss of orthogonality, keeping the number of factorization steps modest. In the parallel solution, a subdomain by subdomain approach was implemented and overlapping and non-overlapping mesh partitions were used. Compressed data structures in the formats CSRC and CSRC/CSR were employed to store the global matrix coefficients. The parallelization of the numerical linear algebra operations present in both Krylov and implicitly restarted methods is discussed.
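As a rough illustration of the kernels involved, the sketch below pairs a compressed sparse row matrix-vector product with a basic Lanczos tridiagonalization. Several simplifications are assumed: it uses plain CSR rather than the CSRC format named in the abstract, the Lanczos loop is the basic method with full reorthogonalization rather than the implicitly restarted variant, everything runs serially, and the function names are hypothetical.

```python
import math


def csr_matvec(indptr, indices, data, x):
    """y = A x for a matrix stored in compressed sparse row (CSR) form."""
    return [
        sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
        for i in range(len(indptr) - 1)
    ]


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def lanczos(matvec, q0, m):
    """Run up to m Lanczos steps on a symmetric operator.

    Returns the diagonal (alphas) and off-diagonal (betas) entries of
    the tridiagonal matrix whose eigenvalues approximate those of A.
    """
    norm0 = math.sqrt(dot(q0, q0))
    basis = [[qi / norm0 for qi in q0]]
    alphas, betas = [], []
    for j in range(m):
        w = matvec(basis[j])
        alphas.append(dot(w, basis[j]))
        # Full reorthogonalization against every stored basis vector,
        # which guards against the loss of orthogonality noted above.
        for qk in basis:
            coeff = dot(w, qk)
            w = [wi - coeff * qki for wi, qki in zip(w, qk)]
        beta = math.sqrt(dot(w, w))
        if j == m - 1 or beta < 1e-12:
            break                    # finished, or invariant subspace found
        betas.append(beta)
        basis.append([wi / beta for wi in w])
    return alphas, betas
```

In a parallel setting the matrix-vector product and the inner products are the operations that need communication, which is why they are the natural targets for a subdomain-by-subdomain implementation.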