Early experiences and results on parallelizing discrete dislocation dynamics simulations on multi-core architectures
Related papers
Procedia Computer Science, 2010
Materials science simulations are among the leading applications for scientific supercomputing. Discrete dislocation dynamics (DDD) is a numerical tool used to model the plastic behavior of crystalline materials using the elastic theory of dislocations. DDD simulations require very long running times to produce meaningful scientific results. This paper presents early experiences and results on improving the running time of Micromegas, an application code for three-dimensional DDD simulations. We used open-source profiling and tracing tools to analyze the behavior and performance of Micromegas and to identify its performance bottlenecks. The major performance bottleneck of Micromegas, which accounts for 68% of the total sequential run time, is parallelized using OpenMP. Evaluation and validation tests conducted on a Nehalem quad-core processor show a 50% improvement in the simulation time for 3-D DDD over 100,000 time steps. The correctness of the scientific data produced by the parallel Micromegas is successfully validated against that of the serial version.
Early experiences and results on parallelizing discrete dislocation dynamic code
Materials science simulations are among the leading applications for scientific supercomputing. Discrete dislocation dynamics (DDD) is a numerical tool used to model the plastic behavior of crystalline materials using the elastic theory of dislocations. DDD simulations require very long running times to produce meaningful scientific results. This work presents early experiences and results on improving the running time of Micromegas, an application code for three-dimensional DDD simulations. We used open-source profiling and tracing tools to analyze the behavior and performance of Micromegas and to identify its performance bottlenecks. The major performance bottleneck of Micromegas, which accounts for ∼73% of the total sequential run time, is parallelized using OpenMP. Evaluation and validation tests conducted on a Nehalem quad-core processor show a ∼50% improvement in the simulation time for 3-D DDD over 30,000 time steps. The correctness and accuracy of the scientific data produced by the parallel Micromegas are successfully validated against those of the original version.
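The two versions of this abstract report that the parallelized bottleneck covers 68% and ∼73% of the sequential run time, and that a quad-core processor gives about a 50% improvement. As a rough plausibility check, and not something taken from the paper, Amdahl's law gives an upper bound on what four cores can deliver for those fractions. The sketch below assumes that "improvement in the simulation time" means a reduction in wall-clock run time and that the parallel part scales ideally; the function names are illustrative.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the run time scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def runtime_reduction(parallel_fraction: float, workers: int) -> float:
    """Fraction of the original run time saved, i.e. 1 - 1/speedup."""
    return 1.0 - 1.0 / amdahl_speedup(parallel_fraction, workers)


if __name__ == "__main__":
    # Fractions quoted in the two abstracts; 4 workers for a quad-core CPU.
    for fraction in (0.68, 0.73):
        s = amdahl_speedup(fraction, 4)
        r = runtime_reduction(fraction, 4)
        print(f"p={fraction:.2f}: speedup <= {s:.2f}x, run time reduced by <= {r:.0%}")
```

Under these assumptions the bound is a run-time reduction of roughly 51% for a 68% parallel fraction and roughly 55% for 73%, so the reported figure of about 50% sits close to the ideal limit for four cores.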
Numerical methods to improve the computing efficiency of discrete dislocation dynamics simulations
Journal of Computational Physics, 2006
Dislocation dynamics (DD) is a method to simulate the collective dynamic behavior of dislocations and the plasticity of metals on a mesoscopic scale. A DD simulation is computationally demanding because the stress field of a dislocation segment is long-ranged and because possible intersections between dislocation segments must be checked during their motion. The computing efficiency of a serial DD code is enhanced by using the so-called 'box method'. The box method employing 21³ boxes achieves a 30-fold speedup in a case involving 20,000 segments. The modified serial DD code has then been parallelized using the standard Message Passing Interface (MPI). Both the stress computation and the handling of segment intersections have been parallelized using the domain decomposition method. A performance test on the IBM p690 architecture shows that the parallel scheme adds a further 20-fold speedup when using 36 processors. Thus the parallel DD code presented here is about 600 times faster than the previous code. We present a parallel algorithm for the highly complex dependencies in handling segment intersections, together with the performance test results, in detail.
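The abstract does not spell out how the box method works. A minimal sketch of the general idea behind such spatial-binning schemes, under the assumption that segments are assigned to boxes by their midpoints and that only segments in the same or adjacent boxes are treated as near neighbours, might look like the following. The cubic cell, the absence of periodic wrapping, and all function names are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from itertools import product


def box_index(point, cell_size, boxes_per_side):
    """Map a 3-D point inside a cubic cell to integer box coordinates."""
    width = cell_size / boxes_per_side
    # min() keeps points lying exactly on the upper face inside the last box.
    return tuple(min(int(c // width), boxes_per_side - 1) for c in point)


def build_boxes(midpoints, cell_size, boxes_per_side):
    """Group segment ids by the box that contains each segment midpoint."""
    boxes = defaultdict(list)
    for seg_id, point in enumerate(midpoints):
        boxes[box_index(point, cell_size, boxes_per_side)].append(seg_id)
    return boxes


def near_segments(boxes, home, boxes_per_side):
    """Segment ids in `home` and its up-to-26 neighbours (no periodic wrap)."""
    found = []
    for offset in product((-1, 0, 1), repeat=3):
        neighbour = tuple(h + o for h, o in zip(home, offset))
        if all(0 <= n < boxes_per_side for n in neighbour):
            found.extend(boxes.get(neighbour, []))
    return sorted(found)
```

With `boxes_per_side=21` this would give the 21³ boxes mentioned in the abstract. Segments outside the neighbour set would then be candidates for a cheaper, approximate far-field treatment, which is where the saving in such schemes typically comes from.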
A parallel algorithm for 3D dislocation dynamics
Journal of Computational Physics, 2006
Dislocation dynamics (DD), a discrete dynamic simulation method in which dislocations are the fundamental entities, is a powerful tool for investigation of plasticity, deformation and fracture of materials at the micron length scale. However, severe computational difficulties arising from complex, long-range interactions between these curvilinear line defects limit the application of DD in the study of large-scale plastic deformation. We present here the development of a parallel algorithm for accelerated computer simulations of DD. By representing dislocations as a 3D set of dislocation particles, we show here that the problem of an interacting ensemble of dislocations can be converted to a problem of a particle ensemble, interacting with a long-range force field. A grid using binary space partitioning is constructed to keep track of node connectivity across domains. We demonstrate the computational efficiency of the parallel micro-plasticity code and discuss how O(N) methods map naturally onto the parallel data structure. Finally, we present results from applications of the parallel code to deformation in single crystal fcc metals.
High Performance Computing in Science and Engineering '08
A parallel discrete dislocation dynamics tool is employed to study the size dependent plasticity of small metallic structures. The tool has been parallelised using OpenMP. An excellent overall scaling is observed for different loading scenarios. The size dependency of the plastic flow is confirmed by the performed simulations for uniaxial loading and micro-bending tests. The microstructural origin of the size effect is analysed. A strong influence of the initial microstructure on the statistics of the deformation behaviour is observed, for both the uniaxial and bending scenario.
Large-Scale 3D Phase Field Dislocation Dynamics Simulations On High-Performance Architectures
International Journal of High Performance Computing Applications, 2011
In this paper we present the development and performance of a three-dimensional phase field dislocation dynamics (3D PFDD) model for large-scale dislocation-mediated plastic deformation on high-performance architectures. Through the parallelization of this algorithm, efficient run times can be achieved for large-scale simulations. The algorithm's performance is analyzed over several computing platforms including Infiniband, GigE, and proprietary (SiCortex) interconnects. Scalability is considered on data sets up to 2,048³, along with the efficiency on up to 2,048 processors. Results show that scalability improves as the size of the data set increases and that the overall performance is best on the Infiniband interconnect. In addition, a performance model has been developed to predict run times and efficiency on large sets of data running on multiple processors. This performance analysis shows that this parallel code is capable of harnessing the greater computer power available from petascale systems.
Accelerating force calculation for dislocation dynamics simulations
arXiv (Cornell University), 2023
Discrete dislocation dynamics (DDD) simulations offer valuable insights into the plastic deformation and work-hardening behavior of metals by explicitly modeling the evolution of dislocation lines under stress. However, the computational cost associated with calculating forces due to the long-range elastic interactions between dislocation segment pairs is one of the main causes that limit the achievable strain levels in DDD simulations. These elastic interaction forces can be obtained either from the integral of the stress field due to one segment over the other segment, or from the derivatives of the elastic interaction energy. In both cases, the results involve a double integral over the two interacting segments. Currently, existing DDD simulations employ the stress-based approach with both integrals evaluated either from analytical expressions or from numerical quadrature. In this study, we systematically analyze the accuracy and computational cost of the stress-based and energy-based approaches with different ways of evaluating the integrals. We find that the stress-based approach is more efficient than the energy-based approach. Furthermore, the stress-based approach becomes most cost-effective when one integral is evaluated from an analytic expression and the other integral from numerical quadrature. For well-separated segment pairs whose center distances are more than three times their lengths, this one-analytic-integral and one-numerical-integral approach is more than three times faster than the fully analytic approach, while the relative error in the forces is less than 10⁻³. Because the vast majority of segment pairs in a typical simulation cell are well-separated, we expect the hybrid analytic/numerical approach to significantly boost the numerical efficiency of DDD simulations of work hardening.
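The separation criterion in this abstract (center distance more than three times the segment lengths) can be turned into a simple pair classifier. The sketch below is only an illustration: it assumes straight segments given by their end points, compares the center distance against the longer of the two segment lengths, and falls back to a fully analytic evaluation for close pairs. None of those choices, nor the function names, are stated in the abstract.

```python
import math


def segment_length(seg):
    """Euclidean length of a straight segment given as (start, end) points."""
    start, end = seg
    return math.dist(start, end)


def segment_center(seg):
    """Midpoint of a straight segment given as (start, end) points."""
    start, end = seg
    return tuple((a + b) / 2.0 for a, b in zip(start, end))


def is_well_separated(seg_a, seg_b, ratio=3.0):
    """True when the center distance exceeds `ratio` times the longer length."""
    distance = math.dist(segment_center(seg_a), segment_center(seg_b))
    longest = max(segment_length(seg_a), segment_length(seg_b))
    return distance > ratio * longest


def choose_integration(seg_a, seg_b):
    """Pick a label for the force-evaluation strategy for one segment pair."""
    # Assumption: close pairs use the fully analytic route, distant pairs
    # the one-analytic-integral plus one-numerical-integral route.
    return "hybrid" if is_well_separated(seg_a, seg_b) else "analytic"
```

For example, two parallel unit-length segments whose centers are 5 units apart would be labelled `"hybrid"`, while the same segments 2 units apart would be labelled `"analytic"`.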
A Monte Carlo method for simulating dislocation microstructures in three dimensions
Computational Materials Science, 2001
We present an energy-based Monte Carlo (MC) approach for modeling discrete dislocation microstructures in three dimensions. We discuss the energetics of dislocation interactions and the various discretization procedures for the dislocation loops. The choice of MC trial moves for the dislocation segments is presented and discussed. We show results from some relatively simple trial calculations, showing the convergence of the calculations to known results as well as the flexibility of the MC approach. Comparisons with other types of dislocation simulations are made, indicating the range of suitability for the present method.
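The abstract does not say which acceptance rule is applied to the trial moves. A common choice for energy-based MC is the Metropolis criterion, and the sketch below assumes it purely for illustration. The `energy` and `propose` callables stand in for the dislocation energetics and segment trial moves, which are not specified here, and `temperature` is expressed in energy units (that is, k_B T).

```python
import math
import random


def metropolis_accept(delta_energy, temperature, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if delta_energy <= 0.0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)


def run_chain(energy, propose, state, temperature, steps, seed=0):
    """Run a Metropolis chain and return the final state and its energy."""
    rng = random.Random(seed)
    current = energy(state)
    for _ in range(steps):
        trial = propose(state, rng)          # hypothetical trial move
        trial_energy = energy(trial)
        if metropolis_accept(trial_energy - current, temperature, rng):
            state, current = trial, trial_energy
    return state, current
```

As a toy usage example, `run_chain(lambda x: x * x, lambda x, rng: x + rng.uniform(-0.5, 0.5), 5.0, 0.01, 2000)` relaxes a single coordinate in a quadratic well toward its minimum; a dislocation code would replace the state with the discretized loops and the energy with the elastic interaction energy.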