Toward high‐performance computational chemistry: II. A scalable self‐consistent field program

High-performance computing in chemistry: NWChem

Future Generation Computer Systems, 1996

Over the last three decades the methods of quantum chemistry have developed impressively: a large number of reliable and efficient approximations to the solution of the non-relativistic Schrödinger equation and the relativistic Dirac equation are available. This is complemented by a number of well-developed computer programs that allow chemical problems to be treated as a matter of routine. This progress was acknowledged by the award of the 1998 Nobel Prize in Chemistry to John Pople and Walter Kohn for the development of quantum chemical methods.

Parallelism in computational chemistry

Theor Chem Acc, 1993

An account is given of experience gained in implementing computational chemistry application software, including quantum chemistry and macromolecular refinement codes, on distributed memory parallel processors. In quantum chemistry we consider the coarse-grained implementation of Gaussian integral and derivative integral evaluation, the direct-SCF computation of an uncorrelated wavefunction, the 4-index transformation of two-electron integrals and the direct-CI calculation of correlated wavefunctions. In the refinement of macromolecular conformations, we describe domain decomposition techniques used in implementing general purpose molecular mechanics, molecular dynamics and free energy perturbation calculations. Attention is focused on performance figures obtained on the Intel iPSC/2 and iPSC/860 hypercubes, which are compared with those obtained on a Cray Y-MP/464 and Convex C-220 minisupercomputer. From these data we deduce the cost effectiveness of parallel processors in the field of computational chemistry.

Algorithms vs. architectures for computational chemistry

1990

The algorithms employed are computationally intensive and, as a result, increased performance (both algorithmic and architectural) is required to improve accuracy and to treat larger molecular systems. Several benchmark quantum chemistry codes are examined on a variety of architectures. While these codes are only a small portion of a typical quantum chemistry library, they illustrate many of the computationally intensive kernels and data manipulation requirements of some applications. Furthermore, understanding the performance of the existing algorithm on present and proposed supercomputers serves as a guide for future programs and algorithm development. The algorithms investigated are: (1) a sparse symmetric matrix vector product; (2) a four index integral transformation; and (3) the calculation of diatomic two electron Slater integrals. The vectorization strategies are examined for these algorithms for both the Cyber 205 and Cray XMP. In addition, multiprocessor implementations of...
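As an illustration of item (2), the four-index integral transformation, the following is a minimal sketch and not the benchmark code examined in the paper. It assumes the standard textbook formulation, (pq|rs) = Σ C[a][p] C[b][q] C[c][r] C[d][s] (ab|cd), and compares a direct evaluation, which costs O(n^8), with the usual sequence of four quarter-transformations, which costs O(n^5). The function names and the dictionary storage of the integrals are hypothetical choices made for readability, not for speed.

```python
import itertools
import random


def transform_naive(ao, c):
    """Direct O(n^8) transformation of the four-index array ao by the matrix c."""
    n = len(c)
    idx = range(n)
    mo = {}
    for p, q, r, s in itertools.product(idx, repeat=4):
        total = 0.0
        for a, b, g, d in itertools.product(idx, repeat=4):
            total += c[a][p] * c[b][q] * c[g][r] * c[d][s] * ao[a, b, g, d]
        mo[p, q, r, s] = total
    return mo


def quarter_transform(t, c):
    """Transform the first index of t and rotate it to the last position."""
    n = len(c)
    idx = range(n)
    out = {}
    for p, b, g, d in itertools.product(idx, repeat=4):
        out[b, g, d, p] = sum(c[a][p] * t[a, b, g, d] for a in idx)
    return out


def transform_stepwise(ao, c):
    """O(n^5) transformation: four successive quarter-transformations."""
    t = ao
    for _ in range(4):
        # Each pass transforms one index; after four passes the indices
        # are back in their original order (p, q, r, s).
        t = quarter_transform(t, c)
    return t
```

Both routines return the same transformed integrals up to rounding; the stepwise form trades the storage of one intermediate array per pass for a reduction in operation count from n^8 to 4n^5.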

Program package MP-AM1 for parallel quantum-chemical computing in the sp-basis

A parallel realization of the NDDO-WF technique for semi-empirical quantum-chemical calculations on large molecular systems in the spd-basis is described. The technological aspects of designing scalable parallel calculations on supercomputers (using the MPI library) are discussed. The scaling of individual algorithms and of the entire package was measured for two model systems with 894 and 2014 atomic orbitals, respectively. The speedup was determined in computer experiments with the RM600 E60 and Cluster Intel PIII multi-processor systems. The effect of communication rate on the package performance is discussed.

Towards Automatic Synthesis of High-Performance Codes for Electronic Structure Calculations: Data Locality Optimization

Lecture Notes in Computer Science, 2001

The goal of our project is the development of a program synthesis system to facilitate the development of high-performance parallel programs for a class of computations encountered in computational chemistry and computational physics. These computations are expressible as a set of tensor contractions and arise in electronic structure calculations. This paper provides an overview of a planned synthesis system that will take as input a high-level specification of the computation and generate high-performance parallel code for a number of target architectures. We focus on an approach to performing data locality optimization in this context. Preliminary experimental results on an SGI Origin 2000 are encouraging and demonstrate that the approach is effective.
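To make the idea of data locality optimization concrete, here is a minimal sketch of loop tiling applied to the simplest tensor contraction, C[i][j] = Σ_k A[i][k] B[k][j]. It is a generic textbook technique, not the synthesis system or the specific optimization described in the abstract; the function names and the `tile` parameter are hypothetical. In pure Python the tiling gives no speedup; the sketch only shows how the loop nest is restructured so that each block of operands is reused before moving on.

```python
def contract(a, b):
    """Reference contraction C[i][j] = sum_k A[i][k] * B[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def contract_tiled(a, b, tile=2):
    """Same contraction with the loops blocked into tile x tile x tile chunks."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops walk over blocks; inner loops stay inside one block,
    # so the same small pieces of a, b, and c are touched repeatedly.
    for i0 in range(0, n, tile):
        for k0 in range(0, m, tile):
            for j0 in range(0, p, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for k in range(k0, min(k0 + tile, m)):
                        aik = a[i][k]
                        for j in range(j0, min(j0 + tile, p)):
                            c[i][j] += aik * b[k][j]
    return c
```

In a compiled setting the tile size would be chosen so that the working set of one block fits in cache; here it is simply a parameter.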

Design and performance characterization of electronic structure calculations on massively parallel supercomputers: a case study of GPAW on the Blue Gene/P architecture

Concurrency and Computation: Practice and Experience, 2013

Density functional theory (DFT) is the most widely employed electronic structure method due to its favorable scaling with system size and accuracy for a broad range of molecular and condensed-phase systems. The advent of massively parallel supercomputers has enhanced the scientific community's ability to study larger system sizes. Ground-state DFT calculations on ∼10^3 valence electrons using traditional O(N^3) algorithms can be routinely performed on present-day supercomputers. The performance characteristics of these massively parallel DFT codes on >10^4 computer cores are not well understood. The GPAW code was ported and optimized for the Blue Gene/P architecture. We present our algorithmic parallelization strategy and interpret the results for a number of benchmark test cases.

Parallel calculations of molecular properties

Computer Physics Communications, 2000

We discuss aspects of the parallelization of the Dalton quantum chemistry program, with particular emphasis on the calculation of second- and higher-order properties for large molecules. Our treatment includes real and imaginary perturbations, both frequency-dependent and static. The scaling behaviour of our approach, which is rather coarse-grained, is examined on different parallel platforms, including the Cray-T3E and an IBM SP with the latest multiprocessor nodes. The excellent scaling behaviour on the latter is especially significant given that the first TFLOPS computer available to the US academic community will be built from these nodes and deployed here at San Diego Supercomputer Center before the end of 1999. We then discuss applications of the code to several areas of interest in chemical physics.

Linear scaling computation of the Fock matrix. VII. Parallel computation of the Coulomb matrix

The Journal of Chemical Physics, 2004

Linear scaling quantum chemical methods for Density Functional Theory are extended to the condensed phase at the Γ-point. For the two-electron Coulomb matrix, this is achieved with a tree-code algorithm for fast Coulomb summation [J. Chem. Phys. 106, 5526 (1997)], together with multipole representation of the crystal field [J. Chem. Phys. 107, 10131 (1997)]. A periodic version of the hierarchical cubature algorithm [J. Chem. Phys. 113, 10037 (2000)], which builds a telescoping adaptive grid for numerical integration of the exchange-correlation matrix, is shown to be efficient when the problem is posed as integration over the unit cell. Commonalities between the Coulomb and exchange-correlation algorithms are discussed, with an emphasis on achieving linear scaling through the use of modern data structures. With these developments, convergence of the Γ-point supercell approximation to the k-space integration limit is demonstrated for MgO and NaCl. Linear scaling construction of the Fockian and control of error are demonstrated for RBLYP/6-21G* diamond up to 512 atoms.