From DFT to machine learning: recent approaches to materials science–a review

A Deep Dive into Machine Learning Density Functional Theory for Materials Science and Chemistry

2021

With the growth of computational resources, the scope of electronic structure simulations has increased greatly. Artificial intelligence and robust data analysis hold the promise to accelerate large-scale simulations and their analysis to hitherto unattainable scales. Machine learning is a rapidly growing field for the processing of such complex datasets. It has recently gained traction in the domain of electronic structure simulations, where density functional theory (DFT) takes the prominent role of the most widely used electronic structure method. Thus, DFT calculations represent one of the largest loads on academic high-performance computing systems across the world. Accelerating these with machine learning can reduce the resources required and enable simulations of larger systems. Hence, the combination of density functional theory and machine learning has the potential to rapidly advance electronic structure applications such as in-silico materials discovery and the search for new ...

Opportunities and Challenges for Machine Learning in Materials Science

Annual Review of Materials Research, 2020

Advances in machine learning have impacted myriad areas of materials science, such as the discovery of novel materials and the improvement of molecular simulations, with likely many more important developments to come. Given the rapid changes in this field, it is challenging to understand both the breadth of opportunities and the best practices for their use. In this review, we address aspects of both problems by providing an overview of the areas in which machine learning has recently had significant impact in materials science, and then we provide a more detailed discussion on determining the accuracy and domain of applicability of some common types of machine learning models. Finally, we discuss some opportunities and challenges for the materials community to fully utilize the capabilities of machine learning.

Dataset and scripts for A Deep Dive into Machine Learning Density Functional Theory for Materials Science and Chemistry

2021

This dataset contains additional data for the publication "A Deep Dive into Machine Learning Density Functional Theory for Materials Science and Chemistry". Its goal is to enable interested readers to reproduce the citation analysis carried out in that publication.
Prerequisites: The following software versions were used for the Python version of this dataset: Python 3.8.6, Scholarly 1.2.0, Pyzotero 1.4.24, Numpy 1.20.1.
Contents: results/ contains the .csv files that were the results of the citation analysis (paper groupings follow the ones outlined in the publication); scripts/ contains scripts to perform the citation analysis; Zotero.cached.pkl contains the cached Zotero library.
Usage: In order to reproduce the results of the citation analysis, you can use citation_analysis.py in conjunction with the cached Zotero library. Manual additions can be verified using the check_c...

Using Machine Learning to Find New Density Functionals

ArXiv, 2021

Machine learning has now become an integral part of research and innovation. The field of machine learning density functional theory has continuously expanded over the years while making several noticeable advances. We briefly discuss the status of this field and point out some current and future challenges. We also talk about how state-of-the-art science and technology tools can help overcome these challenges. This draft is a part of the "Roadmap on Machine Learning in Electronic Structure" to be published in Electronic Structure (EST).

Materials science in the artificial intelligence age: high-throughput library generation, machine learning, and a pathway from correlations to the underpinning physics

MRS Communications

The use of advanced data analytics and applications of statistical and machine learning approaches ('AI') in materials science has experienced explosive growth recently. In this prospective, we review recent work focusing on generation and application of libraries from both experiment and theoretical tools, across length scales. The available library data both enables classical correlative machine learning and opens the pathway for exploration of underlying causative physical behaviors. We highlight the key advances facilitated by this approach, and illustrate how modeling, macroscopic experiments and atomic-scale imaging can be combined to dramatically accelerate understanding and development of new material systems via a statistical physics framework. These developments point towards a data-driven future wherein knowledge can be aggregated and used collectively, accelerating the advancement of materials science.

Evolving the Materials Genome: How Machine Learning Is Fueling the Next Generation of Materials Discovery

Annual Review of Materials Research

Machine learning, applied to chemical and materials data, is transforming the field of materials discovery and design, yet significant work is still required to fully take advantage of machine learning algorithms, tools, and methods. Here, we review the accomplishments to date of the community and assess the maturity of state-of-the-art, data-intensive research activities that combine perspectives from materials science and chemistry. We focus on three major themes—learning to see, learning to estimate, and learning to search materials—to show how advanced computational learning technologies are rapidly and successfully used to solve materials and chemistry problems. Additionally, we discuss a clear path toward a future where data-driven approaches to materials discovery and design are standard practice.

An Investigation of Machine Learning Methods Applied to Structure Prediction in Condensed Matter

Materials characterization remains a significant, time-consuming undertaking. Generally speaking, spectroscopic techniques are used in conjunction with empirical and ab-initio calculations in order to elucidate structure. These experimental and computational methods typically require significant human input and interpretation, particularly with regard to novel materials. Recently, the application of data mining and machine learning to problems in materials science has shown great promise in reducing this overhead [1]. In the work presented here, several aspects of machine learning are explored with regard to characterizing a model material, titania, using solid-state Nuclear Magnetic Resonance (NMR). Specifically, a large dataset of ⁴⁷Ti NMR spectra is generated using ab-initio calculations for generated TiO₂ structures. Principal Components Analysis (PCA) reveals that input spectra may be compressed by more than 90% before being used for subsequent machine learning. Two key methods are used to learn the complex mapping between structural details and input NMR spectra, demonstrating excellent accuracy when presented with test sample spectra. This work compares Support Vector Regression (SVR) and Artificial Neural Networks (ANNs), as one step towards the construction of an expert system for solid-state materials characterization.
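The PCA compression step mentioned in this abstract can be illustrated with a minimal sketch. It does not use the paper's NMR dataset: the spectra below are synthetic, built from three hypothetical Gaussian peaks plus small noise, and the 99% variance threshold is an assumption chosen for illustration. PCA is computed here through a singular value decomposition in NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for spectra: 200 samples, 100 bins, built from 3 latent peaks.
grid = np.linspace(0.0, 1.0, 100)
centers = (0.25, 0.5, 0.75)
basis = np.stack([np.exp(-((grid - c) ** 2) / 0.002) for c in centers])
weights = rng.uniform(0.0, 1.0, size=(200, 3))
spectra = weights @ basis + 1e-3 * rng.normal(size=(200, 100))

# PCA via SVD of the mean-centered data matrix.
mean = spectra.mean(axis=0)
_, singular_values, components = np.linalg.svd(spectra - mean, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# Smallest number of components that retains 99% of the variance (assumed threshold).
k = int(np.searchsorted(np.cumsum(explained), 0.99) + 1)
compressed = (spectra - mean) @ components[:k].T

# Fraction of the original feature dimension that is discarded.
compression = 1.0 - k / spectra.shape[1]
```

The `compressed` array, rather than the full spectra, would then serve as input to a regressor such as SVR or a neural network.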

Machine learning with force-field-inspired descriptors for materials: Fast screening and mapping energy landscape

Physical Review Materials

We present a complete set of chemo-structural descriptors to significantly extend the applicability of machine learning (ML) in material screening and mapping the energy landscape for multicomponent systems. These new descriptors allow differentiating between structural prototypes, which is not possible using the commonly used chemical-only descriptors. Specifically, we demonstrate that the combination of pairwise radial, nearest-neighbor, bond-angle, dihedral-angle and core-charge distributions plays an important role in predicting formation energies, bandgaps, static refractive indices, magnetic properties, and modulus of elasticity for three-dimensional (3D) materials as well as exfoliation energies of two-dimensional (2D) layered materials. The training data consists of 24549 bulk and 616 monolayer materials taken from the JARVIS-DFT database. We obtained very accurate ML models using a gradient boosting algorithm. Then we use the trained models to discover exfoliable 2D-layered materials satisfying specific property requirements. Additionally, we integrate our formation energy ML model with a genetic algorithm for structure search to verify if the ML model reproduces the DFT convex hull. This verification establishes a more stringent evaluation metric for the ML model than what is commonly used in data sciences. Our learnt model is publicly available on the JARVIS-ML website (https://www.ctcms.nist.gov/jarvisml) for property predictions of generalized materials.
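To illustrate why a structural descriptor can separate prototypes that a chemical-only descriptor cannot, here is a minimal sketch of a pairwise radial distribution. It is a simplified stand-in, not the descriptor set from the paper: it ignores periodic images and atomic species, and the function name, cutoff, and bin count are hypothetical.

```python
import math
from itertools import combinations

def radial_histogram(positions, r_max=5.0, n_bins=20):
    """Normalized histogram of pairwise distances below r_max (no periodic images)."""
    counts = [0] * n_bins
    width = r_max / n_bins
    for a, b in combinations(positions, 2):
        r = math.dist(a, b)
        if r < r_max:
            counts[int(r / width)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins

# Two hypothetical four-atom arrangements with identical composition:
# a unit square and a straight chain, both with unit nearest-neighbor spacing.
square = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
chain = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]

square_desc = radial_histogram(square)
chain_desc = radial_histogram(chain)
```

Both arrangements contain the same atoms, so a composition-only feature vector would be identical for them, while the two histograms differ. Vectors of this kind could then be passed to a gradient boosting regressor.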

Efficient Prediction of Structural and Electronic Properties of Hybrid 2D Materials Using Complementary DFT and Machine Learning Approaches

Advanced Theory and Simulations, 2018

There are now, in principle, a limitless number of hybrid van der Waals (vdW) heterostructures that can be built from the rapidly growing number of 2D layers. The key question is how to explore this vast parameter space in a practical way. Computational methods can guide experimental work. However, even the most efficient electronic structure methods, such as density functional theory (DFT), are too time-consuming to explore more than a tiny fraction of all possible hybrid 2D materials. A combination of DFT and machine learning techniques provides a practical method for exploring this parameter space much more efficiently than by DFT or experiments alone. As a proof of concept, this methodology is applied to predict the interlayer distance and band gap of bilayer heterostructures. The methods quickly and accurately predict these important properties for a large number of hybrid 2D materials. This work paves the way for rapid computational screening of the vast paramete...

Machine learning based prediction of the electronic structure of quasi-one-dimensional materials under strain

Physical Review B

We present a machine-learning-based model that can predict the electronic structure of quasi-one-dimensional materials while they are subjected to deformation modes such as torsion and extension/compression. The technique described here applies to important classes of materials systems such as nanotubes, nanoribbons, nanowires, miscellaneous chiral structures and nanoassemblies, for all of which tuning the interplay of mechanical deformations and electronic fields (i.e., strain engineering) is an active area of investigation in the literature. Our model incorporates global structural symmetries and atomic relaxation effects, benefits from the use of helical coordinates to specify the electronic fields, and makes use of a specialized data generation process that solves the symmetry-adapted equations of Kohn-Sham Density Functional Theory in these coordinates. Using armchair single-wall carbon nanotubes as a prototypical example, we demonstrate the use of the model to predict the fields associated with the ground state electron density and the nuclear pseudocharges when three parameters (the radius of the nanotube, its axial stretch, and the twist per unit length) are specified as inputs. Other electronic properties of interest, including the ground state electronic free energy, can be evaluated from these predicted fields with low-overhead post-processing, typically to chemical accuracy. Additionally, we show how the nuclear coordinates can be reliably determined from the predicted pseudocharge field using a clustering-based technique. Remarkably, only about 120 data points are found to be enough to predict the three-dimensional electronic fields accurately, which we ascribe to the constraints imposed by symmetry in the problem setup, the use of low-discrepancy sequences for sampling, and efficient representation of the intrinsic low-dimensional features of the electronic fields. We comment on the interpretability of our machine learning model and anticipate that our framework will find utility in the automated discovery of low-dimensional materials, as well as the multi-scale modeling of such systems.
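The abstract credits low-discrepancy sampling for part of the model's data efficiency but does not say which sequence was used. As a minimal sketch, assuming a Halton sequence, the three input parameters could be sampled as follows. The parameter ranges, units, and function names are hypothetical and chosen only for illustration.

```python
def halton(index, base):
    """Return the index-th element (1-based) of the van der Corput sequence in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def sample_parameters(n_points, bounds):
    """Map Halton points in the unit cube onto the given (low, high) bounds."""
    bases = (2, 3, 5)  # one coprime base per dimension
    samples = []
    for i in range(1, n_points + 1):
        point = tuple(
            low + halton(i, base) * (high - low)
            for base, (low, high) in zip(bases, bounds)
        )
        samples.append(point)
    return samples

# Hypothetical ranges: radius (nm), axial stretch (ratio), twist per unit length (rad/nm).
bounds = [(0.5, 2.0), (0.95, 1.05), (0.0, 0.1)]
samples = sample_parameters(120, bounds)
```

Compared with independent uniform random draws, a sequence like this covers the three-dimensional parameter box more evenly for the same number of points, which matters when each point requires a full DFT calculation.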