Developing Machine-Learned Potentials for Coarse-Grained Molecular Simulations: Challenges and Pitfalls

Challenges for machine learning force fields in reproducing potential energy surfaces of flexible molecules

The Journal of Chemical Physics, 2021

The dynamics of flexible molecules are often determined by an interplay between local chemical bond fluctuations and conformational changes driven by long-range electrostatics and van der Waals interactions. This interplay yields complex potential-energy surfaces (PES) with multiple minima and transition paths between them. In this work, we assess the performance of state-of-the-art machine learning (ML) models, namely sGDML, SchNet, GAP/SOAP, and BPNN, in reproducing such PES using limited amounts of reference data. As a benchmark, we use the cis-to-trans thermal relaxation of an azobenzene molecule, where at least three different transition mechanisms should be considered. Although the GAP/SOAP, SchNet, and sGDML models can globally achieve chemical accuracy of 1 kcal mol⁻¹ with fewer than 1000 training points, predictions depend strongly on the ML method used as well as on the local region of the PES being sampled. Within a given ML method, large differences can be found between predictions for close-to-equilibrium and transition regions, as well as for different transition mechanisms. We identify key challenges that the ML models face in learning long-range interactions, along with the intrinsic limitations of commonly used atom-based descriptors. Overall, our results suggest switching from learning the entire PES within a single model to using multiple local models with optimized descriptors, training sets, and architectures for different parts of a complex PES.
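To make the distinction between global and region-specific accuracy concrete, the following minimal sketch computes a mean absolute error (MAE) separately for each labeled region of a PES and compares it with the 1 kcal mol⁻¹ threshold quoted above. The function names, the region labels, and the assumption that energies are supplied in kcal mol⁻¹ are illustrative choices, not part of the paper's workflow.

```python
from collections import defaultdict

# Chemical-accuracy threshold quoted in the abstract (kcal/mol).
CHEMICAL_ACCURACY = 1.0


def mae_by_region(reference, predicted, regions):
    """Return {region label: mean absolute error} for paired energies.

    All three sequences must have equal length; energies are assumed
    to be in kcal/mol (an assumption of this sketch).
    """
    if not (len(reference) == len(predicted) == len(regions)):
        raise ValueError("inputs must have equal length")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for e_ref, e_pred, region in zip(reference, predicted, regions):
        sums[region] += abs(e_pred - e_ref)
        counts[region] += 1
    return {region: sums[region] / counts[region] for region in sums}


def within_chemical_accuracy(maes, threshold=CHEMICAL_ACCURACY):
    """Flag each region whose MAE is at or below the threshold."""
    return {region: mae <= threshold for region, mae in maes.items()}
```

A model can pass the global test while failing in one region, which is why the per-region breakdown is reported separately from any overall average.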

A GPU-Accelerated Machine Learning Framework for Molecular Simulation: HOOMD-blue with TensorFlow

As interest grows in applying machine learning force fields and methods to molecular simulation, there is a need for state-of-the-art inference methods that allow trained models to be used within efficient molecular simulation engines. We have designed and implemented software that integrates a scalable GPU-accelerated molecular mechanics engine, HOOMD-blue, with the machine learning (ML) package TensorFlow. TensorFlow is a GPU-accelerated, scalable, graph-based package for building tensor computation models that has been used to implement many recent innovations in deep learning and other ML tasks. TensorFlow models are constructed in Python and can be visualized or debugged using the rich set of tools implemented in the TensorFlow package. In this article, we present four major examples of tasks this software can accomplish which would normally require multiple different tools: (1) we train a neural network to reproduce a force field of a Lennard-Jones simulation; (2) we perform onlin...
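As a rough illustration of the first example named in this abstract, the sketch below evaluates the standard 12-6 Lennard-Jones pair energy and force and tabulates (distance, force) pairs of the kind a regression model could be fitted to. It uses plain Python only and does not call HOOMD-blue or TensorFlow; the function names, the distance range, and the reduced units (epsilon = sigma = 1) are assumptions made for this sketch.

```python
def lj_energy_force(r, epsilon=1.0, sigma=1.0):
    """Return (energy, force) for a 12-6 Lennard-Jones pair at distance r.

    U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)
    F(r) = -dU/dr = 24*eps*(2*(sigma/r)**12 - (sigma/r)**6) / r
    A positive force is repulsive.
    """
    if r <= 0.0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    sr12 = sr6 * sr6
    energy = 4.0 * epsilon * (sr12 - sr6)
    force = 24.0 * epsilon * (2.0 * sr12 - sr6) / r
    return energy, force


def make_training_set(n_points, r_min=0.9, r_max=3.0, epsilon=1.0, sigma=1.0):
    """Tabulate (r, force) pairs on an evenly spaced grid of distances.

    The grid bounds are hypothetical defaults, not values from the article.
    """
    if n_points < 2:
        raise ValueError("need at least two points")
    step = (r_max - r_min) / (n_points - 1)
    data = []
    for i in range(n_points):
        r = r_min + i * step
        _, force = lj_energy_force(r, epsilon, sigma)
        data.append((r, force))
    return data
```

The analytic force gives a reference against which a fitted model can be checked: it vanishes at the potential minimum, r = 2^(1/6) sigma, where the energy equals -epsilon.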

Machine Learning Force Fields and Coarse-Grained Variables in Molecular Dynamics: Application to Materials and Biological Systems

Journal of Chemical Theory and Computation, 2020

Machine learning encompasses a set of tools and algorithms that are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning promises to extract valuable information from the enormous amounts of data generated by simulations of complex systems. We provide here a review of our current understanding of the goals, benefits, and limitations of machine learning techniques for computational studies of atomistic systems, focusing on the construction of empirical force fields from ab initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.

Choosing the right molecular machine learning potential

Chemical Science

Quantum-chemistry simulations based on potential energy surfaces of molecules provide invaluable insight into the physicochemical processes at the atomistic level and yield such important observables as reaction rates and spectra....