Boris Fackovec | University of Cambridge
Papers by Boris Fackovec
Octave scripts for easier reproduction of our results for 2D and 3D clusters of six Morse particles. Contact Boris Fackovec with questions about script usage or regarding input and output files. Another release containing detailed instructions may be provided depending on demand.
The Journal of Chemical Physics, 2015
A method is derived to coarse-grain the dynamics of complex molecular systems to a Markov jump process (MJP) describing how the system jumps between cells that fully partition its state space. The main inputs are relaxation times for each pair of cells, which are shown to be robust with respect to positioning of the cell boundaries. These relaxation times can be calculated via molecular dynamics simulations performed in each cell separately and are used in an efficient estimator for the rate matrix of the MJP. The method is illustrated through applications to Sinai billiards and a cluster of Lennard-Jones discs.
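The link between pairwise relaxation times and a rate matrix can be illustrated with a minimal sketch. It is not the estimator derived in the paper: it assumes that each pair of cells relaxes as an isolated two-state system obeying detailed balance, so that the relaxation time tau satisfies 1/tau = k_ab + k_ba. The function names `two_state_rates` and `rate_matrix`, and the convention dp/dt = K p with K[b][a] the rate from cell a to cell b, are hypothetical choices made for this example.

```python
def two_state_rates(tau, pop_a, pop_b):
    """Rates a->b and b->a for an isolated two-state system.

    Assumption of this sketch: 1/tau = k_ab + k_ba and detailed balance,
    k_ab * pop_a = k_ba * pop_b.
    """
    total = pop_a + pop_b
    k_ab = (pop_b / total) / tau
    k_ba = (pop_a / total) / tau
    return k_ab, k_ba


def rate_matrix(populations, relaxation_times):
    """Assemble K (dp/dt = K p) from pairwise relaxation times.

    populations      -- equilibrium populations of the cells
    relaxation_times -- dict mapping a pair of cell indices (a, b) to tau
    K[b][a] holds the rate from cell a to cell b; each column sums to zero.
    """
    n = len(populations)
    K = [[0.0] * n for _ in range(n)]
    for (a, b), tau in relaxation_times.items():
        k_ab, k_ba = two_state_rates(tau, populations[a], populations[b])
        K[b][a] += k_ab
        K[a][b] += k_ba
    for a in range(n):
        # Diagonal entry: minus the total escape rate out of cell a.
        K[a][a] = -sum(K[b][a] for b in range(n) if b != a)
    return K


if __name__ == "__main__":
    # Three cells in a chain 0 - 1 - 2 with made-up populations and times.
    K = rate_matrix([0.5, 0.3, 0.2], {(0, 1): 2.0, (1, 2): 4.0})
    for row in K:
        print(row)
```

Because every pair satisfies detailed balance by construction, the supplied equilibrium populations are a stationary vector of the assembled matrix, which is a convenient sanity check on any rate matrix built this way.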
This work presents a new method for calculating rate constants for configurational transitions described in terms of a master equation. The method is based on constraining molecular dynamics simulations to boxes in configuration space, and is also known as “boxed molecular dynamics”. Rate constants can be easily calculated even for systems deviating from an exponential distribution of the first passage times, as a result of the presence of internal barriers and roughness of the dividing surfaces. The theoretical justification of the method is based on the concept of mean first passage times. One of the assumptions of the reactive flux formulation is omitted; regression of the population evolution is used instead of calculation of the rate constant at a single point, so the distribution of first passage times is not required to be strictly exponential. The efficiency and correctness of the new method, boxed molecular dynamics in the first passage time formulation of the rate constants (FPT-BXD), is demonstrated for toy models. Preliminary results of simulations for a cluster of Lennard-Jones discs using FPT-BXD are discussed.
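The idea of obtaining a rate constant by regression over the population evolution, rather than from a single time point, can be sketched as follows. This is only an illustration and not the FPT-BXD procedure itself: it assumes the supplied time window is one in which the population of the initial box decays approximately exponentially, and the function name `rate_from_population_decay` is hypothetical.

```python
import math


def rate_from_population_decay(times, populations):
    """Estimate a rate constant from a decaying population.

    Fits ln(population) against time by ordinary least squares and
    returns minus the slope. Assumption of this sketch: the decay is
    close to single-exponential over the supplied time window.
    """
    logs = [math.log(p) for p in populations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic data: population decaying with a made-up rate constant.
    ts = [0.1 * i for i in range(50)]
    ps = [math.exp(-0.7 * t) for t in ts]
    print(rate_from_population_decay(ts, ps))
```

Using many time points averages out noise in any single population value. When the early decay is visibly non-exponential, restricting the fit to later times is one possible choice, although that choice is an assumption of this sketch rather than something stated in the abstract.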
The Journal of Physical Chemistry B, 2012
Although a contact is an essential measurement for the topology as well as strength of non-covalent interactions in biomolecules and their complexes, there is no general agreement on the definition of this feature. Most of the definitions work with simple geometric criteria which do not fully reflect the energy content or ability of the biomolecular building blocks to arrange their environment. We offer a reasonable solution to this problem by distinguishing between "productive" and "non-productive" contacts based on their interaction energy strength and properties. We have proposed a method which converts the protein topology into a contact map that represents interactions with statistically significant high interaction energies. We do not prove that these contacts are exclusively stabilizing, but they represent a gateway to thermodynamically important rather than geometry-based contacts. The process is based on protein fragmentation and calculation of interaction energies using the OPLS force field and relies on pairwise additivity of amino acid interactions. Our approach integrates the treatment of different types of interactions, avoiding the problems resulting from different contributions to the overall stability and the different effect of the environment. The first applications to a set of homologous proteins have shown the usefulness of this classification for a sound estimate of protein stability.
Complex reaction networks (CRN) prove to be excellent models for complex chemical systems like chemical reactors or cellular compartments. Dynamics of the modeled system can be studied qualitatively using graph theoretical methods and convex analysis. In this work, a methodology for qualitative analysis of CRN is developed and implemented in MATLAB/Octave. Emphasis is put on comfort of the user and developer, i.e. on straightforward usage and lucidity of the code. A program for decomposition of a network into extreme pathways and for determination of their stability is implemented based on the literature. An algorithm for automatic classification of potential oscillators is invented and an efficient genetic algorithm for finding Hopf bifurcations is proposed. The developed software is used to analyze 5 representative relevant CRN models.
The folding free energy of a protein is a delicate balance between stabilizing and destabilizing non-covalent interactions. In this work, we decompose the folding free energy into physically meaningful contributions, in which we aim to find general trends. An empirical potential is used to calculate the interaction energy between all protein fragments, which are classified based on their dominant term in the multipolar expansion. Calculations are done using 1200 non-redundant structures from the PDB database. Based on general trends found in interactions between these fragments, we attempt to better understand relationships between interaction energies calculated using computational chemistry methods and their corresponding free energy contributions to stabilization.
An amino-acid in proteins shows two different, yet mutually dependent faces connected through the polymer character of a protein in the final product. They are the amino-acid sidechain and its corresponding backbone part. On the level of the side-chains, we often refer to specific structural arrangements such as hydrophobic cluster motifs, salt-bridge motifs or hydrogen-bond motifs characterizing various parts of a protein and usually assigned to a certain function. The backbone on the other hand offers limited, yet general structural motifs - and random coil patterns. All of these mentioned amino-acid features contribute to the synergy demonstrated observably by protein stability and protein function. Thermal stability is one of the most important features of the structure of a fully folded protein. It is defined as the difference in the Gibbs free energy between its native and denatured states and as such is a function of temperature and implicitly a function of protein composition and the effect of the environment. Nevertheless, it is necessary to say that for this function we do not yet know the precise and general form which could be applicable for a large set of proteins. There have been many attempts to propose an intuitive, yet productive decomposition of Gibbs free stabilization energy (GFSE) into simple terms. One of the scenarios utilized for such purposes is that the total free energy is the sum of the free energies of various atomic groups and the hydrophobic effect. However, as the free energy is not additive and the fractionation of free energy to independent terms is difficult, this attempt has been quite unsuccessful. The utilization of molecular modeling methodology and tools has opened a more systematic and perhaps more promising approach: the evaluation of the enthalpy term in the equation for Gibbs free energy with reasonable accuracy. The remaining entropy term could be obtained by fitting the corresponding analytical form to the experimental data. There are basically three different enthalpy contributions that we can separate. The first comes from the intramolecular interactions between the atoms of proteins, producing the largest stabilizing enthalpy contribution. The second comes from the interactions between the molecules of a solvent, and finally the third contribution is the result of the interactions between the atoms of the solute (protein) and the solvent.
The program takes a system of ordinary differential equations with one parameter and a starting point as input and calculates one curve of stationary solutions and its stability. The algorithm was proposed by Kubicek in 1976. Let N be the number of differential equations and M be the number of continuation steps. The time complexity of the code is O(N^3 M). It stores the whole trajectory in RAM, therefore it has O(M) memory consumption for long trajectories (in usual cases M > N^2) and O(N^2) for short ones.
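The task described above, following a curve of stationary solutions as a parameter changes and reporting stability, can be sketched for the simplest case of a single equation dx/dt = f(x, p). The sketch uses plain natural-parameter continuation with Newton corrections, not Kubicek's algorithm, so it would fail at turning points of the curve. The function name `continue_stationary` and the example right-hand side are hypothetical.

```python
def continue_stationary(f, dfdx, x0, p_start, p_end, steps,
                        tol=1e-12, max_newton=50):
    """Trace stationary solutions f(x, p) = 0 of dx/dt = f(x, p).

    The parameter p is stepped uniformly from p_start to p_end; at each
    value the previous solution is refined by Newton's method. Returns a
    list of (p, x, stable) tuples, where stable means df/dx < 0.
    The whole branch is stored, so memory grows linearly with `steps`.
    """
    branch = []
    x = x0
    for i in range(steps + 1):
        p = p_start + (p_end - p_start) * i / steps
        for _ in range(max_newton):
            step = f(x, p) / dfdx(x, p)  # scalar analogue of a linear solve
            x -= step
            if abs(step) < tol:
                break
        branch.append((p, x, dfdx(x, p) < 0))
    return branch


if __name__ == "__main__":
    # Example: dx/dt = p - x - x**3 has one stable stationary state per p.
    f = lambda x, p: p - x - x ** 3
    dfdx = lambda x, p: -1.0 - 3.0 * x ** 2
    for p, x, stable in continue_stationary(f, dfdx, 0.0, 0.0, 2.0, 10):
        print(p, x, stable)
```

For a system of N equations the scalar division becomes a linear solve with the N-by-N Jacobian, which is where a cost of order N^3 per continuation step would come from, and stability would be read from the signs of the real parts of the Jacobian eigenvalues instead of a single derivative.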
The interaction energy matrix concept is a promising approach to study protein folding, binding and function. The concept of a hydrophobic core provides a valuable opportunity to unify thermodynamic, kinetic, evolutionary and structural points of view. A method guaranteeing transferable and objective identification of key residues can make their further investigation possible and is the main purpose of this work.
The thesis introduces the statistical mechanics, molecular modeling and structural biology background essential for theoretical modeling of protein folding. A graph representation of proteins in energy space is utilized to characterize energy proportions of sidechains and the importance of particular contact types. A new definition of contact is proposed and, finally, established energy quantities are used to determine residue importance.
Thesis Chapters by Boris Fackovec
PhD thesis, 2016
Simulating the dynamics of rare events is a great challenge for current computational chemistry. It is generally accepted that some dynamical properties cannot be properly studied without propagation of reactive trajectories, yet most current approaches are derived from transition state theory (TST), calculating rates from a static picture of the reactant and the transition state. Instead of pursuing the observable timescales directly, simulated trajectories are used to calculate a correction factor to TST, which is exact at short timescales. Particularly in soft matter science, where viscosity of the solvent slows down the dynamics and barriers between observable states can be low, TST often fails to give even qualitative agreement with experiments. Methods for calculating the correction factors are often complicated and inefficient.
The present work introduces an alternative approach, focused on observable timescales. Dynamical coarse-graining is instead based on relaxation between extracted pairs of cells. Using certain sets of assumptions, the relaxation approach is applied to two essential issues in modeling complex systems:
1. A complex energy landscape can be mapped to a chemical master equation with arbitrary state definitions using discrete relaxation path sampling (DRPS, chapter 3).
2. The dimensionality of large kinetic transition networks can be efficiently reduced using relaxed recursive regrouping (RRR, chapter 4).
The effectiveness of the methods is illustrated for toy models and the applicability to systems of general interest is demonstrated for several models recently studied by our colleagues.
The present work is based on three articles that are published in peer-reviewed journals or currently under peer review:
1. In “Markov state modeling and dynamical coarse-graining via discrete relaxation path sampling”, we propose a method for coarse-graining molecular dynamics to a chemical master equation. The definition of rate coefficients (elements of the rate matrix) is based on the pairwise relaxation and thus is robust with respect to positioning of boundaries between the cells representing the states.
2. In “Dynamical properties of two- and three-dimensional colloidal clusters of six particles”, we use discrete relaxation path sampling to study properties of a colloidal cluster. These clusters are important model systems for studying self-assembly, and have been extensively studied both theoretically and experimentally.
3. In “Extracting of rates from kinetic transition networks: relaxed recursive regrouping”, the pairwise relaxation approach is used to develop an efficient method for reduction of kinetic transition networks to correctly reproduce processes occurring at long timescales, respecting the global topology of the network.
The purpose of the present work is to explain the theoretical basis for the published works, and discuss the potential pitfalls, promising future directions, and potential applications.