Use of a novel Hill-climbing genetic algorithm in protein folding simulations

Genetic Algorithms for Protein Folding Simulations

Journal of Molecular Biology, 1993

Genetic algorithm methods utilize the same optimization procedures as natural genetic evolution, in which a population is gradually improved by selection. We have developed a genetic algorithm search procedure suitable for use in protein folding simulations. A population of conformations of the polypeptide chain is maintained, and conformations are changed by mutation, in the form of conventional Monte Carlo steps, and by crossovers, in which parts of the polypeptide chain are interchanged between conformations. For folding on a simple two-dimensional lattice it is found that the genetic algorithm is dramatically superior to conventional Monte Carlo methods.
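The two operators described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding of a conformation as a string of absolute moves (U, D, L, R), the one-point crossover, and the helper names `coords`, `is_self_avoiding`, `mutate` and `crossover` are all assumptions made for illustration. The mutation shown is a bare single-move change that is rejected on a clash, without the energy-based acceptance test of a full Monte Carlo step.

```python
import random

# Unit steps on the square lattice (assumed encoding: one absolute move per bond).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def coords(moves):
    """Trace a move string into lattice coordinates, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def is_self_avoiding(moves):
    """True if no lattice site is visited twice."""
    path = coords(moves)
    return len(set(path)) == len(path)

def mutate(moves, rng):
    """Change one randomly chosen move; keep the parent if a clash results."""
    i = rng.randrange(len(moves))
    child = moves[:i] + rng.choice("UDLR") + moves[i + 1:]
    return child if is_self_avoiding(child) else moves

def crossover(a, b, rng):
    """One-point crossover: head of `a` joined to tail of `b`.
    Returns None when the joined chain overlaps itself."""
    cut = rng.randrange(1, len(a))
    child = a[:cut] + b[cut:]
    return child if is_self_avoiding(child) else None
```

For example, `crossover("RRRR", "UUUU", random.Random(0))` returns a chain that starts with moves from the first parent and ends with moves from the second. In a full GA, offspring would additionally be scored with an energy function and accepted or rejected by selection.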

Development and optimisation of a novel genetic algorithm for studying model protein folding

Theoretical Chemistry Accounts, 2004

Determination of the native state of a protein from its amino acid sequence is the goal of protein folding simulations, with potential applications in gene therapy and drug design. Location of the global minimum structure for a given sequence, however, is a difficult optimisation problem. In this paper, we describe the development and application of a genetic algorithm (GA) to find the lowest-energy conformations for the 2D HP lattice bead protein model. Optimisation of the parameters of our "standard" GA program reveals that the GA is most successful (at finding the lowest-energy conformations) for high rates of mating and mutation and relatively high elitism. We have also introduced a number of new genetic operators: a duplicate predator, which maintains population diversity by eliminating duplicate structures; brood selection, where two "parent" structures undergo crossover and give rise to a brood of (not just two) offspring; and a Monte Carlo based local search algorithm, to explore the neighbourhood of all members of the population. It is shown that these operators lead to significant improvements in the success and efficiency of the GA, both compared with our standard GA and with previously published GA studies for benchmark HP sequences with up to 50 beads.
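As a rough illustration of the quantities this abstract refers to, the sketch below scores a conformation in the 2D HP model and removes duplicate structures from a population. It assumes the usual HP convention of -1 per contact between H beads that are lattice neighbours but not chain neighbours. The function names `hp_energy` and `remove_duplicates` are hypothetical, and the duplicate check compares raw coordinates only, so structures related by rotation or translation are not recognised as duplicates. The paper's own operators may differ.

```python
def hp_energy(sequence, path):
    """Energy = -1 for each pair of H beads that are lattice neighbours
    but not consecutive along the chain (assumed standard HP convention)."""
    index_at = {pos: i for i, pos in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for nb in ((x + 1, y), (x, y + 1)):
            j = index_at.get(nb)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy

def remove_duplicates(population):
    """'Duplicate predator' sketch: keep the first copy of each conformation."""
    seen = set()
    unique = []
    for conf in population:
        key = tuple(conf)
        if key not in seen:
            seen.add(key)
            unique.append(conf)
    return unique
```

A four-bead chain folded into a unit square, `[(0, 0), (1, 0), (1, 1), (0, 1)]`, has one non-bonded contact between the first and last beads, so the sequence `"HHHH"` scores -1 on it.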

Evolutionary Computer Programming of Protein Folding and Structure Predictions

Journal of Theoretical Biology, 2004

In order to understand the mechanism of protein folding and to assist the rational de-novo design of fast-folding, non-aggregating and stable artificial enzymes, it is very helpful to be able to simulate protein folding reactions and to predict the structures of proteins and other biomacromolecules. Here, we use a method of computer programming called "evolutionary computer programming", in which a program evolves depending on the evolutionary pressure exerted on it. In the application presented here, to a computer program for folding simulations, the evolutionary pressure was exerted towards finding deep minima in the energy landscape of protein folding faster. After only 20 evolution steps, the evolved program was able to find deep minima in the energy landscape more than 10 times faster than the original program prior to the evolution process.

Computational studies of protein folding

Computing in Science & Engineering, 2001

Proteins are the workhorses of life. These polymers, comprised of 20 naturally occurring amino acids, fold to a unique, biologically active conformation called the native state. Various genome-sequencing projects now list the parts of such protein sequences in a given organism, but unfortunately, this list is of little utility; the real need is to identify the functions of all these proteins, which range from molecular to physiological to phenotypical. For between 40 and 60 percent of the protein-coding regions (or open reading frames), sequence-based methods that exploit evolutionary information can provide insight into some aspect of biological function. Such alignment methods define the standard against which we must measure all alternative approaches, 1,2 but such approaches increasingly fail as the protein families become more distant. The remaining unassigned open reading frames represent an important challenge, and structure-based approaches to function prediction can play a significant role, 3,4 especially in target selection for genomics projects. The ultimate goal for most such projects is to experimentally determine the structure of all possible protein folds so that any newly found sequence is within modeling distance of an already solved structure. In this article, we examine the status of contemporary protein structure prediction approaches.

Niche Genetic Algorithms are better than traditional Genetic Algorithms for de novo Protein Folding

F1000Research, 2014

Here we demonstrate that Niche Genetic Algorithms (NGA) are better at computing de novo protein folding than traditional Genetic Algorithms (GA). Previous research has shown that proteins can fold into their active forms in a limited number of ways; however, predicting how a set of amino acids will fold starting from the primary structure is still a mystery. GAs have a unique ability to solve these types of scientific problems because of their computational efficiency. Unfortunately, GAs are generally quite poor at solving problems with multiple optima. However, there is a special group of GAs called Niche Genetic Algorithms (NGA) that are quite good at solving problems with multiple optima. In this study, we use a specific NGA: the Dynamic-radius Species-conserving Genetic Algorithm (DSGA), and show that DSGA is very adept at predicting the folded state of proteins, and that DSGA is better than a traditional GA in deriving the correct folding pattern of a protein.

Protein Folding Prediction with Genetic Algorithms

The hydrophobic-hydrophilic model (HP model) is one of the most simplified and popular protein folding models. This model considers the hydrophobic-hydrophobic interactions of protein structures, but its prediction results are not encouraging enough. Therefore, we suggest that other features should be considered, such as secondary structure elements (SSEs), charges, and disulfide bonds. In this paper we propose a genetic algorithm (GA), based on the lattice model, that takes these additional features into account to predict the 3D structure of an unknown target protein whose primary sequence and SSEs are assumed known. Experimental results, obtained by comparing our predictions with the real structures using RMSD, show that these additional features indeed improve prediction accuracy.

Generating folded protein structures with a lattice chain growth algorithm

The Journal of Chemical Physics, 2000

We present a new application of the chain growth algorithm to lattice generation of protein structure and thermodynamics. Given the difficulty of ab initio protein structure prediction, this approach provides an alternative to current folding algorithms. The chain growth algorithm, unlike Metropolis folding algorithms, generates independent protein structures to achieve rapid and efficient exploration of configurational space. It is a modified version of the Rosenbluth algorithm where the chain growth transition probability is a normalized Boltzmann factor; it was previously applied only to simple polymers and protein models with two residue types. The independent protein configurations, generated segment-by-segment on a refined cubic lattice, are based on a single interaction site for each amino acid and a statistical interaction energy derived by Miyazawa and Jernigan. We examine for several proteins the algorithm's ability to produce nativelike folds and its effectiveness for calculating protein thermodynamics. Thermal transition profiles associated with the internal energy, entropy, and radius of gyration show characteristic folding/unfolding transitions and provide evidence for unfolding via partially unfolded (molten-globule) states. From the configurational ensembles, the protein structures with the lowest distance root-mean-square deviations (dRMSD) vary between 2.2 and 3.8 Å, a range comparable to results of an exhaustive enumeration search. Though the ensemble-averaged dRMSD values are about 1.5 to 2 Å larger, the lowest dRMSD structures have similar overall folds to the native proteins. These results demonstrate that the chain growth algorithm is a viable alternative to protein simulations using the whole chain.
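The idea of growing a chain with a normalized Boltzmann factor as the transition probability can be sketched on a much simpler system than the one in this abstract. The example below is an assumption-laden toy: it uses a 2D square lattice and HP-style contact energies in place of the refined cubic lattice and Miyazawa-Jernigan energies. The names `contact_energy` and `grow_chain` are hypothetical. The running `weight` (the product of the per-step normalization sums) is the usual Rosenbluth-style bookkeeping for reweighting generated chains and is not taken from the abstract.

```python
import math
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def contact_energy(sequence, occupied, i, site):
    """Energy change from placing bead i at `site`:
    -1 per contact with a non-bonded H bead (toy HP energies)."""
    if sequence[i] != "H":
        return 0.0
    e = 0.0
    for dx, dy in NEIGHBOURS:
        nb = (site[0] + dx, site[1] + dy)
        if nb in occupied:
            j = occupied[nb]
            if j != i - 1 and sequence[j] == "H":
                e -= 1.0
    return e

def grow_chain(sequence, beta, rng):
    """Grow one chain bead by bead; return (coords, weight), or None if trapped."""
    occupied = {(0, 0): 0}   # lattice site -> bead index
    coords = [(0, 0)]
    weight = 1.0
    for i in range(1, len(sequence)):
        x, y = coords[-1]
        free = [(x + dx, y + dy) for dx, dy in NEIGHBOURS
                if (x + dx, y + dy) not in occupied]
        if not free:
            return None      # dead end: no empty neighbouring site
        # Boltzmann factor of each candidate site; normalizing by their
        # sum gives the transition probability used to pick the next site.
        factors = [math.exp(-beta * contact_energy(sequence, occupied, i, s))
                   for s in free]
        total = sum(factors)
        site = rng.choices(free, weights=factors, k=1)[0]
        occupied[site] = i
        coords.append(site)
        weight *= total
    return coords, weight
```

Each call, for example `grow_chain("HPPHPH", 1.0, random.Random(1))`, produces a chain independent of the previous one, which is the property the abstract contrasts with Metropolis-style sampling.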

Niche Genetic Algorithms are better than traditional Genetic Algorithms for de novo Protein Folding [v1; ref status: awaiting peer review, http://f1000r.es/4gk

Here we demonstrate that Niche Genetic Algorithms (NGA) are better at computing de novo protein folding than traditional Genetic Algorithms (GA). Previous research has shown that proteins can fold into their active forms in a limited number of ways; however, predicting how a set of amino acids will fold starting from the primary structure is still a mystery. GAs have a unique ability to solve these types of scientific problems because of their computational efficiency. Unfortunately, GAs are generally quite poor at solving problems with multiple optima. However, there is a special group of GAs called Niche Genetic Algorithms (NGA) that are quite good at solving problems with multiple optima. In this study, we use a specific NGA: the Dynamic-radius Species-conserving Genetic Algorithm (DSGA), and show that DSGA is very adept at predicting the folded state of proteins, and that DSGA is better than a traditional GA in deriving the correct folding pattern of a protein.

Efficiency of Parallel Genetic Algorithms for Protein Folding on the 2-D HP Model

In this paper, we study the use of parallel genetic algorithms (PGAs) for protein folding on the 2D HP model. In particular, parallelism introduces additional costs such as communication latencies and bottlenecks that affect performance. We focus on the theoretical analysis of running times for several well-known parallel configurations including master-slave, fine-grained, coarse-grained and their variants. The performance data gathered show that the theoretical analysis presented successfully predicts the running times.

HP Model Protein Folding with Hybrid Algorithm using Genetic Algorithm and Estimation of Distribution Algorithm

This paper describes a hybrid of a Genetic Algorithm (GA) and an Estimation of Distribution Algorithm (EDA) for Protein Structure Prediction (PSP) based on lattice Hydrophobic-Polar (HP) models. In the system, a network constructed by the EDA is used in the GA to generate effective genes. PSP is one of the challenging problems in bioinformatics. The goal is to predict the conformation from a given amino acid sequence. However, even for a small number of amino acids, the solution space is huge. This paper presents experimental data on the PSP problem on lattice HP models, and the experimental results show that the proposed system searches for solutions effectively compared with a single-population algorithm. We conclude that the proposed method is effective at searching for candidate solutions.