Prediction of the folding of short polypeptide segments in proteins by systematic search
Related papers
Prediction of the folding of short polypeptide segments by uniform conformational sampling
Biopolymers, 1987
A procedure, CONGEN, for uniformly sampling the conformational space of short polypeptide segments in proteins has been implemented. Because the time required for this sampling grows exponentially with the number of residues, parameters are introduced to limit the conformational space that has to be explored. This is done by the use of the empirical energy function of CHARMM [(1983) J. Comput. Chem. 4, 187-217] and truncating the search when conformations of grossly unfavorable energy are sampled. Tests are made to determine control parameters that optimize the search without excluding important configurations. When applied to known protein structures, the resulting procedure is generally capable of generating conformations where the lowest energy conformation matches the known structure within a rms deviation of 1 Å.
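The idea of a uniform torsion-grid search that is truncated when the energy becomes grossly unfavorable can be sketched as follows. This is only a toy illustration, not CONGEN: `toy_energy`, the 60-degree grid, and the `cutoff` value are hypothetical stand-ins for the CHARMM energy function and the control parameters mentioned in the abstract. Pruning a partial chain is safe in this sketch only because every term of the toy energy is non-negative, so the partial energy is a lower bound on the final one.

```python
import math


def toy_energy(angles):
    """Hypothetical stand-in for an empirical energy function (NOT CHARMM).

    Each torsion contributes a non-negative term with its minimum at -60 degrees.
    """
    return sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)


def systematic_search(n_torsions, step_deg=60, cutoff=1.2):
    """Enumerate torsion angles on a uniform grid, pruning high-energy branches.

    Returns a list of (energy, angles) tuples sorted by increasing energy.
    """
    grid = range(-180, 180, step_deg)
    accepted = []

    def extend(prefix):
        # Truncate the search as soon as the partial energy exceeds the cutoff.
        if toy_energy(prefix) > cutoff:
            return
        if len(prefix) == n_torsions:
            accepted.append((toy_energy(prefix), tuple(prefix)))
            return
        for angle in grid:
            extend(prefix + [angle])

    extend([])
    accepted.sort()
    return accepted


results = systematic_search(3)
print(len(results), "conformations kept out of", 6 ** 3)
print("lowest energy conformation:", results[0])
```

Without the cutoff the search would visit all 6^3 grid points; the number grows exponentially with the number of torsions, which is why a pruning criterion of this kind matters.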
Guiding Probabilistic Search of the Protein Conformational Space With Structural Profiles
J. Bioinform. Comput. Biol., 2012
The roughness of the protein energy surface poses a significant challenge to search algorithms that seek to obtain a structural characterization of the native state. Recent research seeks to bias search toward near-native conformations through one-dimensional structural profiles of the protein native state. Here we investigate the effectiveness of such profiles in a structure prediction setting for proteins of various sizes and folds. We pursue two directions. We first investigate the contribution of structural profiles in comparison to or in conjunction with physics-based energy functions in providing an effective energy bias. We conduct this investigation in the context of Metropolis Monte Carlo with fragment-based assembly. Second, we explore the effectiveness of structural profiles in providing projection coordinates through which to organize the conformational space. We do so in the context of a robotics-inspired search framework proposed in our lab that employs projections of the conformational space to guide search. Our findings indicate that structural profiles are most effective in obtaining physically realistic near-native conformations when employed in conjunction with physics-based energy functions. Our findings also show that these profiles are very effective when employed instead as projection coordinates to guide probabilistic search toward undersampled regions of the conformational space.
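For readers unfamiliar with Metropolis Monte Carlo, the acceptance rule can be sketched in a few lines. This is a generic illustration under stated assumptions, not the authors' implementation: the one-dimensional `rough_energy` function and the random-step move are hypothetical placeholders for a protein energy function and fragment-based moves, and `temperature` is expressed in the same units as the energy.

```python
import math
import random


def metropolis_accept(e_old, e_new, temperature, rng):
    """Standard Metropolis criterion: always accept downhill moves,
    accept uphill moves with probability exp(-(e_new - e_old) / temperature)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


def rough_energy(x):
    """Hypothetical rugged 1-D energy surface (a parabola plus ripples)."""
    return x * x + math.sin(8.0 * x)


def run_mc(energy, x0, steps, temperature, seed=0):
    """Run a simple Metropolis walk and return the best (x, energy) seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        candidate = x + rng.uniform(-0.5, 0.5)  # placeholder for a fragment move
        e_candidate = energy(candidate)
        if metropolis_accept(e, e_candidate, temperature, rng):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = run_mc(rough_energy, x0=3.0, steps=2000, temperature=0.5)
print(best_x, best_e)
```

Accepting some uphill moves is what lets the walk escape local minima on a rough surface; an additional bias term, such as one derived from a structural profile, would simply be added to the energy being evaluated.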
A fast conformational search strategy for finding low energy structures of model proteins
Protein Science, 1996
We describe a new computer algorithm for finding low-energy conformations of proteins. It is a chain-growth method that uses a heuristic bias function to help assemble a hydrophobic core. We call it the Core-directed chain Growth method (CG). We test the CG method on several well-known literature examples of HP lattice model proteins [in which proteins are modeled as sequences of hydrophobic (H) and polar (P) monomers], ranging from 20-64 monomers in two dimensions, and up to 88-mers in three dimensions. Previous nonexhaustive methods-Monte Carlo, a Genetic Algorithm, Hydrophobic Zippers, and Contact
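As background for the HP lattice model mentioned above, the sketch below scores a two-dimensional conformation with the usual convention of -1 for every pair of H monomers that are lattice neighbours but not adjacent along the chain. It only evaluates a given conformation; it does not implement the CG chain-growth method or its bias function, and the function name and example coordinates are illustrative assumptions.

```python
def hp_energy(sequence, coords):
    """Energy of a 2-D HP lattice conformation.

    sequence: string of 'H' and 'P' characters.
    coords:   list of (x, y) integer lattice positions, one per monomer.
    Returns -1 per non-bonded H-H contact.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    index_at = {pos: i for i, pos in enumerate(coords)}
    if len(index_at) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A 4-mer folded into a square brings its two H ends into contact.
folded = [(0, 0), (1, 0), (1, 1), (0, 1)]
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(hp_energy("HPPH", folded), hp_energy("HPPH", extended))
```

A search method for this model, exhaustive or heuristic, amounts to finding the self-avoiding walk that minimizes this score.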
Conformational subspace in simulation of early‐stage protein folding
Proteins: Structure, Function, and Bioinformatics, 2004
A probability calculus was used to simulate the early stages of protein folding in ab initio structure prediction. The probabilities of particular ϕ and ψ angles for each of 20 amino acids as they occur in crystal forms of proteins were used to calculate the amount of information necessary for the occurrence of given ϕ and ψ angles to be predicted. It was found that the amount of information needed to predict ϕ and ψ angles with 5° precision is much higher than the amount of information actually carried by individual amino acids in the polypeptide chain. To handle this problem, a limited conformational space for the preliminary search for optimal polypeptide structure is proposed based on a simplified geometrical model of the polypeptide chain and on the probability calculus. These two models, geometric and probabilistic, based on different sources, yield a common conclusion concerning how a limited conformational space can represent an early stage of polypeptide chain‐folding simul...
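The scale of the mismatch described above can be illustrated with a back-of-the-envelope calculation. The sketch assumes uniform probabilities, which the paper does not: it uses observed ϕ, ψ distributions from crystal structures, so its actual figures will differ. Under that simplifying assumption, specifying a (ϕ, ψ) pair to 5° precision means choosing one of 72 × 72 bins, whereas knowing which of 20 amino acids occupies a position carries at most log2(20) bits.

```python
import math


def bits_for_angle_pair(precision_deg):
    """Bits needed to pick one (phi, psi) bin, assuming all bins equally likely."""
    bins_per_angle = 360 / precision_deg
    return math.log2(bins_per_angle * bins_per_angle)


angle_bits = bits_for_angle_pair(5)   # log2(72 * 72)
residue_bits = math.log2(20)          # upper bound for a 20-letter alphabet

print(f"(phi, psi) at 5 degrees: {angle_bits:.2f} bits")
print(f"one amino acid, at most: {residue_bits:.2f} bits")
```

Even this crude estimate shows a gap of several bits per residue, which is the motivation for restricting the search to a smaller conformational subspace.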
The two aspects of the protein folding problem
A physics-based approach to the protein folding problem is presented. It is concerned with the computation of folding pathways and final native structures, given the amino acid sequences, an empirical all-atom potential energy function, and a procedure to identify the global minimum of the potential energy. Whereas the all-atom approach has provided three-dimensional structures of relatively small molecules and for helical proteins containing up to 46 residues, it has been necessary to develop a hierarchical approach to treat larger proteins. In the hierarchical approach, global optimization was originally carried out with a simplified united residue (UNRES) description of a polypeptide chain to locate the region in which the global minimum lies. Conversion of the UNRES structures in this region to all-atom structures is followed by a local search in this region. The performance of this physics-based approach in successive CASP blind tests for predicting protein structure is described. More recently, a molecular dynamics treatment with UNRES has been introduced to compute not only native structures but also folding pathways.
Folding proteins with a simple energy function and extensive conformational searching
Protein Science, 2008
We describe a computer algorithm for predicting the three-dimensional structures of proteins using only their amino acid sequences. The method differs from others in two ways: (1) it uses very few energy parameters, representing hydrophobic and polar interactions, and (2) it uses a new "constraint-based exhaustive" searching method, which appears to be among the fastest and most complete search methods yet available for realistic protein models. It finds a relatively small number of low-energy conformations, among which are native-like conformations, for crambin (1CRN), avian pancreatic polypeptide (1PPT), melittin (2MLT), and apamin. Thus, the lowest-energy states of very simple energy functions may predict the native structures of globular proteins. These methods assume that a native protein structure is a balance of many different interactions, usually characterized using hundreds to several thousands of "knowledge-based" energy parameters derived from databases of known protein structures. Based on the thermodynamic hypothesis (Anfinsen, 1973) that the native three-dimensional structure of a protein is the state of lowest free energy, these algorithms explore many different protein conformations by sampling methods, such as Monte Carlo, molecular mechanics, or molecular dynamics, to find the most stable conformations. Each such method correctly predicts a few protein structures but misses many others.
Limited conformational space for early-stage protein folding simulation
Bioinformatics, 2004
Motivation: The problem of early-stage protein folding is critical for protein structure prediction. The model presented introduces a common definition of protein structures which may be treated as the possible in silico early-stage form of the polypeptide chain. Limitation of the conformational space to the ellipse path on the Ramachandran map was tested as a possible sub-space to represent the early-stage structure for simulation of protein folding. The proposed conformational sub-space was developed on the basis of the backbone conformation, with side-chain interactions excluded. Results: The ellipse-path-limited conformation of BPTI was created using the criterion of shortest distance between Phi, Psi angles in native form of protein and the Phi, Psi angles belonging to the ellipse. No knots were observed in the structure created according to ellipse-path conformational sub-space. The energy minimization procedure applied to ellipse-path derived conformation directed structural changes toward the native form of the protein with SS-bonds system introduced to the procedure.
Computational studies of protein folding
Computing in Science & Engineering, 2001
Proteins are the workhorses of life. These polymers, comprised of 20 naturally occurring amino acids, fold to a unique, biologically active conformation called the native state. Various genome-sequencing projects now list the parts of such protein sequences in a given organism, but unfortunately, this list is of little utility; the real need is to identify the functions of all these proteins, which range from molecular to physiological to phenotypical. For between 40 to 60 percent of the protein-coding regions (or open reading frames), sequence-based methods that exploit evolutionary information can provide insight into some aspect of biological function. Such alignment methods define the standard against which we must measure all alternative approaches, 1,2 but such approaches increasingly fail as the protein families become more distant. The remaining unassigned open reading frames represent an important challenge, and structure-based approaches to function prediction can play a significant role, 3,4 especially in target selection for genomics projects. The ultimate goal for most such projects is to experimentally determine the structure of all possible protein folds so that any newly found sequence is within modeling distance of an already solved structure. In this article, we examine the status of contemporary protein structure prediction approaches.
Calculation of protein conformation by global optimization of a potential energy function
Proteins: Structure, Function, and Bioinformatics, 1999
A novel hierarchical approach to protein folding has been applied to compute the unknown structures of seven target proteins provided by CASP3. The approach is based exclusively on the global optimization of a potential energy function for a united-residue model by conformational space annealing, followed by energy refinement using an all-atom potential. Comparison of the submitted models for five globular proteins with the experimental structures shows that the conformations of large fragments (~60 aa) were predicted with rmsds of 4.2-6.8 Å for the Cα atoms. Our lowest-energy models for targets T0056 and T0061 were particularly successful, producing the correct fold of approximately 52% and 80% of the structures, respectively. These results support the thermodynamic hypothesis that protein structure can be computed solely by global optimization of a potential energy function for a given amino acid sequence.