An Efficient Docking Algorithm Using Conserved Residue Information to Study Protein-Protein Interactions
Related papers
Protein Science, 2005
Many protein-protein docking algorithms generate numerous possible complex structures with only a few of them resembling the native structure. The major challenge is choosing the near-native structures from the generated set. Recently it has been observed that the density of conserved residue positions is higher at the interface regions of interacting protein surfaces, except for antibody-antigen complexes, where a very low number of conserved positions is observed at the interface regions. In the present study we have used this observation to identify putative interacting regions on the surface of interacting partners. We studied 59 protein complexes, used previously as a benchmark data set for docking investigations. We computed conservation indices of residue positions on the surfaces of interacting proteins using available homologous sequences and used this information to filter out from 56% to 86% of generated docked models, retaining near-native structures for further evaluation. We used a reverse filter of conservation score to filter out the majority of nonnative antigen-antibody complex structures. For each docked model in the filtered subsets, we relaxed the conformation of the side chains by minimizing the energy with CHARMM, and then calculated the binding free energy using a generalized Born method and solvent-accessible surface area calculations. Using the free energy along with conservation information and other descriptors used in the literature for ranking docking solutions, such as shape complementarity and pair potentials, we developed a global ranking procedure that significantly improves the docking results by giving top ranks to near-native complex structures.
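The conservation filter described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `DockedModel` type, the use of a mean conservation index over interface residues, and the `keep_fraction` parameter are all assumptions made for illustration. The `reverse` flag stands in for the reverse filter applied to antibody-antigen complexes, where low interface conservation is expected.

```python
from dataclasses import dataclass


@dataclass
class DockedModel:
    # Hypothetical container: a model name plus the indices of the
    # surface residues that lie at the docked interface.
    name: str
    interface_residues: list


def interface_conservation(model, conservation):
    """Mean conservation index over the model's interface residues."""
    scores = [conservation[r] for r in model.interface_residues]
    return sum(scores) / len(scores) if scores else 0.0


def filter_models(models, conservation, keep_fraction=0.3, reverse=False):
    """Keep the fraction of models with the highest interface conservation.

    With reverse=True the models with the LOWEST interface conservation
    are kept instead (assumed here for the antibody-antigen case).
    """
    ranked = sorted(
        models,
        key=lambda m: interface_conservation(m, conservation),
        reverse=not reverse,
    )
    n_keep = max(1, round(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy data: conservation index per residue position (made-up values).
conservation = {1: 0.9, 2: 0.8, 3: 0.1, 4: 0.2}
models = [
    DockedModel("A", [1, 2]),
    DockedModel("B", [3, 4]),
    DockedModel("C", [1, 3]),
]
kept = filter_models(models, conservation, keep_fraction=0.34)
kept_reverse = filter_models(models, conservation, keep_fraction=0.34, reverse=True)
```

In this toy case the forward filter keeps model `A` and the reverse filter keeps model `B`. The surviving models would then go on to the more expensive energy minimization and free energy steps.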
Residue conservation information for generating near-native structures in protein-protein docking
Journal of bioinformatics and computational biology, 2006
Protein-protein docking algorithms typically generate large numbers of possible complex structures with only a few of them resembling the native structure. Recently (Duan et al., Protein Sci, 14:316-328, 2005), it was observed that the surface density of conserved residue positions is high at the interface regions of interacting protein surfaces, except for antibody-antigen complexes, where fewer conserved positions than average are observed at the interface regions. Using this observation, we identified putative interacting regions on the surface of interacting partners and significantly improved docking results by assigning top ranks to near-native complex structures. In this paper, we combine the residue conservation information with a widely used shape complementarity algorithm to generate candidate complex structures with a higher percentage of near-native structures (hits). What is new in this work is that the conservation information is used early in the generatio...
A new protein-protein docking scoring function based on interface residue properties
Bioinformatics/Computer Applications in the Biosciences, 2007
Motivation: Protein-protein complexes are known to play key roles in many cellular processes. However, they are often not accessible to experimental study because of their low stability and the difficulty of producing the proteins and assembling them in their native conformation. Thus, docking algorithms have been developed to provide an in silico approach to the problem. A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and then a scoring function is used to rank them. Results: To address the second step, we developed a scoring function based on a Voronoï tessellation of the protein three-dimensional structure. We showed that the Voronoï representation may be used to describe, in a simplified but useful manner, the geometric and physico-chemical complementarities of two molecular surfaces. We measured a set of parameters on native protein-protein complexes and on decoys, and used them as attributes in several statistical learning procedures: a logistic function, Support Vector Machines (SVM), and a genetic algorithm. For the latter, we used ROGER, a genetic algorithm designed to optimize the area under the receiver operating characteristic curve. To further test the scores derived with ROGER, we ranked models generated by two different docking algorithms on targets of a blind prediction experiment, improving in almost all cases the rank of native-like solutions.
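As a rough illustration of the quantities named in this abstract, the sketch below scores candidates with a logistic function and measures how well the scores separate native complexes from decoys using the area under the ROC curve. The feature vectors, weights, and labels are hypothetical, and the code does not reproduce ROGER or the Voronoï-derived parameters; it only shows the evaluation criterion such a method would optimize.

```python
import math


def logistic_score(features, weights, bias):
    """Logistic function of a weighted sum of interface descriptors."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (native, decoy) pairs in which the native
    complex receives the higher score; ties count as half.
    labels: 1 for native-like, 0 for decoy.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one native and one decoy")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up scores for four candidates: two native-like, two decoys.
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Here three of the four native/decoy pairs are ordered correctly, so `example_auc` is 0.75. A scoring function that ranked every native above every decoy would reach 1.0.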
Refinement of unbound protein docking studies using biological knowledge
Proteins: Structure, Function, and Bioinformatics, 2005
In this work we present two methods for the reranking of protein-protein docking studies. One scoring method searches the InterDom database for domains that are available in the proteins to be docked and evaluates the interaction of these domains in other complexes of known structure. The second one analyzes the interface of each proposed conformation with regard to the conservation of Phe, Met, and Trp and their polar neighbor residues. The special relevance of these residues is based on a publication by Ma et al. (Proc Natl Acad Sci USA 2003;100:5772-5777), who compared the conservation of all residues in the interface region to the conservation on the rest of the protein's surface. The scoring functions were tested on 30 unbound docking test cases. The evaluation of the methods is based on the ability to rerank the output of a Fast Fourier Transformation (FFT) docking. Both were able to improve the ranking of the docking output. The best improvement was achieved for enzyme-inhibitor examples. Especially the domain-based scoring function was successful and able to place a near-native solution on one of the first six ranks for 13 of 17 (76%) enzyme-inhibitor complexes [in 53% (nine complexes) even on the first rank]. The method evaluating residue conservation allowed us to increase the number of good solutions within the first 100 ranks out of ~9000 in 82% of the 17 enzyme-inhibitor test cases, and for seven (41%) out of 17 enzyme-inhibitor complexes, a near-native solution was placed within the first seven ranks. Proteins 2005;61:1059-1067.
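The two evaluation measures used in this abstract (the number of good solutions within the first N ranks, and the rank of the best-placed near-native solution) can be sketched as follows. The boolean list encoding of a ranked docking output is an assumption made for illustration, and the example data are invented.

```python
def hits_in_top(ranked_is_hit, n):
    """Count near-native solutions among the first n ranks.

    ranked_is_hit: list of booleans in rank order (best rank first),
    True where the solution is near-native.
    """
    return sum(1 for is_hit in ranked_is_hit[:n] if is_hit)


def first_hit_rank(ranked_is_hit):
    """1-based rank of the best-placed near-native solution, or None."""
    for rank, is_hit in enumerate(ranked_is_hit, start=1):
        if is_hit:
            return rank
    return None


# Invented example: the same five solutions before and after reranking.
before = [False, False, False, True, True]
after = [True, False, True, False, False]
improved = hits_in_top(after, 3) > hits_in_top(before, 3)
```

Comparing these numbers before and after applying a scoring function gives a simple check of whether the reranking helped for a given test case.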
Classification of protein complexes based on docking difficulty
Proteins: Structure, Function, and Bioinformatics, 2005
Based on the results of several groups using different docking methods, the key properties that determine the expected success rate in protein-protein docking calculations are measures of conformational change, interface area, and hydrophobicity. A classification of protein complexes in terms of these measures provides a prediction of docking difficulty. This classification is used to study the targets of the CAPRI docking experiment. Results show that targets with a moderate expected difficulty were indeed predicted well by a number of groups, whereas the use of additional a priori information was necessary to obtain good results for some very difficult targets. The analysis indicates that CAPRI and other relatively large-scale docking studies represent very important steps toward understanding the capabilities and limitations of current protein-protein docking methods. Proteins 2005;60:176-180.
Protein-protein docking with multiple residue conformations and residue substitutions
Protein Science, 2002
The protein docking problem has two major aspects: sampling conformations and orientations, and scoring them for fit. To investigate the extent to which the protein docking problem may be attributed to the sampling of ligand side-chain conformations, multiple conformations of multiple residues were calculated for the uncomplexed (unbound) structures of protein ligands. These ligand conformations were docked into both the complexed (bound) and unbound conformations of the cognate receptors, and their energies were evaluated using an atomistic potential function. The following questions were considered: (1) does the ensemble of precalculated ligand conformations contain a structure similar to the bound form of the ligand? (2) Can the large number of conformations that are calculated be efficiently docked into the receptors? (3) Can near-native complexes be distinguished from non-native complexes? Results from seven test systems suggest that the precalculated ensembles do include side-chain conformations similar to those adopted in the experimental complexes. By assuming additivity among the side chains, the ensemble can be docked in less than 12 h on a desktop computer. These multiconformer dockings produce near-native complexes and also non-native complexes. When docked against the bound conformations of the receptors, the near-native complexes of the unbound ligand were always distinguishable from the non-native complexes. When docked against the unbound conformations of the receptors, the near-native dockings could usually, but not always, be distinguished from the non-native complexes. In every case, docking the unbound ligands with flexible side chains led to better energies and a better distinction between near-native and non-native fits. An extension of this algorithm allowed for docking multiple residue substitutions (mutants) in addition to multiple conformations. The rankings of the docked mutant proteins correlated with experimental binding affinities. 
These results suggest that sampling multiple residue conformations and residue substitutions of the unbound ligand contributes to, but does not fully provide, a solution to the protein docking problem. Conformational sampling allows a classical atomistic scoring function to be used; such a function may contribute to better selectivity between near-native and non-native complexes. Allowing for receptor flexibility may further extend these results.
ClusPro: an automated docking and discrimination method for the prediction of protein complexes
Bioinformatics, 2004
Predicting protein interactions is one of the most challenging problems in functional genomics. Given two proteins known to interact, current docking methods evaluate billions of docked conformations by simple scoring functions, and in addition to near-native structures yield many false positives, i.e. structures with good surface complementarity but far from the native. Results: We have developed a fast algorithm for filtering docked conformations with good surface complementarity, and ranking them based on their clustering properties. The free energy filters select complexes with the lowest desolvation and electrostatic energies. Clustering is then used to smooth the local minima and to select the ones with the broadest energy wells, a property associated with the free energy at the binding site. The robustness of the method was tested on sets of 2000 docked conformations generated for 48 pairs of interacting proteins. In 31 of these cases, the top 10 predictions include at least one near-native complex, with an average RMSD of 5 Å from the native structure. The docking and discrimination method also provides good results for a number of complexes that were used as targets in the Critical Assessment of PRedictions of Interactions experiment. Availability: The fully automated docking and discrimination server ClusPro can be found at http://structure.
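One simple way to rank docked conformations by their clustering properties is greedy neighbor-count clustering, sketched below. This is an assumption about how such a step could be realized, not a description taken from the abstract: the precomputed pairwise RMSD matrix, the `radius` cutoff, and the tie-breaking rule are all hypothetical. Large clusters stand in for broad energy wells.

```python
def greedy_cluster(dist, radius):
    """Greedy clustering on a symmetric pairwise distance matrix.

    Repeatedly picks the remaining conformation with the most remaining
    neighbors within `radius` as a cluster center, assigns those
    neighbors to it, and removes them. Ties go to the lowest index.
    Returns a list of (center_index, member_indices), largest-first
    by construction.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        # sorted() makes tie-breaking deterministic: max() returns the
        # first item with the maximal neighbor count.
        center = max(
            sorted(remaining),
            key=lambda i: sum(1 for j in remaining if dist[i][j] <= radius),
        )
        members = sorted(j for j in remaining if dist[center][j] <= radius)
        clusters.append((center, members))
        remaining -= set(members)
    return clusters


# Toy example: four conformations placed on a line at 0, 1, 2 and 10,
# with the absolute difference standing in for pairwise RMSD.
positions = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in positions] for a in positions]
clusters = greedy_cluster(dist, radius=1.5)
```

In the toy example the conformation at position 1 has the most neighbors and becomes the first cluster center, and the outlier at position 10 forms its own singleton cluster. Reporting cluster centers in this order puts the most populated regions of the filtered set first.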