Computerised version of the Chou and Fasman protein secondary structure predictive method

A Simple Comparison between Specific Protein Secondary Structure Prediction Tools

Tropical Agricultural Research, 2012

A comparative evaluation of five widely used protein secondary structure prediction programs available on the World Wide Web was carried out. Secondary structure data for ten proteins containing 190 secondary structure motifs were collected from the Protein Data Bank (PDB). The amino acid sequences of the proteins were then evaluated using the GOR, PSIPRED, HNN, PROF, and YASPIN secondary structure prediction tools, and the results were compared with the structural information obtained from the PDB. The study reveals considerable differences between the results obtained from each program. Within the limits of this comparative study, PSIPRED showed the highest prediction accuracy, with 77% accuracy in α helix prediction and 70% accuracy in β strand prediction. Furthermore, the level of accuracy varied with the length of the secondary structure motifs. The highest accuracies were obtained for α helices of 16-20 amino acids and β strands of 7-9 amino acids in length. The results suggest that, among the most frequently used software programs available on the World Wide Web, PSIPRED is the tool that gives the best results for secondary structure prediction.
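The dependence of accuracy on motif length can be illustrated with a minimal sketch. It assumes observed and predicted structures are given as equal-length strings over the letters H (helix), E (strand) and C (coil), and that accuracy is counted per residue inside each observed segment. The function names and the length bins are hypothetical, and the abstract does not state how the study itself scored motifs.

```python
from itertools import groupby


def segments(ss, state):
    """Return (start, length) for every run of `state` in the string `ss`."""
    found = []
    position = 0
    for label, run in groupby(ss):
        length = len(list(run))
        if label == state:
            found.append((position, length))
        position += length
    return found


def accuracy_by_length(observed, predicted, state, bins):
    """Per-residue accuracy inside observed segments, grouped by segment length.

    `bins` is a list of inclusive (low, high) length ranges (an assumption of
    this sketch). A bin with no observed segments maps to None.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    totals = {b: [0, 0] for b in bins}  # bin -> [correct residues, residues]
    for start, length in segments(observed, state):
        for low, high in bins:
            if low <= length <= high:
                correct = sum(
                    predicted[i] == state for i in range(start, start + length)
                )
                totals[(low, high)][0] += correct
                totals[(low, high)][1] += length
    return {b: (c / n if n else None) for b, (c, n) in totals.items()}


observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
print(segments(observed, "H"))
print(accuracy_by_length(observed, predicted, "H", [(1, 5), (6, 10)]))
```

Grouping by the length of the observed segment, rather than the predicted one, keeps the bins fixed across the tools being compared.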

Prediction of secondary structures of proteins using a two-stage method

Computers & Chemical …, 2008

Protein structure determination and prediction has been a focal research subject in the life sciences because protein structure is central to understanding the biological and chemical activities of organisms. The experimental methods used to determine the structures of proteins demand sophisticated equipment and time. A host of computational methods have been developed to predict the location of secondary structure elements in proteins, either to complement experimental results or to provide insight into them. However, the prediction accuracies of these methods rarely exceed 70%. In this paper, a novel two-stage method is presented that predicts the location of secondary structure elements in a protein using the primary structure data only. In the first stage of the proposed method, the folding type of a protein is determined using a novel classification approach for multi-class problems. The second stage of the method utilizes data available in the Protein Data Bank and determines the possible location of secondary structure elements using a probabilistic search algorithm. It is shown that the average accuracy of the predictions is 74.1% on a large structure dataset.

A Comparative Study of Protein Tertiary Structure Prediction Methods

International Journal of Computer Science and Informatics, 2014

Protein structure prediction (PSP) from amino acid sequence is one of the high-focus problems in bioinformatics today, because the biological function of a protein is determined by its three-dimensional structure. Understanding protein structures is vital for determining the function of a protein and its interactions with DNA, RNA and enzymes. Thus, protein structure is a fundamental area of computational biology. Its importance is heightened by the large amounts of sequence data coming from the PDB (Protein Data Bank) and by the fact that experimental methods used to determine protein structures, such as X-ray crystallography or Nuclear Magnetic Resonance (NMR), remain very expensive and time-consuming. In this paper, different types of protein structures and methods for their prediction are described.

Prediction of protein secondary structures using a combined method based on the recognition, Lim and Garnier-Osguthorpe-Robson algorithms

Journal of Molecular Structure: THEOCHEM, 1991

The prediction of the secondary structure of a protein, discussed in this paper, is based on the algorithm developed by Lim, extensively modified and coupled with the minimum recognition method developed in this laboratory, as well as the Garnier-Osguthorpe-Robson method. The modified Lim algorithm identifies eight types of pattern as possibly having structural character when given conditions are satisfied. The preliminary prediction of the α helices and β chains is carried out by inspection of the residue sequence, identifying first the α helices and then the β chains. The final prediction is then obtained by combining, according to well-defined rules, the above results with those from the minimum recognition and the Garnier-Osguthorpe-Robson methods. The complete algorithm has been developed through parameterization for 80 proteins with known secondary structure. The usefulness of its predictive power, as a first step towards the computer simulation of a tertiary structure, becomes evident from the following results: 74% (57%) of the predicted α helices (β chains) are found in correspondence with corresponding experimental fragments, with a sequence position overlap of 78% (66%).

Combined Approach to Protein Secondary Structure Prediction

2004

Motivation: The most important achievements in protein secondary structure prediction are based on two different approaches: the statistical approach and the physical-chemical approach. The first analyzes the occurrence of different types of amino acids in given conformations. The second uses conformational calculations and the physical-chemical properties of a given molecule. At present these approaches are developed independently. Creating a method that combines the advantages of the statistical and the physical-chemical approaches is a very important task. Results: We have developed a new approach for secondary structure prediction. Using our combined approach, one can obtain the secondary structure of a given protein from its primary structure only. The basis of our method is the joint use of the advantages provided by conformational calculations, data on primary structure, and the physical-chemical properties of proteins. To demonstrate the combined approach, we have predicted protein secondary structure of four basic types (α-helix, 3/10 helix, coil, and turn) using only the sequences of given proteins from the Protein Data Bank.

Experimental Evaluation of Protein Secondary Structure Predictors

Lecture Notes in Computer Science, 2009

Understanding the biological function of a protein, which is largely determined by its 3D shape, is a key issue in modern biology. The 3D shape of a protein is, in turn, implied by its amino acid sequence. Since the direct inspection of such 3D structures is rather expensive and time-consuming, a number of software techniques have been developed in the last few years that predict a spatial model, of either the secondary or the tertiary form, for a given target protein starting from its amino acid sequence. This paper offers a comparison of several available automatic secondary structure prediction tools. The comparison is experimental: two relevant sets of proteins, a non-redundant one including 100 elements and a 180-protein set taken from the CASP 6 contest, were used as test cases. Comparisons have been based on evaluating standard quality measures, such as Q3 and SOV.
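Of the two quality measures named above, Q3 is the simpler: it is the fraction of residues whose three-state label is predicted correctly. The sketch below assumes structures are encoded as equal-length strings over H, E and C. The function names are hypothetical, and SOV, which scores segment overlap rather than individual residues, is not implemented here.

```python
def q3_score(observed, predicted):
    """Fraction of residues whose three-state label (H, E or C) matches."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def per_state_accuracy(observed, predicted, state):
    """Fraction of residues observed in `state` that are predicted as `state`.

    Returns None when the observed structure contains no residue in `state`.
    """
    positions = [i for i, o in enumerate(observed) if o == state]
    if not positions:
        return None
    return sum(predicted[i] == state for i in positions) / len(positions)


observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
print(q3_score(observed, predicted))                 # 14 of 16 residues match
print(per_state_accuracy(observed, predicted, "E"))  # 3 of 4 strand residues
```

Because Q3 weights every residue equally, a predictor can score well on coil-rich proteins while missing short strands, which is one reason a segment-based measure such as SOV is usually reported alongside it.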

New joint prediction algorithm (Q7-JASEP) improves the prediction of protein secondary structure

Biochemistry, 1991

The classical problem of secondary structure prediction is approached by a new joint algorithm (Q7-JASEP) that combines the best aspects of six different methods. The algorithm includes the statistical methods of Chou-Fasman, Nagano, and Burgess-Ponnuswamy-Scheraga, the homology method of Nishikawa, the information theory method of Garnier-Osguthorpe-Robson, and the artificial neural network approach of Qian-Sejnowski. Steps in the algorithm are (i) optimizing each individual method with respect to its correlation coefficient (Q7) for assigning a structural type from the predictive score of the method, (ii) weighting each method, (iii) combining the scores from different methods, and (iv) comparing the scores for α-helix, β-strand, and coil conformational states to assign the secondary structure at each residue position. The present application to 45 globular proteins demonstrates good predictive power in cross-validation testing (with average correlation coefficients per test protein of Q7,α = 0.41, Q7,β = 0.47, Q7,c = 0.41 for α-helix, β-strand, and coil conformations). By the criterion of correlation coefficient (Q7) for each type of secondary structure, Q7-JASEP performs better than any of the component methods. When all protein classes are included for training and testing (by cross-validation), the results here equal the best in the literature, by the Q7 criterion. More generally, the basic algorithm can be applied to any protein class and to any type of structure/sequence or function/sequence correlation for which multiple predictive methods exist.
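Steps (ii) to (iv) can be illustrated with a minimal sketch. It assumes each component method supplies, for every residue, a score for each of the three states, and that the joint prediction takes the state with the largest weighted sum. The weights, the score format and the function names are hypothetical, and the correlation coefficient is computed here in the Matthews form, which is an assumption rather than a detail given in the abstract.

```python
import math

STATES = ("H", "E", "C")  # helix, strand, coil


def combine_scores(method_scores, weights):
    """Weighted joint prediction.

    method_scores: one entry per method; each entry is a list with one
    dict per residue mapping state -> score.
    weights: one weight per method (illustrative values, not fitted ones).
    """
    n_residues = len(method_scores[0])
    prediction = []
    for i in range(n_residues):
        totals = {
            s: sum(w * scores[i][s] for w, scores in zip(weights, method_scores))
            for s in STATES
        }
        # Assign the state with the highest combined score at this residue.
        prediction.append(max(STATES, key=lambda s: totals[s]))
    return "".join(prediction)


def correlation_coefficient(observed, predicted, state):
    """Matthews-style correlation for one state; 0.0 if it is undefined."""
    pairs = list(zip(observed, predicted))
    tp = sum(o == state and p == state for o, p in pairs)
    tn = sum(o != state and p != state for o, p in pairs)
    fp = sum(o != state and p == state for o, p in pairs)
    fn = sum(o == state and p != state for o, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


method_a = [{"H": 0.8, "E": 0.1, "C": 0.1},
            {"H": 0.2, "E": 0.6, "C": 0.2},
            {"H": 0.3, "E": 0.3, "C": 0.4}]
method_b = [{"H": 0.6, "E": 0.2, "C": 0.2},
            {"H": 0.5, "E": 0.3, "C": 0.2},
            {"H": 0.1, "E": 0.2, "C": 0.7}]
print(combine_scores([method_a, method_b], [0.7, 0.3]))
print(correlation_coefficient("HHCC", "HCCC", "H"))
```

In this toy input the second method favours helix at the middle residue, but the larger weight on the first method carries the strand assignment, which is the effect the weighting step is meant to have.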

A STUDY OF INTELLIGENT TECHNIQUES FOR PROTEIN SECONDARY STRUCTURE PREDICTION

Protein secondary structure prediction has been and will continue to be a rich research field, because the structure and shape of a protein directly affect its behavior. Moreover, the number of known secondary and tertiary structures is relatively small compared with the number of known primary structures. Although secondary structure prediction started in the seventies, it has remained, together with tertiary structure prediction, an active research topic. This paper presents a technical study of recent methods used for secondary structure prediction from amino acid sequence. The methods are studied along with their accuracy levels. The best-known methods, such as Neural Networks and Support Vector Machines, are presented, along with other techniques. The paper shows different approaches for predicting protein structures, with accuracies ranging from 50% to over 90%. The most commonly used technique is Neural Networks. However, Case Based Reasoning and Mixed Integer Linear Optimization showed the best accuracy among the machine learning techniques, at approximately 83%.