New joint prediction algorithm (Q7-JASEP) improves the prediction of protein secondary structure

The classical problem of secondary structure prediction is approached by a new joint algorithm (Q7-JASEP) that combines the best aspects of six different methods. The algorithm includes the statistical methods of Chou-Fasman, Nagano, and Burgess-Ponnuswamy-Scheraga, the homology method of Nishikawa, the information theory method of Garnier-Osguthorpe-Robson, and the artificial neural network approach of Qian-Sejnowski. Steps in the algorithm are (i) optimizing each individual method with respect to its correlation coefficient (Q7) for assigning a structural type from the predictive score of the method, (ii) weighting each method, (iii) combining the scores from different methods, and (iv) comparing the scores for α-helix, β-strand, and coil conformational states to assign the secondary structure at each residue position. The present application to 45 globular proteins demonstrates good predictive power in cross-validation testing (with average correlation coefficients per test protein of Q7,α = 0.41, Q7,β = 0.47, Q7,c = 0.41 for α-helix, β-strand, and coil conformations). By the criterion of correlation coefficient (Q7) for each type of secondary structure, Q7-JASEP performs better than any of the component methods. When all protein classes are included for training and testing (by cross-validation), the results here equal the best in the literature, by the Q7 criterion. More generally, the basic algorithm can be applied to any protein class and to any type of structure/sequence or function/sequence correlation for which multiple predictive methods exist.
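The four steps listed in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact form of Q7, the weighting scheme, or the score scales, so the sketch makes three assumptions: Q7 is treated as a Matthews-style correlation coefficient, the methods are combined by a weighted linear sum, and the weights are supplied by the caller. All function names, the state labels `H`/`E`/`C`, and the example scores are hypothetical and are not taken from the paper.

```python
import math

STATES = ("H", "E", "C")  # assumed labels: alpha-helix, beta-strand, coil


def correlation(pred, obs, state):
    """Matthews-style correlation coefficient for one state (assumed form of Q7)."""
    pairs = list(zip(pred, obs))
    tp = sum(p == state and o == state for p, o in pairs)
    tn = sum(p != state and o != state for p, o in pairs)
    fp = sum(p == state and o != state for p, o in pairs)
    fn = sum(p != state and o == state for p, o in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_threshold(scores, obs, state, candidates):
    """Step (i): choose the score cutoff that maximizes the correlation for one state."""
    def corr_at(cutoff):
        pred = [state if s >= cutoff else "-" for s in scores]
        return correlation(pred, obs, state)
    return max(candidates, key=corr_at)


def combine(method_scores, weights):
    """Steps (ii)-(iv): weight each method, sum the scores, pick the top state.

    method_scores: {method: [ {state: score} for each residue ]}
    weights:       {method: {state: weight}}  (assumed to be given)
    """
    n_residues = len(next(iter(method_scores.values())))
    assignment = []
    for i in range(n_residues):
        total = {
            s: sum(weights[m][s] * method_scores[m][i][s] for m in method_scores)
            for s in STATES
        }
        assignment.append(max(STATES, key=total.get))
    return "".join(assignment)


# Hypothetical scores from two component methods for a three-residue fragment.
method_scores = {
    "A": [{"H": 0.9, "E": 0.1, "C": 0.2},
          {"H": 0.2, "E": 0.7, "C": 0.3},
          {"H": 0.1, "E": 0.2, "C": 0.8}],
    "B": [{"H": 0.6, "E": 0.3, "C": 0.4},
          {"H": 0.5, "E": 0.4, "C": 0.3},
          {"H": 0.2, "E": 0.3, "C": 0.6}],
}
weights = {"A": {s: 0.7 for s in STATES}, "B": {s: 0.3 for s in STATES}}
joint = combine(method_scores, weights)
```

In this toy example the joint assignment follows method A at the second residue because A carries the larger weight. In a cross-validated setting such as the one described above, the cutoffs and weights would be fitted on the training proteins only and then applied to the held-out test protein.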