Yubei Zhu - Profile on Academia.edu
Papers by Yubei Zhu
Prediction of Interaction Between Enzymes and Small Molecules in Metabolic Pathways Through Integrating Multiple Classifiers
Protein & Peptide Letters, 2010
Information about interactions between enzymes and small molecules is important for understanding various metabolic bioprocesses. In this article we applied a majority voting system to predict the interactions between enzymes and small molecules in metabolic pathways by combining several classifiers, including AdaBoost, Bagging and KNN. The strategy rests on the principle that a predictor based on majority voting usually provides more reliable results than any single classifier. The prediction accuracies obtained on a training dataset and an independent testing dataset were 82.8% and 84.8%, respectively. The prediction accuracy for the networking couples in the independent testing dataset was 75.5%, which is about 4% higher than that reported in a previous study. The web server for the prediction method presented in this paper is available at http://chemdata.shu.edu.cn/small-enz.
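The majority voting idea described in this abstract can be illustrated with a minimal sketch. The three rule-based functions below are hypothetical stand-ins for trained classifiers (they are not AdaBoost, Bagging or KNN, and are not the authors' models); only the voting step is the point of the example.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A classifier here is any function mapping a feature vector to a class label.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: List[Classifier], sample: Sequence[float]) -> int:
    """Return the label predicted by most of the classifiers for one sample."""
    votes = [clf(sample) for clf in classifiers]
    # most_common(1) yields the (label, count) pair with the highest count.
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-in classifiers (simple threshold rules, for illustration only).
def clf_a(x: Sequence[float]) -> int:
    return 1 if x[0] > 0.5 else 0


def clf_b(x: Sequence[float]) -> int:
    return 1 if x[1] > 0.5 else 0


def clf_c(x: Sequence[float]) -> int:
    return 1 if x[0] + x[1] > 1.0 else 0


ensemble = [clf_a, clf_b, clf_c]
interacting = majority_vote(ensemble, [0.9, 0.2])      # votes: 1, 0, 1
non_interacting = majority_vote(ensemble, [0.1, 0.6])  # votes: 0, 1, 0
```

Using an odd number of binary classifiers, as here, avoids tied votes.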
Chemometrics and Intelligent Laboratory Systems, 2011
Correctly predicting the site of O-glycosylation will greatly benefit the search for and design of new specific and efficient GalNAc-transferase inhibitors. In this article, the site of O-glycosylation was studied using the correlation-based feature subset (CfsSubset) selection method combined with a wrapper method. Twenty-three important biochemical features were identified, based on a jackknife test, from an original data set containing 4779 features. Using the AdaBoost method with the twenty-three selected features, the prediction model yields an accuracy of 88.1% for the jackknife test and 87.5% for an independent set test, improvements of 8.5% and 10.42%, respectively, over the original data set. It is expected that our feature selection scheme can serve as a useful auxiliary technique for finding effective competitive inhibitors of GalNAc-transferase. An online predictor based on this research is also available.
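The jackknife test mentioned in this abstract is a leave-one-out evaluation: each sample is held out in turn, the model is built on the remaining samples, and the held-out sample is predicted. The sketch below shows only that evaluation loop. It substitutes a simple 1-nearest-neighbour rule for AdaBoost and uses a made-up toy data set, so every name and value in it is an assumption for illustration, not the authors' pipeline.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)
Predictor = Callable[[List[Sample], Sequence[float]], int]


def nearest_neighbor_predict(train: List[Sample], x: Sequence[float]) -> int:
    """Stand-in classifier: label of the closest training sample (squared Euclidean)."""
    closest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return closest[1]


def jackknife_accuracy(data: List[Sample], predict: Predictor) -> float:
    """Leave-one-out accuracy: hold out each sample once and predict it from the rest."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every sample except the i-th
        if predict(train, x) == y:
            correct += 1
    return correct / len(data)


# Toy one-feature data set (hypothetical values): two well-separated classes.
toy_data: List[Sample] = [
    ([0.0], 0), ([0.1], 0), ([0.2], 0),
    ([1.0], 1), ([1.1], 1), ([1.2], 1),
]
accuracy = jackknife_accuracy(toy_data, nearest_neighbor_predict)
```

Any other classifier can be passed in as `predict`, provided it takes the training samples and one feature vector and returns a label.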
Biochemical and Biophysical Research Communications, 2009
Protein-protein interactions (PPIs) play a crucial role in various biological processes. To better understand the pathogenesis and treatment of various diseases, it is necessary to learn the details of these interactions. However, current experimental methods still suffer from many false positives and false negatives. Computational prediction of protein-protein interactions has therefore become an increasingly important approach that can overcome the obstacles of the experimental methods. In this work, we proposed a novel computational domain-based method for PPI prediction, and an SVM model for the prediction was built based on the physicochemical properties of the domains. The outcomes of the SVM and the domain-domain score were used to construct the prediction model for protein-protein interaction. The predicted results demonstrated that domain-based research can enhance the ability to predict protein interactions.