Prediction of S-sulfenylation sites using mRMR feature selection and fuzzy support vector machine algorithm

Zhe Ju et al. J Theor Biol. 2018.

Abstract

Cysteine S-sulfenylation is an important protein post-translational modification that plays a crucial role in transcriptional regulation, cell signaling, and protein function. To better elucidate the molecular mechanism of S-sulfenylation, it is important to identify S-sulfenylated substrates and their corresponding S-sulfenylation sites accurately. In this study, a novel bioinformatics tool named Sulf_FSVM is proposed to predict S-sulfenylation sites using multiple feature extraction methods and a fuzzy support vector machine algorithm. First, amino acid factors, binary encoding, and the composition of k-spaced amino acid pairs are combined to encode S-sulfenylation sites, and the maximum relevance minimum redundancy (mRMR) method is adopted to remove redundant features. Second, a fuzzy support vector machine algorithm is used to handle the class imbalance and noise problems in the S-sulfenylation site training dataset. In 10-fold cross-validation, Sulf_FSVM achieves satisfactory performance, with a sensitivity of 73.26%, a specificity of 70.78%, an accuracy of 71.07%, and a Matthews correlation coefficient of 0.2971. Independent tests also show that Sulf_FSVM significantly outperforms existing S-sulfenylation site predictors. Therefore, Sulf_FSVM can be a useful tool for accurate prediction of protein S-sulfenylation sites.
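
Two of the encodings named in the abstract (binary encoding and the composition of k-spaced amino acid pairs) and the four reported metrics can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 21-residue example window, the gap range `k_max=2`, and the all-zero treatment of unknown residues are assumptions made for illustration, and the amino acid factor features, mRMR selection, and fuzzy SVM training are omitted.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def binary_encoding(peptide):
    """One-hot encode each residue (20 values per position).

    Assumption: symbols outside the 20 standard residues, such as an
    'X' padding character, are encoded as all zeros.
    """
    vector = []
    for residue in peptide:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def cksaap(peptide, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    For each gap k, count every pair (peptide[i], peptide[i + k + 1]) and
    divide by the number of pair positions available for that gap.
    Returns 400 * (k_max + 1) values.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        n_positions = len(peptide) - k - 1
        for i in range(max(n_positions, 0)):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # skip pairs containing unknown symbols
                counts[pair] += 1
        features.extend(
            counts[p] / n_positions if n_positions > 0 else 0.0 for p in pairs
        )
    return features


def evaluate(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc


if __name__ == "__main__":
    # Hypothetical 21-residue window centred on a cysteine (not from the paper).
    window = "MKTAYIAKQRCISFVKSHFSR"
    features = binary_encoding(window) + cksaap(window, k_max=2)
    print(len(features))  # 21 * 20 binary values + 3 * 400 pair frequencies
```

In a pipeline of the kind the abstract describes, a combined vector like this would be passed to a feature selection step before classifier training; how the authors weight or order the feature groups is not stated in the abstract.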

Keywords: Cross-validation; Feature extraction; K-spaced amino acid pairs; Post-translational modification; SVM.

Copyright © 2018 Elsevier Ltd. All rights reserved.
