Incorporating distant sequence features and radial basis function networks to identify ubiquitin conjugation sites
Tzong-Yi Lee et al. PLoS One. 2011.
Abstract
Ubiquitin (Ub) is a small protein that consists of 76 amino acids and has a molecular mass of about 8.5 kDa. In ubiquitin conjugation, ubiquitin is mainly conjugated to a lysine residue of a protein by Ub-ligating (E3) enzymes. Three major enzymes participate in ubiquitin conjugation: E1, E2 and E3, which are responsible for activating, conjugating and ligating ubiquitin, respectively. Ubiquitin conjugation in eukaryotes is an important mechanism for the proteasome-mediated degradation of proteins and for regulating the activity of transcription factors. Motivated by the importance of ubiquitin conjugation in biological processes, this investigation develops a method, UbSite, which uses an efficient radial basis function (RBF) network to identify protein ubiquitin conjugation (ubiquitylation) sites. This work investigates not only the amino acid composition but also the structural characteristics, physicochemical properties, and evolutionary information of amino acids around ubiquitylation (Ub) sites. With reference to the pathway of ubiquitin conjugation, the substrate sites for E3 recognition, which are distant from ubiquitylation sites, are investigated. The measurement of F-score in a large window (−20 to +20) revealed statistically significant amino acid composition and position-specific scoring matrix (evolutionary information) features, which are mainly located distant from Ub sites. This distant information can be used effectively to differentiate Ub sites from non-Ub sites. As determined by five-fold cross-validation, the model trained using the combination of amino acid composition and evolutionary information performs best in identifying ubiquitin conjugation sites. The prediction sensitivity, specificity, and accuracy are 65.5%, 74.8%, and 74.5%, respectively.
Although the amino acid sequences around the ubiquitin conjugation sites do not contain conserved motifs, the cross-validation result indicates that the integration of distant sequence features of Ub sites can improve predictive performance. Additionally, the independent test demonstrates that the proposed method can outperform other ubiquitylation prediction tools.
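The evaluation protocol named in the abstract (five-fold cross-validation reported as sensitivity, specificity, and accuracy) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the interleaved fold assignment, and the toy labels are assumptions made for illustration, and the label convention (1 = Ub site, 0 = non-Ub site) is likewise assumed.

```python
def sensitivity_specificity_accuracy(y_true, y_pred):
    """Compute (sensitivity, specificity, accuracy) for binary labels.

    Assumed convention: 1 = ubiquitylation site, 0 = non-ubiquitylation site.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # recall on Ub sites
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # recall on non-Ub sites
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k interleaved folds.

    Interleaving (sample i goes to fold i mod k) is an illustrative choice;
    the source does not say how the folds were drawn.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, test


# Toy labels and predictions (made up, not data from the paper).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(sensitivity_specificity_accuracy(toy_true, toy_pred))
```

Because accuracy is a class-weighted average of sensitivity and specificity, a reported accuracy that sits close to the specificity (as here, 74.5% versus 74.8%) is what one would expect when non-Ub sites greatly outnumber Ub sites in the evaluation set.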
Conflict of interest statement
Competing Interests: The authors have declared that no competing interests exist.
Figures
Figure 1. The analytic flowchart of UbSite.
Figure 2. The detailed process of generating position specific scoring matrix (PSSM) and encoding the fragment of amino acid sequence by generated PSSM.
Figure 3. The position-specific amino acid composition, accessible surface area and secondary structure of ubiquitin conjugated lysines and non-ubiquitin conjugated lysines.
Figure 4. The hypothetical model of identifying the distant sequence features for E3 recognition.
Figure 5. The statistically significant composition of amino acids for each position in the window length from −20 to +20.
Based on the measurement of F-score, the positions −16, −10, −3, −1, +1, +5, +10, +13, and +17, which have higher F-score values, are significant for differentiating the ubiquitylation sites from non-ubiquitylation sites.
Figure 6. The statistically significant evolutionary information of amino acids for each position in the window length from −20 to +20.
Based on the measurement of F-score, the positions −19, −17, −15, −12, −10, −4, −1, +5, +9, +13, +15 and +18, which have higher F-score values, are significant for differentiating the ubiquitylation sites from non-ubiquitylation sites.
Figure 7. The predictive performance of the models trained with different window lengths, varying from 11-mer to 41-mer.
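The window-based analysis summarized in Figures 5 to 7 can be sketched as follows: cut a fixed-length window centered on each lysine, then score how well a feature at a given offset separates Ub sites from non-Ub sites. The page does not state the F-score formula, so this sketch assumes the commonly used definition (squared distances of the class means from the overall mean, divided by the sum of the sample variances of the two classes). The one-hot residue indicator, the "-" padding character, and all function names are also illustrative assumptions rather than the authors' implementation.

```python
from statistics import mean, variance


def lysine_windows(sequence, flank=20, pad="-"):
    """Return (1-based position, window) for every lysine (K) in the sequence.

    Each window has length 2 * flank + 1 with the lysine at its center;
    termini are padded with `pad` (an assumed convention).
    """
    padded = pad * flank + sequence + pad * flank
    return [
        (i + 1, padded[i : i + 2 * flank + 1])
        for i, residue in enumerate(sequence)
        if residue == "K"
    ]


def f_score(pos_values, neg_values):
    """F-score of one feature under the assumed definition.

    Numerator: squared distance of each class mean from the overall mean.
    Denominator: sum of the sample variances (n - 1) of the two classes.
    Each class needs at least two values.
    """
    overall = mean(list(pos_values) + list(neg_values))
    numerator = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    denominator = variance(pos_values) + variance(neg_values)
    if denominator == 0:
        # No within-class spread: perfectly separating or constant feature.
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def position_f_score(pos_windows, neg_windows, offset, residue, flank):
    """F-score of the indicator 'residue occurs at this offset from the lysine'."""
    idx = flank + offset  # offset 0 is the central lysine
    pos = [1.0 if w[idx] == residue else 0.0 for w in pos_windows]
    neg = [1.0 if w[idx] == residue else 0.0 for w in neg_windows]
    return f_score(pos, neg)


# Toy example with a 5-mer window (flank=2); the sequence is made up.
print(lysine_windows("MKAEK", flank=2))
```

Setting `flank` from 5 to 20 yields the 11-mer to 41-mer windows compared in Figure 7; ranking offsets by `position_f_score` is one way to obtain a per-position significance profile of the kind the captions for Figures 5 and 6 describe.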