Incorporating distant sequence features and radial basis function networks to identify ubiquitin conjugation sites

Tzong-Yi Lee et al. PLoS One. 2011.

Abstract

Ubiquitin (Ub) is a small protein of 76 amino acids (about 8.5 kDa). In ubiquitin conjugation, ubiquitin is mainly conjugated to lysine residues of the substrate protein by Ub-ligating (E3) enzymes. Three major enzymes participate in ubiquitin conjugation: E1, E2 and E3, which are responsible for activating, conjugating and ligating ubiquitin, respectively. Ubiquitin conjugation in eukaryotes is an important mechanism in the proteasome-mediated degradation of proteins and in regulating the activity of transcription factors. Motivated by the importance of ubiquitin conjugation in biological processes, this investigation develops a method, UbSite, which uses an efficient radial basis function (RBF) network to identify protein ubiquitin conjugation (ubiquitylation) sites. This work investigates not only the amino acid composition but also the structural characteristics, physicochemical properties, and evolutionary information of amino acids around ubiquitylation (Ub) sites. With reference to the pathway of ubiquitin conjugation, the substrate sites for E3 recognition, which are distant from ubiquitylation sites, are investigated. F-score measurements over a large window (−20 to +20) revealed statistically significant amino acid composition and position-specific scoring matrix (PSSM; evolutionary information) features, located mainly at positions distant from Ub sites. This distant information can be used effectively to differentiate Ub sites from non-Ub sites. As determined by five-fold cross-validation, the model trained on the combination of amino acid composition and evolutionary information performs best in identifying ubiquitin conjugation sites. The prediction sensitivity, specificity, and accuracy are 65.5%, 74.8%, and 74.5%, respectively.
Although the amino acid sequences around the ubiquitin conjugation sites do not contain conserved motifs, the cross-validation result indicates that the integration of distant sequence features of Ub sites can improve predictive performance. Additionally, the independent test demonstrates that the proposed method can outperform other ubiquitylation prediction tools.
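The three reported measures follow from the confusion-matrix counts of a binary Ub / non-Ub classifier. The sketch below shows the standard definitions only. The function name and the counts are hypothetical and chosen for illustration; they are not taken from the paper.

```python
def sensitivity_specificity_accuracy(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true Ub sites that were predicted as Ub
    specificity = tn / (tn + fp)  # fraction of non-Ub sites that were predicted as non-Ub
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all sites predicted correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not the paper's data).
sn, sp, acc = sensitivity_specificity_accuracy(tp=60, fn=40, tn=150, fp=50)
```

Because accuracy pools both classes, it lies between sensitivity and specificity and sits closer to whichever class has more examples.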

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. The analytic flowchart of UbSite.

Figure 2. The detailed process of generating the position-specific scoring matrix (PSSM) and encoding a fragment of the amino acid sequence with the generated PSSM.
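One plausible reading of the encoding step in this caption is sketched below: the PSSM rows for the residues in a window around the candidate lysine are concatenated into one feature vector. The function name, the zero padding at the protein termini, and the toy matrix are assumptions made for illustration; the figure itself may define these details differently.

```python
def encode_fragment(pssm, center, half_window):
    """Concatenate PSSM rows for positions center-half_window .. center+half_window.

    pssm: list of rows, one per residue, each a list of per-amino-acid scores.
    Positions outside the protein are padded with zeros (an assumption).
    """
    n_cols = len(pssm[0])
    features = []
    for pos in range(center - half_window, center + half_window + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0] * n_cols)
    return features


# Toy PSSM for a 5-residue protein; row i holds 20 copies of the placeholder score i + 1.
toy_pssm = [[i + 1] * 20 for i in range(5)]
vector = encode_fragment(toy_pssm, center=1, half_window=2)
```

Here the window covers positions −1 to 3 (0-based), so the vector has 5 × 20 = 100 entries and the first 20 are padding.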

Figure 3. The position-specific amino acid composition, accessible surface area and secondary structure of ubiquitin-conjugated lysines and non-ubiquitin-conjugated lysines.

Figure 4. The hypothetical model for identifying the distant sequence features for E3 recognition.

Figure 5. The statistically significant composition of amino acids for each position in the window from −20 to +20.

Based on the F-score, positions −16, −10, −3, −1, +1, +5, +10, +13, and +17 have higher F-score values and are therefore significant for differentiating ubiquitylation sites from non-ubiquitylation sites.
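The page does not give the F-score formula. The sketch below assumes the F-score commonly used for feature selection, which divides the squared distance of each class mean from the overall mean by the sum of the within-class sample variances. The function name and the toy feature values (for example, an indicator of whether a given amino acid occurs at one window position) are hypothetical.

```python
def f_score(pos_values, neg_values):
    """F-score of one feature, given its values in positive and negative samples."""
    n_pos, n_neg = len(pos_values), len(neg_values)
    mean_pos = sum(pos_values) / n_pos
    mean_neg = sum(neg_values) / n_neg
    mean_all = (sum(pos_values) + sum(neg_values)) / (n_pos + n_neg)

    # Between-class term: how far each class mean lies from the overall mean.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2

    # Within-class term: sum of the two sample variances (n - 1 in the denominator).
    var_pos = sum((x - mean_pos) ** 2 for x in pos_values) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg_values) / (n_neg - 1)
    within = var_pos + var_neg

    return between / within if within > 0 else 0.0


# Toy indicator feature: 1 if a chosen residue occurs at a given position, else 0.
informative = f_score([1, 1, 0, 1], [0, 0, 1, 0])
uninformative = f_score([1, 0, 1, 0], [0, 1, 0, 1])
```

A feature whose class means coincide scores zero, while a larger value indicates that the position separates Ub from non-Ub sites more clearly.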

Figure 6. The statistically significant evolutionary information of amino acids for each position in the window from −20 to +20.

Based on the F-score, positions −19, −17, −15, −12, −10, −4, −1, +5, +9, +13, +15 and +18 have higher F-score values and are therefore significant for differentiating ubiquitylation sites from non-ubiquitylation sites.

Figure 7. The predictive performance of the models trained with window lengths varying from 11-mer to 41-mer.
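Comparing window lengths requires cutting a fragment of each length around every candidate lysine. The sketch below is a minimal illustration of that step, assuming odd window lengths centred on the lysine and a "-" padding character at the protein termini. The function name, the padding symbol, and the toy sequence are hypothetical.

```python
def lysine_windows(sequence, window_length, pad="-"):
    """Return (1-based position, fragment) for every lysine (K) in the sequence."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd so the lysine sits at the centre")
    half = window_length // 2
    # Pad both ends so lysines near a terminus still yield a full-length fragment.
    padded = pad * half + sequence + pad * half
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Residue i of the original sequence sits at index i + half in padded.
            windows.append((i + 1, padded[i:i + window_length]))
    return windows


# Toy sequence for illustration only.
fragments = lysine_windows("MKAAK", window_length=5)

# Fragments for each odd window length from 11-mer to 41-mer.
by_length = {w: lysine_windows("MKAAK", w) for w in range(11, 42, 2)}
```

Each fragment can then be encoded (for example by amino acid composition or PSSM rows) and a separate model trained per window length.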
