Prediction of lipoprotein signal peptides in Gram-negative bacteria

Agnieszka S Juncker et al. Protein Sci. 2003 Aug.

Abstract

A method to predict lipoprotein signal peptides in Gram-negative Eubacteria, LipoP, has been developed. The underlying hidden Markov model (HMM) distinguishes between lipoproteins (SPaseII-cleaved proteins), SPaseI-cleaved proteins, cytoplasmic proteins, and transmembrane proteins. The predictor identified 96.8% of the lipoproteins correctly, with only 0.3% false positives in a set of SPaseI-cleaved, cytoplasmic, and transmembrane proteins. These results were significantly better than those of previously developed methods. Even though lipoprotein signal peptides of Gram-positive bacteria differ from those of Gram-negative bacteria, the HMM identified 92.9% of the lipoproteins in a Gram-positive test set. A genome search was carried out for 12 Gram-negative genomes and one Gram-positive genome. The results for Escherichia coli K12 were compared with new experimental data, and the HMM predictions agree well with the experimentally verified lipoproteins. A neural network-based predictor was developed for comparison, and it gave very similar results. LipoP is available as a Web server at www.cbs.dtu.dk/services/LipoP/.

Figures

Figure 1.

Biosynthesis of a lipoprotein. Lipids are attached to cysteine. Peptides are shown to the left and to the right of the cysteine residue. Catalytic enzymes are written beside reaction arrows.

Figure 2.

Length distribution for lipoprotein signal peptides and for SPaseI-cleaved signal peptides.

Figure 3.

Sequence logos of cleavage sites for SPaseI-cleaved proteins (A) and lipoproteins (B), aligned at the cleavage sites (cleavage is between positions −1 and 1). Sequence logos of the 30 N-terminal residues for SPaseI-cleaved protein precursors (C) and lipoprotein precursors (D). A logo displays the amino acid conservation at each position as the information content measured in bits (Schneider and Stephens 1990). Black indicates hydrophobic amino acids (AA); green, neutral/polar AA; blue, positive AA; and red, negative AA.
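The information content plotted in a sequence logo can be illustrated with a minimal Python sketch. It assumes a uniform background over the 20 standard amino acids and omits the small-sample correction described by Schneider and Stephens; the function names and the toy alignment are hypothetical and not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_information(column):
    """Information content (bits) of one alignment column.

    Assumes a uniform background and applies no small-sample correction.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(AMINO_ACIDS)) - entropy

def logo_information(aligned_sequences):
    """Per-position information content for equal-length aligned sequences."""
    return [column_information(col) for col in zip(*aligned_sequences)]

# Hypothetical toy alignment ending at an invariant cysteine (illustration only).
toy = ["LAGC", "LSGC", "LAAC", "ITGC"]
info = logo_information(toy)
for position, bits in enumerate(info, start=1):
    print(f"column {position}: {bits:.2f} bits")
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, while a column with evenly mixed residues approaches zero.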

Figure 4.

Correlation coefficient as a function of window size and number of hidden neurons.

Figure 5.

Correlation coefficient and fraction of true positives and true negatives as a function of the threshold.
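A threshold scan of this kind could be computed along the following lines. This is a sketch under assumptions: it takes the correlation coefficient to be the Matthews correlation coefficient and reads the fractions of true positives and true negatives as sensitivity and specificity, neither of which is confirmed by the text here. The scores, labels, and function names are invented for illustration.

```python
import math

def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted = score >= threshold
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and not is_positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def threshold_scan(scores, labels, thresholds):
    """Return (threshold, MCC, sensitivity, specificity) for each threshold."""
    rows = []
    for t in thresholds:
        tp, fp, tn, fn = confusion(scores, labels, t)
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        rows.append((t, matthews_cc(tp, fp, tn, fn), sensitivity, specificity))
    return rows

# Hypothetical network outputs and true labels (illustration only).
scores = [0.9, 0.8, 0.2, 0.1]
labels = [True, True, False, False]
rows = threshold_scan(scores, labels, [0.0, 0.5])
```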

Figure 6.

The architecture of the SPaseI and SPaseII models. N-states model the n-region; H-states model the h-region; C-states and A-states model the regions before and after the cleavage site, respectively; and M-states model the remaining residues. All N-states except N1 are tied, all H-states are tied, states C7–C9 are tied, and all M-states are tied. Dashed transitions and light gray states are present only in the model of SPaseI-cleaved signal peptides, and dotted transitions and dark gray states are present only in the model of lipoproteins.

Figure 7.

HMM performance as a function of score difference. (Top) The fraction of correct predictions as a function of score difference. (Bottom) The fraction of sequences wrongly predicted as signal peptides or lipoproteins.
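One way to use a score difference as a confidence measure is sketched below. It assumes that each sequence receives one score per class and that the score difference is the margin between the best and second-best class; the caption does not define the quantity, so this reading, the class labels, the scores, and the cutoff are all hypothetical.

```python
def classify_with_margin(class_scores, min_margin=0.0):
    """Pick the best-scoring class and report its margin over the runner-up.

    class_scores maps a class label to a score (higher is better) and must
    contain at least two classes. Returns (label, margin, is_confident).
    """
    ranked = sorted(class_scores.items(), key=lambda item: item[1], reverse=True)
    (best_label, best_score), (_, second_score) = ranked[0], ranked[1]
    margin = best_score - second_score
    return best_label, margin, margin >= min_margin

# Hypothetical per-class scores for one sequence (illustration only).
example_scores = {"lipoprotein": 12.0, "SPaseI": 5.5, "cytoplasmic": -0.2, "transmembrane": 1.0}
result = classify_with_margin(example_scores, min_margin=3.0)
print(result)
```

Raising `min_margin` would trade coverage for reliability: fewer sequences receive a confident call, but those that do are less likely to be wrong.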

Figure 8.

Histogram of cleavage site prediction errors.
