Gibbs motif sampling: detection of bacterial outer membrane protein repeats

Abstract

The detection and alignment of locally conserved regions (motifs) in multiple sequences can provide insight into protein structure, function, and evolution. A new Gibbs sampling algorithm is described that detects motif-encoding regions in sequences and optimally partitions them into distinct motif models; this is illustrated using a set of immunoglobulin fold proteins. When applied to sequences sharing a single motif, the sampler can be used to classify motif regions into related submodels, as is illustrated using helix-turn-helix DNA-binding proteins. Other statistically based procedures are described for searching a database for sequences matching motifs found by the sampler. When applied to a set of 32 very distantly related bacterial integral outer membrane proteins, the sampler revealed that they share a subtle, repetitive motif. Although BLAST (Altschul SF et al., 1990, J Mol Biol 215:403-410) fails to detect significant pairwise similarity between any of the sequences, the repeats present in these outer membrane proteins, taken as a whole, are highly significant (based on a generally applicable statistical test for motifs described here). Analysis of bacterial porins with known trimeric beta-barrel structure and related proteins reveals a similar repetitive motif corresponding to alternating membrane-spanning beta-strands. These beta-strands occur on the membrane interface (as opposed to the trimeric interface) of the beta-barrel. The broad conservation and structural location of these repeats suggest that they play important functional roles.
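As a rough illustration of the general idea behind Gibbs sampling for motif detection, the sketch below implements a basic site sampler in the spirit of Lawrence et al. (1993), which assumes exactly one motif occurrence per sequence and a fixed motif width. It is not the motif sampler described in this article, which also handles multiple motif models and repeated sites. The function name `gibbs_site_sampler`, its parameters, and the pseudocount value are all hypothetical choices made for this example.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def gibbs_site_sampler(seqs, width, iters=200, pseudo=0.5, seed=0):
    """Sample one motif start per sequence (simplified site sampler).

    Assumes every sequence is at least `width` long and uses only
    letters from ALPHABET. Returns a list of 0-based start positions.
    """
    rng = random.Random(seed)

    # Background residue frequencies, smoothed with pseudocounts.
    total = sum(len(s) for s in seqs)
    bg = {
        a: (sum(s.count(a) for s in seqs) + pseudo)
        / (total + pseudo * len(ALPHABET))
        for a in ALPHABET
    }

    # Start from a random alignment: one site per sequence.
    pos = [rng.randrange(len(s) - width + 1) for s in seqs]
    denom = (len(seqs) - 1) + pseudo * len(ALPHABET)

    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Position-specific counts built from all sequences except i.
            counts = [dict.fromkeys(ALPHABET, pseudo) for _ in range(width)]
            for j, t in enumerate(seqs):
                if j != i:
                    for k in range(width):
                        counts[k][t[pos[j] + k]] += 1

            # Weight each candidate start in sequence i by the ratio of
            # motif-model likelihood to background likelihood.
            weights = []
            for start in range(len(s) - width + 1):
                logw = sum(
                    math.log(counts[k][s[start + k]] / denom / bg[s[start + k]])
                    for k in range(width)
                )
                weights.append(math.exp(logw))

            # Draw the new site for sequence i in proportion to its weight.
            pos[i] = rng.choices(range(len(weights)), weights=weights)[0]

    return pos
```

Calling `gibbs_site_sampler(["ACDEFWWWWWGHIK", "WWWWWLMNPQRSTV", "KLMNWWWWWACDEF"], width=5)` returns one candidate start position per sequence. Because the procedure is stochastic, a real analysis would typically run it from several seeds and compare the resulting alignments rather than trust a single run.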

Selected References

These references are in PubMed. This may not be the complete list of references from this article.

  1. Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J. Basic local alignment search tool. J Mol Biol. 1990 Oct 5;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
  2. Bairoch A., Boeckmann B. The SWISS-PROT protein sequence data bank. Nucleic Acids Res. 1992 May 11;20 (Suppl):2019–2022. doi: 10.1093/nar/20.suppl.2019.
  3. Baldi P., Chauvin Y., Hunkapiller T., McClure M. A. Hidden Markov models of biological primary sequence information. Proc Natl Acad Sci U S A. 1994 Feb 1;91(3):1059–1063. doi: 10.1073/pnas.91.3.1059.
  4. Barker W. C., George D. G., Mewes H. W., Pfeiffer F., Tsugita A. The PIR-International databases. Nucleic Acids Res. 1993 Jul 1;21(13):3089–3092. doi: 10.1093/nar/21.13.3089.
  5. Bennett P. B., Jr., Makita N., George A. L., Jr. A molecular basis for gating mode transitions in human skeletal muscle Na+ channels. FEBS Lett. 1993 Jul 12;326(1-3):21–24. doi: 10.1016/0014-5793(93)81752-l.
  6. Benson D., Lipman D. J., Ostell J. GenBank. Nucleic Acids Res. 1993 Jul 1;21(13):2963–2965. doi: 10.1093/nar/21.13.2963.
  7. Bork P., Holm L., Sander C. The immunoglobulin fold. Structural classification, sequence patterns and common core. J Mol Biol. 1994 Sep 30;242(4):309–320. doi: 10.1006/jmbi.1994.1582.
  8. Bosch D., Scholten M., Verhagen C., Tommassen J. The role of the carboxy-terminal membrane-spanning fragment in the biogenesis of Escherichia coli K12 outer membrane protein PhoE. Mol Gen Genet. 1989 Mar;216(1):144–148. doi: 10.1007/BF00332243.
  9. Brennan R. G., Matthews B. W. The helix-turn-helix DNA binding motif. J Biol Chem. 1989 Feb 5;264(4):1903–1906.
  10. Cowan S. W., Schirmer T., Rummel G., Steiert M., Ghosh R., Pauptit R. A., Jansonius J. N., Rosenbusch J. P. Crystal structures explain functional properties of two E. coli porins. Nature. 1992 Aug 27;358(6389):727–733. doi: 10.1038/358727a0.
  11. Gallegos M. T., Michán C., Ramos J. L. The XylS/AraC family of regulators. Nucleic Acids Res. 1993 Feb 25;21(4):807–810. doi: 10.1093/nar/21.4.807.
  12. Gribskov M., Lüthy R., Eisenberg D. Profile analysis. Methods Enzymol. 1990;183:146–159. doi: 10.1016/0076-6879(90)83011-w.
  13. Gribskov M., McLachlan A. D., Eisenberg D. Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci U S A. 1987 Jul;84(13):4355–4358. doi: 10.1073/pnas.84.13.4355.
  14. Harpaz Y., Chothia C. Many of the immunoglobulin superfamily domains in cell adhesion molecules and surface receptors belong to a new structural set which is close to that containing variable domains. J Mol Biol. 1994 May 13;238(4):528–539. doi: 10.1006/jmbi.1994.1312.
  15. Henikoff S., Henikoff J. G. Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A. 1992 Nov 15;89(22):10915–10919. doi: 10.1073/pnas.89.22.10915.
  16. Henikoff S., Henikoff J. G. Automated assembly of protein blocks for database searching. Nucleic Acids Res. 1991 Dec 11;19(23):6565–6572. doi: 10.1093/nar/19.23.6565.
  17. Henikoff S., Henikoff J. G. Protein family classification based on searching a database of blocks. Genomics. 1994 Jan 1;19(1):97–107. doi: 10.1006/geno.1994.1018.
  18. Hunkapiller T., Hood L. The growing immunoglobulin gene superfamily. Nature. 1986 Sep 4;323(6083):15–16. doi: 10.1038/323015a0.
  19. Jap B. K., Walian P. J., Gehring K. Structural architecture of an outer membrane channel as determined by electron crystallography. Nature. 1991 Mar 14;350(6314):167–170. doi: 10.1038/350167a0.
  20. Jeanteur D., Lakey J. H., Pattus F. The bacterial porin superfamily: sequence alignment and structure prediction. Mol Microbiol. 1991 Sep;5(9):2153–2164. doi: 10.1111/j.1365-2958.1991.tb02145.x.
  21. Jin S., Sonenshein A. L. Identification of two distinct Bacillus subtilis citrate synthase genes. J Bacteriol. 1994 Aug;176(15):4669–4679. doi: 10.1128/jb.176.15.4669-4679.1994.
  22. Kaufmann A., Stierhof Y. D., Henning U. New outer membrane-associated protease of Escherichia coli K-12. J Bacteriol. 1994 Jan;176(2):359–367. doi: 10.1128/jb.176.2.359-367.1994.
  23. Kreusch A., Neubüser A., Schiltz E., Weckesser J., Schulz G. E. Structure of the membrane channel porin from Rhodopseudomonas blastica at 2.0 Å resolution. Protein Sci. 1994 Jan;3(1):58–63. doi: 10.1002/pro.5560030108.
  24. Krogh A., Brown M., Mian I. S., Sjölander K., Haussler D. Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol. 1994 Feb 4;235(5):1501–1531. doi: 10.1006/jmbi.1994.1104.
  25. Lawrence C. E., Altschul S. F., Boguski M. S., Liu J. S., Neuwald A. F., Wootton J. C. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science. 1993 Oct 8;262(5131):208–214. doi: 10.1126/science.8211139.
  26. Mackett M., Conway M. J., Arrand J. R., Haddad R. S., Hutt-Fletcher L. M. Characterization and expression of a glycoprotein encoded by the Epstein-Barr virus BamHI I fragment. J Virol. 1990 Jun;64(6):2545–2552. doi: 10.1128/jvi.64.6.2545-2552.1990.
  27. Morona R., Klose M., Henning U. Escherichia coli K-12 outer membrane protein (OmpA) as a bacteriophage receptor: analysis of mutant genes expressing altered proteins. J Bacteriol. 1984 Aug;159(2):570–578. doi: 10.1128/jb.159.2.570-578.1984.
  28. Needleman S. B., Wunsch C. D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970 Mar;48(3):443–453. doi: 10.1016/0022-2836(70)90057-4.
  29. Neuwald A. F., Green P. Detecting patterns in protein sequences. J Mol Biol. 1994 Jun 24;239(5):698–712. doi: 10.1006/jmbi.1994.1407.
  30. Nikaido H. Porins and specific channels of bacterial outer membranes. Mol Microbiol. 1992 Feb;6(4):435–442. doi: 10.1111/j.1365-2958.1992.tb01487.x.
  31. Nikaido H. Porins and specific diffusion channels in bacterial outer membranes. J Biol Chem. 1994 Feb 11;269(6):3905–3908.
  32. Pohlner J., Halter R., Beyreuther K., Meyer T. F. Gene structure and extracellular secretion of Neisseria gonorrhoeae IgA protease. Nature. 1987 Jan 29-Feb 4;325(6103):458–462. doi: 10.1038/325458a0.
  33. Schirmer T., Cowan S. W. Prediction of membrane-spanning beta-strands and its application to maltoporin. Protein Sci. 1993 Aug;2(8):1361–1363. doi: 10.1002/pro.5560020820.
  34. Smith T. F., Waterman M. S. Identification of common molecular subsequences. J Mol Biol. 1981 Mar 25;147(1):195–197. doi: 10.1016/0022-2836(81)90087-5.
  35. Staden R. Methods for calculating the probabilities of finding patterns in sequences. Comput Appl Biosci. 1989 Apr;5(2):89–96. doi: 10.1093/bioinformatics/5.2.89.
  36. Stout V., Torres-Cabassa A., Maurizi M. R., Gutnick D., Gottesman S. RcsA, an unstable positive regulator of capsular polysaccharide synthesis. J Bacteriol. 1991 Mar;173(5):1738–1747. doi: 10.1128/jb.173.5.1738-1747.1991.
  37. Struyvé M., Moons M., Tommassen J. Carboxy-terminal phenylalanine is essential for the correct assembly of a bacterial outer membrane protein. J Mol Biol. 1991 Mar 5;218(1):141–148. doi: 10.1016/0022-2836(91)90880-f.
  38. Treisman J., Harris E., Wilson D., Desplan C. The homeodomain: a new face for the helix-turn-helix? Bioessays. 1992 Mar;14(3):145–150. doi: 10.1002/bies.950140302.
  39. Viale A. M., Kobayashi H., Akazawa T., Henikoff S. rbcR [correction of rcbR], a gene coding for a member of the LysR family of transcriptional regulators, is located upstream of the expressed set of ribulose 1,5-bisphosphate carboxylase/oxygenase genes in the photosynthetic bacterium Chromatium vinosum. J Bacteriol. 1991 Aug;173(16):5224–5229. doi: 10.1128/jb.173.16.5224-5229.1991.
  40. Vogel H., Jähnig F. Models for the structure of outer-membrane proteins of Escherichia coli derived from Raman spectroscopy and prediction methods. J Mol Biol. 1986 Jul 20;190(2):191–199. doi: 10.1016/0022-2836(86)90292-5.
  41. Weickert M. J., Adhya S. A family of bacterial regulators homologous to Gal and Lac repressors. J Biol Chem. 1992 Aug 5;267(22):15869–15874.
  42. Weiss M. S., Wacker T., Weckesser J., Welte W., Schulz G. E. The three-dimensional structure of porin from Rhodobacter capsulatus at 3 Å resolution. FEBS Lett. 1990 Jul 16;267(2):268–272. doi: 10.1016/0014-5793(90)80942-c.
  43. Williams A. F., Barclay A. N. The immunoglobulin superfamily--domains for cell surface recognition. Annu Rev Immunol. 1988;6:381–405. doi: 10.1146/annurev.iy.06.040188.002121.