Complete genome sequence of the probiotic lactic acid bacterium Lactobacillus acidophilus NCFM

Proc Natl Acad Sci U S A. 2005 Mar 15;102(11):3906-12.

doi: 10.1073/pnas.0409188102. Epub 2005 Jan 25.

Eric Altermann, W Michael Russell, M Andrea Azcarate-Peril, Rodolphe Barrangou, B Logan Buck, Olivia McAuliffe, Nicole Souther, Alleson Dobson, Tri Duong, Michael Callanan, Sonja Lick, Alice Hamrick, Raul Cano, Todd R Klaenhammer

Eric Altermann et al. Proc Natl Acad Sci U S A. 2005.

Abstract

Lactobacillus acidophilus NCFM is a probiotic bacterium that has been produced commercially since 1972. The complete genome is 1,993,564 nt and devoid of plasmids. The average GC content is 34.71% with 1,864 predicted ORFs, of which 72.5% were functionally classified. Nine phage-related integrases were predicted, but no complete prophages were found. However, three unique regions designated as potential autonomous units (PAUs) were identified. These units resemble a unique structure and bear characteristics of both plasmids and phages. Analysis of the three PAUs revealed the presence of two R/M systems and a prophage maintenance system killer protein. A spacers interspersed direct repeats (SPIDR) locus containing 32 nearly perfect 29-bp repeats was discovered and may provide a unique molecular signature for this organism. In silico analyses predicted 17 transposase genes and a chromosomal locus for lactacin B, a class II bacteriocin. Several mucus- and fibronectin-binding proteins, implicated in adhesion to human intestinal cells, were also identified. Gene clusters for transport of a diverse group of carbohydrates, including fructooligosaccharides and raffinose, were present and often accompanied by transcriptional regulators of the lacI family. For protein degradation and peptide utilization, the organism encoded 20 putative peptidases, homologs for PrtP and PrtM, and two complete oligopeptide transport systems. Nine two-component regulatory systems were predicted, some associated with determinants implicated in bacteriocin production and acid tolerance. Collectively, these features within the genome sequence of L. acidophilus are likely to contribute to the organism's gastric survival and promote interactions with the intestinal mucosa and microbiota.

Figures

Fig. 1.

Genome atlas of L. acidophilus NCFM. The atlas represents a circular view of the complete genome sequence of L. acidophilus NCFM. The key lists the individual circles from outermost (top) to innermost (bottom). The atlas was created by using GENEWIZ (16) and in-house-developed software. Innermost circle 1 shows GC-skew. Circle 2 shows COG classification. Predicted ORFs were analyzed by using the COG database and grouped into the four major categories (1, information storage and processing; 2, cellular processes and signaling; 3, metabolism; 4, poorly characterized) plus a fifth group (5, ORFs with uncharacterized COGs or no COG assignment). Circle 3 shows ORF orientation. ORFs in sense orientation (ORF+) are shown in blue; ORFs oriented in antisense direction (ORF–) are shown in red. Circle 4 shows BLAST similarities. Deduced amino acid sequences were compared against the nonredundant (nr) database by using gapped BLASTP (12). Regions in blue represent unique proteins in NCFM, whereas highly conserved features are shown in red. The degree of color saturation corresponds to the level of similarity. Circle 5 shows G+C content deviation. Deviations from the average GC-content are shown in either green (low GC spike) or orange (high GC spike). A box filter was applied to visualize contiguous regions of low or high deviations. Circle 6 shows ribosomal machinery. tRNAs, rRNAs, and ribosomal proteins are shown as green, cyan, or red lines, respectively. Clusters are represented as colored boxes to maintain readability. Circle 7 shows mobile elements. Predicted transposases are shown as light purple dots, and phage-related integrases are shown as orange dots. Circle 8 shows peptide and amino acid utilization. Proteases and peptidases are shown in green, and nonsugar-related transporters are shown in light blue dots. Outermost circle 9 shows two-component regulators (2CRS). Each 2CRS, consisting of a response regulator and a histidine kinase, is represented as brown dots.
In circles 7–9, each full dot represents one predicted ORF, and clusters of ORFs are represented by stacked dots. Selected features representing single ORFs and ORF clusters are shown outside of circle 9 with bars indicating their absolute size. Origin and terminus of DNA replication are identified in green and red, respectively. Other features are: SlpA and -B (S-layer proteins), CdpA (Cell division protein; ref. 50), sugar utilization (sucrose, FOS, trehalose, raffinose), LacE (PTS-sugar transporter), BshA and -B (bile salt hydrolases), Mub-909 to Mub-1709 (mucus-binding proteins, numbers correspond to the La-number scheme), FbpA (fibronectin binding protein), Cfa (cyclopropane fatty acid synthase), Fibronectin_binding (fibronectin-binding protein cluster), EPS_cluster (exopolysaccharides), Lactacin_B (bacteriocin), pauLA-I to pauLA-III (potential autonomous units), and prLA-I and prLA-II (phage remnants).
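
The GC-skew and G+C deviation tracks described in the caption can be approximated with a short sliding-window calculation. The following is a minimal sketch, not the GENEWIZ implementation: it assumes the conventional skew definition (G − C)/(G + C), and the window size, step, and filter width are hypothetical parameters that the caption does not specify.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq: str) -> float:
    """Conventional GC skew, (G - C) / (G + C); 0.0 if no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windows(seq: str, size: int, step: int) -> List[str]:
    """Full-length windows of `size` bases taken every `step` bases."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def box_filter(values: List[float], width: int) -> List[float]:
    """Centered moving average; edge points average the neighbors available."""
    half = width // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def gc_deviation(seq: str, size: int, step: int, width: int) -> List[float]:
    """Per-window GC content minus the whole-sequence mean, box-filtered."""
    mean = gc_content(seq)
    raw = [gc_content(w) - mean for w in windows(seq, size, step)]
    return box_filter(raw, width)
```

Applied to a genome sequence, positive values from `gc_deviation` would correspond to the high-GC (orange) spikes and negative values to the low-GC (green) spikes; the smoothing merely makes contiguous regions easier to see.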

Fig. 2.

The SPIDR regions and the embedded sequence repeats were visualized by using REPUTER (15). The genome is represented by the green and purple horizontal lines. Repeats are shown by colored boxes on the genome, and the vertical lines indicate similar repeats. Sequence repeat lengths, in bp, are displayed by color coding as indicated by the colored boxes. The sequence 5′-AGGATCACCTCCACTTTCGTGGAGAAAAT-3′ was repeated 32 times in this 1,953-bp region.
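
As a rough check on these numbers, 32 repeats of 29 bp account for 32 × 29 = 928 bp, leaving 1,953 − 928 = 1,025 bp for spacers and any flanking sequence; if the region began and ended with a repeat, the 31 spacers would average about 33 bp. The sketch below shows one way such "nearly perfect" repeats could be located in a sequence. It is not REPUTER's algorithm: it is a simple scan that tolerates a few mismatches, and the mismatch threshold is an assumed parameter.

```python
from typing import List

# 29-bp repeat reported in the Fig. 2 caption.
REPEAT = "AGGATCACCTCCACTTTCGTGGAGAAAAT"


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_repeats(seq: str, repeat: str = REPEAT,
                 max_mismatches: int = 2) -> List[int]:
    """Return 0-based start positions of non-overlapping approximate hits.

    max_mismatches is an assumed tolerance for "nearly perfect" repeats.
    """
    seq = seq.upper()
    n = len(repeat)
    hits = []
    i = 0
    while i <= len(seq) - n:
        if hamming(seq[i:i + n], repeat) <= max_mismatches:
            hits.append(i)
            i += n  # jump past this hit so hits never overlap
        else:
            i += 1
    return hits


def spacer_lengths(hits: List[int],
                   repeat_len: int = len(REPEAT)) -> List[int]:
    """Lengths of the spacers between consecutive repeat hits."""
    return [b - (a + repeat_len) for a, b in zip(hits, hits[1:])]
```

Running `find_repeats` on the SPIDR region and passing the result to `spacer_lengths` would give the individual spacer sizes rather than only the average estimated above.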

Fig. 3.

PAUs. The genome is represented by a black line, and intervals are shown in bp. The three PAUs were aligned at the 5′ end of the integrase, which is shown as a black arrow, and are drawn to scale. The potential replication protein RepA is represented by white-dotted arrows, and the DNA segregation ATPase FtsK is represented by densely black-dotted arrows. ORFs with no predicted functions that are assumed to be part of a PAU are shown as sparsely black-dotted arrows. Checker-boarded ORFs indicate proteins potentially involved in unit stabilization. Adjacent ORFs are shown as open arrows. tRNAs are drawn as dark gray boxes on the genome line.

References

    1. Reid, G., Sanders, M. E., Gaskins, H. R., Gibson, G. R., Mercenier, A., Rastall, R., Roberfroid, M., Rowland, I., Cherbut, C. & Klaenhammer, T. R. (2003) J. Clin. Gastroenterol. 37, 105–118.
    2. Klaenhammer, T. R. & Russell, W. M. (2000) in Encyclopedia of Food Microbiology (Academic, Amsterdam), Vol. 2, pp. 1151–1157.
    3. Johnson, J. L., Phelps, C. F., Cummins, C. S., London, J. & Gasser, F. (1980) Int. J. Syst. Bacteriol. 30, 53–68.
    4. Heilig, H. G., Zoetendal, E. G., Vaughan, E. E., Marteau, P., Akkermans, A. D. & De Vos, W. M. (2002) Appl. Environ. Microbiol. 68, 114–123.
    5. Schell, M. A., Karmirantzou, M., Snel, B., Vilanova, D., Berger, B., Pessi, G., Zwahlen, M. C., Desiere, F., Bork, P., Delley, M., et al. (2002) Proc. Natl. Acad. Sci. USA 99, 14422–14427.
