Comparative genomic and physiological analysis provides insights into the role of Acidobacteria in organic carbon utilization in Arctic tundra soils

Abstract

Acidobacteria are among the most abundant bacterial phyla found in terrestrial ecosystems, but relatively little is known about their diversity, distribution and most critically, their function. Understanding the functional activities encoded in their genomes will provide insights into their ecological roles. Here we describe the genomes of three novel cold-adapted strains of subdivision 1 Acidobacteria. The genomes consist of a circular chromosome of 6.2 Mbp for Granulicella mallensis MP5ACTX8, 4.3 Mbp for Granulicella tundricola MP5ACTX9, and 5.0 Mbp for Terriglobus saanensis SP1PR4. In addition, G. tundricola has five mega plasmids for a total genome size of 5.5 Mbp. The three genomes showed an abundance of genes assigned to metabolism and transport of carbohydrates. In comparison to three mesophilic Acidobacteria, namely Acidobacterium capsulatum ATCC 51196, ‘Candidatus Koribacter versatilis’ Ellin345, and ‘Candidatus Solibacter usitatus’ Ellin6076, the genomes of the three tundra soil strains contained an abundance of conserved genes/gene clusters encoding for modules of the carbohydrate-active enzyme (CAZyme) family. Furthermore, a large number of glycoside hydrolases and glycosyl transferases were prevalent. We infer that gene content and biochemical mechanisms encoded in the genomes of three Arctic tundra soil Acidobacteria strains are shaped to allow for breakdown, utilization, and biosynthesis of diverse structural and storage polysaccharides and resilience to fluctuating temperatures and nutrient-deficient conditions in Arctic tundra soils.

Introduction

Acidobacteria represent one of the most abundant and ubiquitous bacterial phyla found in global soil environments (Barns et al., 1999; Janssen, 2006; Fierer et al., 2007; Jones et al., 2009; Kielak et al., 2009; Lauber et al., 2009; Lee & Cho, 2009; Eichorst et al., 2011), and they are widely distributed in Arctic and boreal soils (Goulden et al., 1998; Neufeld & Mohn, 2005; Dedysh et al., 2006; Männistö et al., 2007, 2009; Lee et al., 2008; Pankratov et al., 2008, 2011; Campbell et al., 2010; Chu et al., 2010). Nevertheless, relatively little is still known about their functional and ecological roles in these soils. Despite a large collection of Acidobacteria 16S rRNA gene sequences in databases representing diverse species from various habitats, only a few have been cultivated and described. Acidobacteria have been divided into up to 26 phylogenetic subdivisions based on 16S rRNA gene phylogeny (Barns et al., 2007), of which subdivisions 1, 3, 4, and 6 are most abundantly detected in soil environments (Jones et al., 2009). The phylogenetic diversity, ubiquity, and abundance of this group suggest that they play important ecological roles in soils. Acidobacteria are assumed to be genetically and metabolically diverse, as they inhabit a wide variety of natural environments over a range of temperature, salinity, organic matter, and pH (Jones et al., 2009; Faoro et al., 2010; Ganzert et al., 2011). The abundance of Acidobacteria correlates with soil pH, with subgroup 1 Acidobacteria being the most abundant in slightly acidic soils (Kishimoto et al., 1991; Männistö et al., 2007; Kleinsteuber et al., 2008; Jones et al., 2009; Lauber et al., 2009; Chu et al., 2010). An increasing number of Acidobacteria have recently been cultivated and described (for references see Männistö et al., 2012). Nonetheless, the paucity of well-characterized Acidobacteria hampers our understanding of the physiology and ecological function of these organisms, as well as how they will respond and adapt to environmental change in these soil environments.

Arctic and boreal environments cover over 20% of the terrestrial surface and harbor about one-third of the total global soil carbon pool (Loya & Grogan, 2004). However, little is known about the microbial communities that utilize this large carbon pool, their activity, and community dynamics, despite their critical role in carbon mineralization and potential impact on atmospheric CO2 and future climate change. Acidobacteria have been reported to dominate soils rich in soil organic matter and are involved in microbial degradation of lignocellulosic plant biomass (Eichorst et al., 2011; Pankratov et al., 2011). Using combined molecular- and cultivation-based approaches, we have demonstrated that members of subdivision 1 Acidobacteria are a dominant bacterial group that is active at low temperatures and resilient to multiple freeze–thaw cycles in acidic tundra soils of northern Fennoscandia. In addition, Acidobacteria can comprise more than 50% of sequences in clone libraries (Männistö et al., 2007, 2009). A concerted effort led to the cultivation of several new slow-growing and fastidious cold-adapted Acidobacteria belonging to the genera Terriglobus and Granulicella (Männistö et al., 2011, 2012). It appears that soils naturally exposed to harsh and changing environmental conditions may harbor frost-tolerant and resilient bacterial species. We hypothesize that these conditions have selected a stable bacterial community dominated by Acidobacteria that is only minimally affected by temperature fluctuation and freeze–thaw cycles.

Here, we report on the analysis of the genomes of three novel cold-adapted strains of subdivision 1 Acidobacteria, Granulicella mallensis strain MP5ACTX8, Granulicella tundricola strain MP5ACTX9, and Terriglobus saanensis strain SP1PR4, isolated from Arctic tundra soils (Fig. 1). Our genomic analysis of the three tundra soil strains is supported by physiological characterization to assess the mechanisms promoting their activity, dominance, and survival in these soil environments. These strains are compared with three other Acidobacteria strains for which finished genomes are available (Ward et al., 2009; Challacombe et al., 2011), namely Acidobacterium capsulatum ATCC 51196, isolated from acid mine drainage in Japan (Kishimoto et al., 1991), and two other soil strains, ‘Candidatus Koribacter versatilis’ Ellin345 and ‘Candidatus Solibacter usitatus’ Ellin6076, both isolated from soils of rye grass/clover pasture in Australia (Joseph et al., 2003; Davis et al., 2005). Our study provides genomic insights into the ecology of these Acidobacteria communities and their role in the turnover of soil organic carbon in Arctic and boreal environments.

Phylogenetic tree based on 16S rRNA gene sequences showing the relationships of the three tundra soil strains and other cultured strains of the phylum Acidobacteria. Bootstrap values (expressed as percentages of 1000 replicates) of > 50% are shown at branch points. The evolutionary history was inferred using the maximum likelihood method based on the Tamura–Nei model from a total of 1167 unambiguously aligned nucleotide positions in the final data set. Two members of the phylum Planctomycetes, Singulisphaera acidiphila ATCC BAA-1392T (AM850678) and Isosphaera pallida strain 563 (AJ231193), were used as the outgroup (not shown). Strains with finished genomes are indicated in bold. Accession numbers are in parentheses. Bar: 0.05 substitutions per nucleotide position.

Materials and methods

Habitat and strains of Acidobacteria

Granulicella mallensis MP5ACTX8T (= DSM 23137 = ATCC BAA-1857), G. tundricola MP5ACTX9T (= DSM 23138 = ATCC BAA-1859), and T. saanensis SP1PR4T (= DSM 23119 = ATCC BAA-1853) were isolated from Arctic tundra heaths located in northern Finland (Männistö et al., 2011, 2012). All strains originated from the organic layer of soil samples collected from oligotrophic wind-swept hills that experience large annual temperature variation and frequent freeze–thaw cycles in the autumn and spring. Vegetation in these sites is dominated by dwarf shrubs of the Ericaceae family, which produce acidic organic matter with a high C/N ratio (Eskelinen et al., 2009). The soil organic matter content is high (c. 30–50%), and the soil is acidic (pH 4.8–5.2). Strains were cultivated from soil samples using R2A agar (SP1PR4) or a mixture of carboxymethyl cellulose (CMC), starch, and xylan as carbon sources (MP5ACTX8 and MP5ACTX9). Once obtained as pure cultures, the strains were maintained and grown on R2 agar or broth (adjusted to pH 5.5, Difco) and stored at −70 °C in 20% glycerol.

Growth of strains and DNA extraction

The tundra soil strains were grown aerobically on half-strength R2A medium, pH 5.5, at 20 °C. Genomic DNA of high sequencing quality was isolated using a hexadecyltrimethylammonium bromide (CTAB) method (Doyle & Doyle, 1990) modified for genomic DNA extraction from bacterial cells. Cells (OD600 nm of not more than 1.2) were lysed with 2% SDS and 250 μg mL−1 proteinase K and incubated at 37 °C for 1 h. Then, CTAB buffer (1% CTAB, 0.75 M NaCl, 50 mM Tris pH 8, 10 mM EDTA) was added, and the suspension was incubated at 65 °C for 10 min. The suspension was extracted once with chloroform/isoamyl alcohol (24 : 1) and then with phenol/chloroform/isoamyl alcohol (24 : 1). Finally, the DNA was precipitated from the supernatant with 0.6 vol isopropanol (−20 °C) at room temperature for 30 min. Genomic DNA was pelleted, washed with 70% ethanol, and dried. The pellet was resuspended in TE (10 mM Tris, 1 mM EDTA) buffer containing 1 μL RNAse (10 mg mL−1) and incubated at 37 °C for 20 min. The DNA was evaluated according to the quality control guidelines provided by the DOE Joint Genome Institute (DOE-JGI).

Genome sequencing and assembly

Finished genomes for strains G. mallensis MP5ACTX8 (JGI ID 4088692), G. tundricola MP5ACTX9 (JGI ID 4088693), and T. saanensis SP1PR4 (JGI ID 4088690) were generated at the DOE Joint Genome Institute using a combination of Illumina (Bennett, 2004) and 454 technologies (Margulies et al., 2005). Three libraries were constructed: an Illumina GAii shotgun library, a 454 Titanium standard library, and a paired-end 454 library. All general aspects of library construction and sequencing performed at JGI can be found at http://www.jgi.doe.gov/. The 454 Titanium standard data and the 454 paired-end data were assembled together with Newbler, version 2.3. The Newbler consensus sequences were computationally shredded into 2 kb overlapping fake reads (shreds). Illumina sequencing data were assembled with velvet, version 0.7.63 (Zerbino & Birney, 2008), and the consensus sequences were computationally shredded into 1.5 kb overlapping fake reads (shreds). The 454 Newbler consensus shreds, the Illumina velvet consensus shreds, and the read pairs in the 454 paired-end library were integrated using parallel phrap, version sps – 4.24 (High Performance Software, LLC). The software Consed (Ewing et al., 1998; Gordon et al., 1998) was used in the subsequent finishing process. Illumina data were used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI (Alla Lapidus, unpublished data). Possible misassemblies were corrected using gapResolution (Cliff Han, unpublished data), Dupfinisher (Han & Chain, 2006), or sequencing cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR, and by Bubble PCR (J-F Cheng, unpublished data) primer walks. A total of 291, 153, and 28 additional PCRs, and 5, 6, and 0 shatter libraries were necessary to close gaps and to raise the quality of the finished genomes of G. mallensis MP5ACTX8, G. tundricola MP5ACTX9, and T. saanensis SP1PR4, respectively. A combined depth of coverage of 231×, 294×, and 219× was achieved for the genomes of G. mallensis MP5ACTX8, G. tundricola MP5ACTX9, and T. saanensis SP1PR4, respectively.

Sequence analysis, annotation, and bioinformatics

Gene prediction and/or functional annotation of genomes were retrieved through the Integrated Microbial Genome (IMG) system supported by DOE-JGI Microbial Annotation Pipeline (DOE-JGI MAP). Genes were identified using Prodigal as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline (Pati et al., 2010). The coding sequences (CDS) were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COGs, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Noncoding genes and miscellaneous features were predicted using tRNAscan-SE, RNAMMer, Rfam, TMHMM, and signalP.

Comparative genome analysis was carried out in part using the Integrated Microbial Genome (IMG) portal. The DNA and protein sequences were retrieved from web-based databases (e.g., NCBI Query and blast searches). To facilitate the predictive analysis of the open reading frames of the sequenced genomes, relevant information was extracted from databases such as clusters of orthologous genes (COGs) (http://www.ncbi.nlm.nih.gov/COG) (Tatusov et al., 1997) and CAZy (http://www.cazy.org) (Cantarel et al., 2009). Phylogenetic and molecular evolutionary analyses were conducted in mega version 5 (Tamura et al., 2011) on DNA or protein sequences aligned with clustalw and muscle.

Homology-derived secondary structure of proteins (HSSP) distances (Sander & Schneider, 1991; Rost, 2002) between protein sequences were computed as follows: (1) psi-blast (Altschul et al., 1997) all-against-all the sequences in the six genomes (three iterations, inclusion e-value (h and e) parameters = 10^−10); (2) extract sequence identity and alignment length without gaps for each alignment; (3) apply the HSSP formula described in Rost (2002) to compute the distance. The HSSP distance represents the distance of a particular sequence alignment from a homology threshold curve, which is a function of alignment length and percent sequence identity. In other words, alignments are mapped to a two-dimensional space where points above the HSSP curve represent pairs of functionally similar proteins. The distance of one such point to the curve is correlated with the reliability of function transfer; that is, the percentage of functionally similar sequence pairs at HSSP distance above 30 is higher than at HSSP distance above 0. The similarity of genome A to genome B is computed as the ratio of the number of genes in A that have homologues in B to the total number of genes in A (at a given HSSP cutoff). At each HSSP cutoff (range −5 to 45), we computed the numbers of genes overlapping between all different combinations of genomes. We mapped the percentage of genes in each genome common to (1) only the three tundra genomes vs. (2) all six genomes. To compute the ‘random’ baseline for this curve, we randomly selected from HAMAP (Lima et al., 2009) six genomes of fully sequenced bacteria with similar genome size (4–5 K genes): Azotobacter vinelandii, Escherichia coli (strain K12), Methylacidiphilum infernorum (isolate V4), Novosphingobium aromaticivorans (strain DSM 12444), Rhodopseudomonas palustris (strain ATCC BAA-98/CGA009), and Vibrio fischeri (strain MJ11). Of these six, we randomly chose E. coli, A. vinelandii, and V. fischeri to be the reference outliers.
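The distance and similarity calculations described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The threshold curve uses the commonly cited form of the HSSP curve and its constants should be checked against Rost (2002); the function names, the toy alignments, and the choice of counting a gene as a hit when its distance is greater than or equal to the cutoff are all assumptions made for illustration.

```python
import math


def hssp_threshold(length: int) -> float:
    """Percent identity on the HSSP curve for an ungapped alignment length.

    Assumed form of the curve (to be verified against Rost, 2002):
    100 for L <= 11, 19.5 for L > 450, and
    480 * L ** (-0.32 * (1 + exp(-L / 1000))) in between.
    """
    if length <= 11:
        return 100.0
    if length > 450:
        return 19.5
    return 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))


def hssp_distance(percent_identity: float, length: int) -> float:
    """Signed distance of an alignment from the curve (positive = above it)."""
    return percent_identity - hssp_threshold(length)


def genome_similarity(best_distances, n_genes_a: int, cutoff: float) -> float:
    """Fraction of genes in genome A with a homologue in genome B.

    best_distances holds one value per gene of A that aligned to B:
    the best HSSP distance among its alignments. Genes of A without any
    alignment are simply absent from the list.
    """
    if n_genes_a <= 0:
        raise ValueError("genome A must contain at least one gene")
    hits = sum(1 for d in best_distances if d >= cutoff)
    return hits / n_genes_a


# Hypothetical genome A with four genes, three of which align to genome B
# over more than 450 ungapped positions.
toy_distances = [
    hssp_distance(60.0, 500),
    hssp_distance(25.0, 500),
    hssp_distance(45.0, 500),
]
similarity_at_0 = genome_similarity(toy_distances, n_genes_a=4, cutoff=0)
similarity_at_22 = genome_similarity(toy_distances, n_genes_a=4, cutoff=22)
```

Because the measure divides by the gene count of genome A, it is directional: the similarity of A to B generally differs from that of B to A when the two genomes differ in size.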

Assays for substrate and enzyme activities

Carbon source utilization and enzymatic activities were tested and reported earlier (Männistö et al., 2011, 2012): utilization of sugars was detected by growth on 96-well plates, and hydrolysis of various polysaccharides was assayed as CO2 production. Here, we assayed the utilization of the soluble polysaccharides alginate, CMC, laminarin, lichenan, pectin, pullulan, and starch on 96-well plates with VL55 mineral medium (pH 5.5; Sait et al., 2002) supplemented with yeast extract (100 mg L−1) and 1–2 g L−1 of the polysaccharide. CMC hydrolysis was tested for colonies spotted on replicate diagnostic plates with 0.5 and 1% (w/v) of CMC sodium salt as sole source of carbon in VL55 medium containing yeast extract (100 mg L−1) and grown for 2 weeks. Formation of a zone of clearance around colonies after staining with iodine was used as a preliminary indication of enzyme activity (Kasana et al., 2008). Hydrolysis of CMC and xylan was further assayed on plates containing 0.25 g L−1 of peptone and yeast extract, 20 g L−1 of agar, and 2 or 5 g L−1 of each polysaccharide in VL55 buffer (pH 5.5). The plates were incubated at 20 °C for up to 4 weeks, and the polysaccharide hydrolysis was detected by flooding the plates with 0.1% Congo red for 15 min (Teather & Wood, 1982).

Utilization of chitin by the strains was assayed using chitin azure (chitin from crab shells covalently linked with Remazol Brilliant Violet 5R (RBV) dye; Sigma) and by a chitobiase assay as described by O'Brien & Colwell (1987). For the chitin azure assay, 0.5 mL of substrate (0.5 g L−1 in VL55 buffer) was mixed with 0.5 mL of culture grown for 5 days on glucose (1 g L−1) and yeast extract (0.5 g L−1). Chitinase activity was detected after 5 h, 20 h, 42 h, and 10 days of incubation at room temperature by measuring the absorbance at 550 nm. Chitobiase activity was assayed by the filter paper spot test using 4-methylumbelliferyl-_N_-acetyl-β-d-glucosaminide. Biomass for the assay was grown on R2A (pH 5.5) for 5 days, and a loopful (ca. 0.2 mL) of the colonies was rubbed onto antibiotic disks. Control disks without substrate and without inoculum were included, and 20 μL of the substrate solution was pipetted onto each disk. Chitobiase activity was detected after 20–30 min of incubation at room temperature by exposure to UV light.

Genome submissions

The finished genomes have been submitted to NCBI with the following accession numbers/Taxon IDs: G. mallensis MP5ACTX8 (CP003130/682795), G. tundricola MP5ACTX9 (CP002480/696844), G. tundricola MP5ACTX9 plasmids (CP002481, CP002482, CP002483, CP002484, and CP002485), and T. saanensis SP1PR4 (CP002467/401053).

Results and discussion

General genome features and metadata

The genomes of the three tundra soil strains, G. mallensis, G. tundricola, and T. saanensis, each consist of one circular chromosome of 6.2, 4.3, and 5.0 Mbp, respectively (Fig. 2). In addition, G. tundricola has five mega plasmids ranging in size from 1.1 × 10^5 to 4.7 × 10^5 bp (Fig. S1) for a total genome size of 5.5 Mbp. Among the five strains of subdivision 1 Acidobacteria, G. mallensis has the largest genome size of 6.2 Mbp. The general genome and physiological features of the three tundra soil strains are compared with those of A. capsulatum, ‘_K. versatilis_’, and ‘_S. usitatus_’ and shown in Table 1. The genome GC content for the three tundra strains ranges from 57% to 60%. The genomes of G. mallensis, G. tundricola, and T. saanensis are estimated to encode 4907, 4706, and 4279 protein CDSs, respectively, with ~70% of the genes having predicted functions (68–72% COGs) (Table 1). The genomes of G. mallensis and T. saanensis contain a large percentage of CDSs with signal peptides (43–44%), suggesting that a higher number of genes are involved in transport/translocation processes in these two tundra soil strains as compared to the other subdivision 1 Acidobacteria. The three tundra soil strains contained a large number of sequences in paralogous clusters (50–54%). In comparison, the large genome (9.9 Mbp) of ‘_S. usitatus_’, belonging to subdivision 3 Acidobacteria, has an increased number of paralogs, which has been accounted for by gene acquisition and horizontal gene transfer events (Challacombe et al., 2011). We identified mobile genetic elements in all three tundra soil strains, with a large number in the genome of G. tundricola (n = 154), encoding phage integrases, transposases, and IS elements (Table S1). Granulicella tundricola also contained five mega plasmids (> 100 kb in size). We did not find any CDSs for clustered, regularly interspaced, short palindromic repeats (CRISPRs) in the genome of any of the three tundra soil strains; however, a CDS for a CRISPR-associated protein (AciX8_2932) of the Cas5 family was present in the genome of G. mallensis. This suggests that the genomes of the tundra soil strains are subject to gene transfer events.
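As a quick consistency check on the reported sizes for G. tundricola, the short Python sketch below subtracts the five plasmid sizes listed in the Table 1 footnote from the total genome size in Table 1. The variable names are illustrative, and because the plasmid sizes are rounded values, the result is only approximate.

```python
# Values taken from Table 1 and its plasmid footnote (G. tundricola MP5ACTX9).
total_genome_bp = 5_503_984
plasmid_sizes_mbp = {
    "pACIX901": 0.48,
    "pACIX902": 0.30,
    "pACIX903": 0.19,
    "pACIX904": 0.12,
    "pACIX905": 0.12,
}

# Sum of the (rounded) plasmid sizes, in Mbp.
plasmid_total_mbp = sum(plasmid_sizes_mbp.values())

# Chromosome size implied by total genome size minus plasmids, in Mbp.
chromosome_mbp = total_genome_bp / 1e6 - plasmid_total_mbp

print(f"plasmids: {plasmid_total_mbp:.2f} Mbp, chromosome: {chromosome_mbp:.2f} Mbp")
```

The implied chromosome size of roughly 4.3 Mbp agrees with the value stated for the circular chromosome.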

Circular representation of the genomes of Granulicella mallensis MP5ACTX8, Granulicella tundricola MP5ACTX9, and Terriglobus saanensis SP1PR4 displaying relevant genome features. The outermost circles (circles 1 and 2) show the forward and reverse strand of protein coding genes (CDSs) colored by COGs functional categories; circle 3 shows RNA genes (tRNAs in green, rRNAs in red, other RNAs in black); circle 4 shows CDSs encoding for CAZyme families; circle 5 shows GC plot (G+C distribution); and circle 6 shows GC skew.

Comparison of general genome and physiological features of six Acidobacteria species

| Feature | G. mallensis MP5ACTX8 | G. tundricola MP5ACTX9 | T. saanensis SP1PR4 | A. capsulatum ATCC 51196 | ‘_K. versatilis_’ Ellin345 | ‘_S. usitatus_’ Ellin6076 |
| --- | --- | --- | --- | --- | --- | --- |
| Genome data | | | | | | |
| Genome size (bp) | 6 237 577 | 5 503 984 | 5 095 226 | 4 127 356 | 5 650 368 | 9 965 640 |
| DNA coding region (bp) | 5 452 093 | 4 762 045 | 4 578 206 | 3 552 031 | 5 056 593 | 9 017 234 |
| G+C content (mol%) | 57.9 | 60.0 | 57.3 | 60.5 | 58.4 | 61.9 |
| Total RNA genes | 53 | 52 | 54 | 48 | 58 | 63 |
| tRNA genes | 47 | 46 | 48 | 45 | 47 | 52 |
| rRNA genes | 3 | 3 | 3 | 3 | 3 | 6 |
| Other RNA genes | 3 | 3 | 3 | | 8 | 5 |
| Total number of genes | 4960 | 4757 | 4333 | 3425 | 4837 | 8003 |
| Pseudogenes | 90 | 163 | 99 | | 2 | 114 |
| Total protein CDSs (%) | 4907 (98.9) | 4706 (98.9) | 4279 (98.8) | 3377 (98.6) | 4779 (98.8) | 7940 (99.2) |
| With function prediction (%) | 3511 (70.8) | 3313 (69.6) | 2890 (66.7) | 2248 (65.6) | 3080 (63.7) | 4898 (61.2) |
| Without function prediction (%) | 1396 (28.2) | 1393 (29.3) | 1389 (32.0) | 1129 (33) | 1699 (35.1) | 3042 (38.0) |
| With COGs (%) | 3496 (70.5) | 3276 (68.8) | 3152 (72.7) | 2294 (67) | 3167 (65.5) | 4997 (62.4) |
| With TIGRfam (%) | 1210 (24.4) | 1154 (24.2) | 1176 (27.1) | 1036 (30.2) | 1271 (26.3) | 1882 (23.5) |
| In paralogous clusters (%) | 2679 (54.0) | 2412 (50.7) | 2197 (50.7) | 1579 (46.1) | 2995 (61.9) | 5963 (74.5) |
| Coding for signal peptides (%) | 2203 (44.4) | 1307 (27.5) | 1865 (43.0) | 831 (24.3) | 1425 (29.5) | 2858 (35.7) |
| Coding for transmembrane proteins (%) | 1291 (26) | 1106 (23.3) | 1082 (24.9) | 842 (24.6) | 1284 (26.6) | 1779 (22.2) |
| Metadata | | | | | | |
| Isolation source (habitat) | Tundra soil, Finland | Tundra soil, Finland | Tundra soil, Finland | Acidic mine drainage, Japan | Ryegrass soil pasture, Australia | Ryegrass soil pasture, Australia |
| Acidobacteria subdivision (SD) | 1 | 1 | 1 | 1 | 1 | 3 |
| Genus | Granulicella | Granulicella | Terriglobus | Acidobacterium | Koribacter | Solibacter |
| Temperature range (°C) | 4–28 | 4–28 | 4–30 | 20–37 | Mesophile | Mesophile |
| pH range | 3.5–6.5 | 3.5–6.5 | 4.5–7.5 | 3–6.0 | Acidophile | Acidophile |

Plasmids in G. tundricola MP5ACTX9, pACIX901, 0.48 Mbp; pACIX902, 0.3 Mbp; pACIX903, 0.19 Mbp; pACIX904, 0.12 Mbp; pACIX905, 0.12 Mbp.

Data from Männistö et al. (2012).

Data from Männistö et al. (2011).

Data from Kishimoto et al. (1991).

Data from Ward et al. (2009).

The physiology of the three tundra soil strains, G. mallensis, G. tundricola, and T. saanensis, has been described in detail (Männistö et al., 2011, 2012), and comparative data with strains A. capsulatum, ‘_K. versatilis_’, and ‘_S. usitatus_’ are presented in Table 1. The tundra soil strains are cold-adapted (grow at temperatures from +4 to 28 °C) compared with the mesophilic strains A. capsulatum, ‘_K. versatilis_’, and ‘_S. usitatus_’ (Kishimoto et al., 1991; Ward et al., 2009; Challacombe et al., 2011) isolated from temperate environments (Table 1). All six Acidobacteria strains are acidophiles. The two Granulicella strains are able to grow at an acidic pH range (pH 3.5–6.5), while T. saanensis grows at a pH range of 4.5–7.5.

General genome comparisons

Organisms occupying the same environmental niche are intuitively more functionally related to each other than to outsiders. Here, we assessed organism functional similarity using whole-genome homology. Comparisons of the six Acidobacteria genomes were made by all-against-all sequence comparisons at the level of protein CDS, using HSSP distance as a measure of similarity (Rost, 2002). In short, the similarity of genome A to genome B is computed as the ratio of the number of genes in A that have homologues in B to the total number of genes in A (at a given HSSP cutoff). At each cutoff (range −5 to 45), we computed the similarity of all six genomes in our study. For our analysis, we assume that a higher number of functionally similar proteins indicates overall organism similarity. Intuitively, higher stringency in assigning homology decreases the number of homologous genes; that is, with increasing HSSP cutoffs, there is a decrease in the percentage of genes common to all six genomes (Fig. 3a, _X_-axis). However, highly similar genes are still assigned homology even at high stringency thresholds, as illustrated by the increase in the percentage of genes common to only the three (similar) tundra genomes (Fig. 3a, _Y_-axis; HSSP −5 to HSSP 34). Thus, increasing the stringency of the HSSP cutoff from −5 to 22 (Fig. 3a; beginning of the curve plateau) progressively groups the tundra genomes closer to each other (increasing along the _Y_-axis) and away from the three other strains of Acidobacteria (decreasing along the _X_-axis). Cutoffs stricter than 22 remove homologues from both sets, with the tundra overlap set being replenished by the nontundra ‘drop-outs’ until HSSP = 34. Beyond this threshold, fewer and fewer genes find homologues in any of the genomes.
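The ‘drop-out’ behaviour described above can be reproduced on toy data with the following Python sketch. It is an illustration under stated assumptions rather than the analysis code of this study: the input structure `best_hits`, the function name `overlap_percentages`, the gene identifiers, and the distances are all hypothetical, and a gene is assumed to have a homologue in a genome when its best HSSP distance to that genome is greater than or equal to the cutoff.

```python
TUNDRA = {"G. mallensis", "G. tundricola", "T. saanensis"}
OTHERS = {"A. capsulatum", "K. versatilis", "S. usitatus"}


def overlap_percentages(best_hits, query_genome, cutoff):
    """Return (% tundra-only genes, % genes common to all six genomes).

    best_hits maps each gene of the query tundra genome to a dict of
    {target genome: best HSSP distance}; genomes without any alignment
    are absent from the inner dict.
    """
    other_tundra = TUNDRA - {query_genome}
    tundra_only = 0
    all_six = 0
    for hits in best_hits.values():
        found = {genome for genome, dist in hits.items() if dist >= cutoff}
        if other_tundra <= found:
            if OTHERS <= found:
                all_six += 1          # homologues in every other genome
            elif not (OTHERS & found):
                tundra_only += 1      # homologues in tundra genomes only
    n_genes = len(best_hits)
    return 100.0 * tundra_only / n_genes, 100.0 * all_six / n_genes


# Hypothetical best HSSP distances for four genes of G. mallensis.
best_hits = {
    "geneA": {"G. tundricola": 40, "T. saanensis": 38,
              "A. capsulatum": 25, "K. versatilis": 24, "S. usitatus": 23},
    "geneB": {"G. tundricola": 30, "T. saanensis": 28, "A. capsulatum": 5},
    "geneC": {"G. tundricola": 10},
    "geneD": {},
}

for cutoff in (0, 22, 26, 35):
    tundra_pct, shared_pct = overlap_percentages(best_hits, "G. mallensis", cutoff)
    print(cutoff, tundra_pct, shared_pct)
```

In this toy example, geneA moves from the all-six set to the tundra-only set once the cutoff exceeds the distances of its nontundra homologues, which is the replenishing effect described for cutoffs between 22 and 34; at the strictest cutoff, geneB loses its tundra homologues as well and both percentages fall.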

Comparison of protein CDSs encoded in the genomes of six strains of Acidobacteria. Functional similarity was assessed using whole-genome homology comparing the three tundra soil strains, Granulicella mallensis, Granulicella tundricola, and Terriglobus saanensis, to three other strains of Acidobacteria, Acidobacterium capsulatum, ‘K. versatilis’, and ‘S. usitatus’. (a) At each cutoff (range −5 to 45), we computed the similarity of all six genomes. For each HSSP cutoff, the percentage of genes in each tundra genome common only to the three tundra genomes vs. to all six genomes was mapped. The ‘random’ baseline for this curve (gray line) is represented by Azotobacter vinelandii (see Materials and methods and Fig. S2 for full ‘random’ graph). With progressively stricter cutoffs (−5 to 22), the number of genes common to all genomes decreases in favor of the overlap set of only the tundra genomes, that is, for a given tundra gene X, homologues in nontundra genomes are less similar to X than its tundra homologues, thus ‘dropping out’ of the overlap set earlier. (b) Venn diagram showing the number of genes in all possible genome overlap sets at HSSP = 22. Each genome is represented with the corresponding total number of CDSs. Numbers in intersections indicate the number of shared homologues between two or three genomes. Homologues shared only by three tundra soil strains and not found in the genomes of the other three Acidobacteria strains are shown in the triangle. Numbers outside the intersections within each circle represent number of genes specific to each genome.

The gene pool shared by the genomes of the three tundra soil strains G. mallensis, G. tundricola, and T. saanensis at HSSP = 22 (beginning of curve plateau, Fig. 3a) is depicted by a Venn diagram (Fig. 3b; numbers in each intersection indicate shared CDSs). Some genes were specific to one strain only: 1679 CDSs in G. mallensis, 1640 CDSs in G. tundricola, and 1249 CDSs in T. saanensis. Between 1900 and 1966 CDSs were shared by all tundra genomes, with more than half of these genes (1551–1586 CDSs; data not shown) shared by all six Acidobacteria strains (the three tundra soil strains and A. capsulatum, ‘_K. versatilis_’, and ‘_S. usitatus_’). Further analysis identified a gene pool shared only by the three tundra Acidobacteria (and not identified in the three other strains) consisting of 380 CDSs in G. mallensis, 370 CDSs in G. tundricola (including plasmids: 21 genes in pACIX902, 20 genes in pACIX901, 11 genes in pACIX903, four genes in pACIX904, and two genes in pACIX905), and 340 CDSs in T. saanensis (Fig. 3b, box). This gene pool was assigned to 261 COG and 273 pfam functions (Table S2), while 47 genes had no assigned function. Many of the CDSs in this gene pool were assigned via COG annotations to functions of metabolism and transport of carbohydrates (Table S2). These included glycoside hydrolases (GHs) of family GH1 (pfam00232), GH2 (pfam00703, pfam00754, pfam02836), GH20 (pfam00728), GH28 (pfam00295), GH31 (pfam01055), GH57 (pfam03065, pfam09210), GH88 (pfam07470), and GH92 (pfam07971), alginate lyase of polysaccharide lyase family PL5 (pfam05426), glycosyl transferases (GTs) of family GT1 (pfam00534), GT2 (pfam00535), and GT9 (pfam01075), and transporters of the Caenorhabditis elegans ORF (CEO) family (DUF1632)/sugar transport protein (pfam06800) and major facilitator superfamily (MFS) (pfam00083, pfam07690).
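The overlap sets counted above (strain-specific, shared by all six genomes, shared only by the tundra genomes) can be sketched as a per-gene classification. This is a minimal illustration under stated assumptions: the genome labels, the `homologue_genomes` mapping, and the category names are hypothetical, and the per-gene lists of genomes with homologues are assumed to come from a prior homology search at a fixed cutoff.

```python
from collections import Counter
from typing import Dict, Set

# Hypothetical short labels for the six genomes.
TUNDRA = frozenset({"Gm", "Gt", "Ts"})
OTHERS = frozenset({"Ac", "Kv", "Su"})


def classify_genes(own: str, homologue_genomes: Dict[str, Set[str]]) -> Counter:
    """Count genes of one tundra genome by the genomes that hold homologues.

    `homologue_genomes` maps each gene of genome `own` to the set of
    other genomes in which a homologue was found at the chosen cutoff.
    """
    other_tundra = TUNDRA - {own}
    counts: Counter = Counter()
    for found_in in homologue_genomes.values():
        if not found_in:
            counts["strain_specific"] += 1
        elif other_tundra <= found_in and OTHERS <= found_in:
            counts["all_six"] += 1
        elif other_tundra <= found_in and not (found_in & OTHERS):
            counts["tundra_only"] += 1
        else:
            counts["other_overlap"] += 1
    return counts


# Toy data (illustrative only) for five genes of genome "Gm".
toy = {
    "g1": {"Gt", "Ts", "Ac", "Kv", "Su"},
    "g2": {"Gt", "Ts"},
    "g3": set(),
    "g4": {"Gt", "Ac"},
    "g5": {"Gt", "Ts", "Ac"},
}
toy_counts = classify_genes("Gm", toy)
print(dict(toy_counts))
```

Because each genome is classified from its own gene list, the tundra-only count can differ between genomes, as it does for the 380, 370, and 340 CDSs reported here.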

Functional diversity in Acidobacteria genomes

For all six strains of Acidobacteria, predicted genes were assigned to four main functional categories within the Cluster of Orthologous Groups (COG) database (Tatusov et al., 1997): metabolism (Me), cellular processes (Cp), information storage and processing (Isp), and poorly characterized (Pc), as shown in Fig. 4. For metabolism (Me), the highest percentage of genes was assigned to carbohydrate transport and metabolism [G] (9–10%), with the highest abundance among all six Acidobacteria strains in the genomes of G. mallensis and T. saanensis. This was followed by amino acid transport and metabolism [E] (7–8%), energy production and conversion [C] (5–6%), and lipid transport and metabolism [I] (3–4%). For cellular processes (Cp), the majority of genes were assigned to cell wall/membrane/envelope biogenesis [M] (8–9%), followed by signal transduction mechanisms [T] (4–5%), and for information storage and processing (Isp), to transcription [K] (7–9%). We infer that the genomes of the three tundra soil strains encode functions involved in transport and utilization of nutrients, mainly carbohydrates, for energy production and cell biogenesis to maintain cell integrity in cold tundra soils.
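The percentages per COG category quoted above amount to a simple tally over gene annotations. The sketch below shows one way to compute them, assuming each gene carries a string of one-letter COG category codes (a gene with several letters is counted once per category); the input mapping and function name are hypothetical.

```python
from collections import Counter
from typing import Dict


def cog_percentages(gene_to_cog: Dict[str, str], total_cds: int) -> Dict[str, float]:
    """Percentage of all CDSs assigned to each one-letter COG category.

    Genes with an empty string have no COG assignment and contribute
    to the denominator only.
    """
    counts: Counter = Counter()
    for letters in gene_to_cog.values():
        for letter in set(letters):
            counts[letter] += 1
    return {letter: 100.0 * n / total_cds for letter, n in counts.items()}


# Toy data (illustrative only): five CDSs, one without a COG assignment.
toy_annotations = {"g1": "G", "g2": "G", "g3": "E", "g4": "GM", "g5": ""}
toy_percentages = cog_percentages(toy_annotations, total_cds=len(toy_annotations))
print(toy_percentages)
```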

Comparison of gene content by COG functional categories grouped by four major categories: metabolism (Me), cellular processes (Cp), information storage and processing (Isp), and poorly characterized (Pc) in the genomes of six strains of Acidobacteria.

Carbohydrate transport and metabolism

To explore the genetic potential of the three tundra soil strains to metabolize organic carbon, we analyzed their genomes for CDSs predicted to code for modules of the carbohydrate-active enzymes (CAZy) family that catalyze the breakdown, biosynthesis, or modification of carbohydrates (http://www.cazy.org; Cantarel et al., 2009). CDSs predicted to encode CAZymes were more abundant in the genomes of the three tundra soil strains, G. mallensis (n = 321), G. tundricola (n = 215), and T. saanensis (n = 244), than in the genomes of two other strains of subdivision 1, A. capsulatum (n = 161) and ‘_K. versatilis_’ (n = 135) (Fig. 5). The genomes of G. mallensis, G. tundricola, and T. saanensis contained gene modules spanning the four major CAZyme superfamilies, namely glycoside hydrolases (GHs) (n = 166, 103, 110, respectively), glycosyl transferases (GTs) (n = 77, 74, 90), polysaccharide lyases (PLs) (n = 9, 4, 4), and carbohydrate esterases (CEs) (n = 16, 15, 16), as well as noncatalytic carbohydrate-binding modules (CBMs) (n = 53, 19, 24). This indicates that the tundra soil strains are rich in genes encoding functional activities required for rearrangement of oligo- and polysaccharides. Predicted gene modules encompassed 59 different families of GHs, 21 of GTs, seven of PLs, nine of CEs, and 12 of CBMs (Table S3), emphasizing the elaborate set of enzymes needed for breakdown of different types of plant and/or microbial polysaccharides as well as for the biosynthesis of various polysaccharides. The three tundra Acidobacteria strains also contained a large number of predicted CDSs encoding sugar transporters of the major facilitator superfamily.
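As a quick consistency check, the per-class module counts quoted in this paragraph can be summed and compared with the reported CAZyme totals for each tundra genome. The numbers below are copied from the text; the variable names and the derived GH share are illustrative additions.

```python
# Module counts per CAZy class, as quoted in the text above.
cazyme_modules = {
    "G. mallensis": {"GH": 166, "GT": 77, "PL": 9, "CE": 16, "CBM": 53},
    "G. tundricola": {"GH": 103, "GT": 74, "PL": 4, "CE": 15, "CBM": 19},
    "T. saanensis": {"GH": 110, "GT": 90, "PL": 4, "CE": 16, "CBM": 24},
}

# Totals reported in the text for each genome.
reported_totals = {"G. mallensis": 321, "G. tundricola": 215, "T. saanensis": 244}

for strain, modules in cazyme_modules.items():
    total = sum(modules.values())
    gh_share = 100.0 * modules["GH"] / total  # share of glycoside hydrolases
    matches = total == reported_totals[strain]
    print(f"{strain}: total={total} (matches reported: {matches}), GH share={gh_share:.1f}%")
```

The class counts add up to the reported totals for all three genomes, with glycoside hydrolases forming the largest class in each.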

Distribution of gene content (% of total CDSs) encoding for four major CAZy families: glycoside hydrolases (GHs), GTs, polysaccharide lyases (PLs), carbohydrate esterases (CEs), and noncatalytic carbohydrate-binding modules (CBMs) in the genomes of six strains of Acidobacteria.

Biodegradation of structural and storage polysaccharides

Cellulose and hemicelluloses are the most abundant plant structural carbon polymers found in the biosphere, and therefore, their degradation by microorganisms represents a significant part of the carbon cycle. The efficient degradation of polysaccharides requires the concerted action of many catalytic enzymes and/or noncatalytic CBMs, which facilitate the targeting of enzymes to the insoluble polysaccharides (Warren, 1996). The tundra soil Acidobacteria are versatile heterotrophs isolated using selective plant-based carbon sources (Männistö et al., 2011, 2012). In order to explore the metabolic potential of the three tundra soil strains to hydrolyze biomass polysaccharides, we analyzed their genomes for CDSs predicted to code for main-chain and side-chain cleaving enzymes of the CAZy family. The genomic data were validated by biochemical assays to bridge genome predictions to biochemical activities encoded in their genomes. We identified predicted CDSs of CAZyme families involved in breakdown of plant structural polysaccharides such as hemicelluloses, celluloses, pectin, and storage polysaccharides such as starch/glycogen (Table S3). The tundra Acidobacteria strains grew on a number of plant- and microbe-based polysaccharides as single carbon sources (Table 2).

Comparison of carbon substrate utilization/hydrolysis by six strains of Acidobacteria

G. mallensis MP5ACTX8 G. tundricola MP5ACTX9 T. saanensis SP1PR4 A. capsulatum ATCC 51196 ‘_K. versatilis_’ Ellin345 ‘_S. usitatus_’ Ellin6076
Utilization of Mono- and disaccharides
d-arabinose nd + +
Cellobiose + + + + + +
d-Fructose + + + nd + +
d-galactose + + + + + +
d-glucose + + + + + +
Lactose + + + + + +
Lactulose + + nd nd nd nd
d-lyxose nd nd nd
d-maltose + + + + nd nd
d-mannose + + + + + +
d-ribose + + nd + nd
Sucrose + + + nd + +
d-trehalose + + + + nd nd
d-xylose + + + + nd nd
d-melezitose + + + nd nd nd
d-raffinose + + + nd nd nd
_N_-acetyl-d-glucosamine + + + nd nd nd
Polysaccharides
Laminarin + + + nd nd nd
Pectin + + + nd nd +
Lichenan + nd nd nd
Starch + + + + nd +
Xylan + + +
Pullulan + nd nd nd
Alginate nd nd nd
Cellulose +
Chitin nd nd nd
Chitosan nd nd nd
Enzymatic and other assays
Cellulose hydrolysis (plate assay) + + nd nd nd
Chitinase nd nd nd
Chitobiase + + + nd nd nd

nd, no data available.

Data from Männistö et al. (2012) or this study.

Data from Männistö et al. (2011) or this study.

Data from Kishimoto et al. (1991).

Data from Ward et al. (2009).

Hemicelluloses are highly complex heteropolysaccharides, and hydrolysis of xylan-, mannan-, and arabinofuranosyl-containing hemicelluloses requires a battery of enzymes belonging to GH and CE families (Shallom & Shoham, 2003). We identified predicted CDSs encoding hemicellulolytic enzymes in the genomes of the three tundra soil strains (Table S2). These included endoxylanases represented by family GH10 and exoxylanases represented by family GH39 that successively hydrolyze xylan into short xylooligomers and xylose. In addition, CDSs for acetyl xylan esterases of carbohydrate esterase families CE1 and CE4 that hydrolyze the acetyl substituents of xylose moieties were identified in the genomes of all three tundra soil strains (Table S3). Predicted CDSs for families GH3, GH43, and GH51, which represent α-l-arabinofuranosidases required to cleave l-arabinofuranose side chains found in softwood xylans, and for GH1, GH2, and GH5, required for hydrolysis of β-mannan-based polymers, were also present in the genomes of the three tundra soil strains. In addition, a large number of CDSs of families GH27, GH36, and GH57, which represent α-galactosidases, and of GH1, GH3, GH30, and GH116, which represent β-glucosidases, were identified in the tundra soil strains (Table S3). In cultivation assays, degradation of xylan (from birch wood) was not detected in any of the three tundra strains as assayed by turbidity, CO2 production, or Congo red staining (Männistö et al., 2011, 2012; Table 2). Further studies are underway to assay the xylanase activities against different substrates and under different conditions.

Degradation of cellulose requires three enzyme activities: endoglucanase, exoglucanase (or cellobiohydrolase), and β-glucosidase. Recently, cellulases were described within 13 GH families, of which GH5 and GH9 appear to have the largest number of biochemically characterized bacterial cellulases with both endo- and exocellulase activity, while no exocellulase activity is identified for GH8 in the CAZy database (Sukharnikov et al., 2011). We identified CDSs belonging to five different glycoside hydrolase families that represent cellulases: GH5, GH8, GH9, GH12, and GH51 (Table S3). CDSs for GH5 were identified in the genomes of G. mallensis and G. tundricola, but not in T. saanensis. A CDS for GH9 was identified only in G. mallensis. GH9 cellulases were also present in the genomes of A. capsulatum, ‘_K. versatilis_’, and ‘_S. usitatus_’ (Ward et al., 2009). However, no CDSs for exoglucanases or cellobiohydrolases were identified in the genomes of any of the three tundra soil strains.

Carbohydrate-binding modules that are likely involved in cellulose degradation were identified in the three tundra strains (Table S2), which included CBMs binding to GH16 in both Granulicella strains and those binding to GH27 and GH36 in G. mallensis and T. saanensis genomes, but not in G. tundricola. CBMs binding to GH64 and GH55 representing β-1,3-glucanase were identified in all three tundra soil strains, while CBM6 binding to GH55 and GH16 was identified in genomes of G. mallensis and G. tundricola, respectively. CBM6 is reported to have both xylan and cellulose-binding function (Boraston et al., 2004).

Although the three tundra soil Acidobacteria contained cellulases from several glycoside hydrolase families, none of them effectively utilized cellulose when assayed by turbidity or CO2 production (Männistö et al., 2011, 2012; Table 2). No increase in turbidity or CO2 production was detected after 3 weeks of incubation in liquid culture with carboxymethyl cellulose (CMC) and a small amount (100 mg L−1) of yeast extract. Subdivision 1 Acidobacteria have been linked to cellulose degradation in sphagnum peat, but the rates of cellulose degradation are extremely low (Pankratov et al., 2011). To determine whether the presence of an easily degradable substrate would trigger CMC hydrolysis, the strains were inoculated on two types of plates: CMC with peptone and yeast extract, and CMC with cellobiose, peptone, and yeast extract. After 3 weeks of incubation, G. mallensis strain MP5ACTX8 scored positive for CMC hydrolysis on plates with both amendments, while G. tundricola strain MP5ACTX9 produced clearing zones, indicative of CMC hydrolysis, only on plates amended with cellobiose. Terriglobus saanensis strain SP1PR4 did not produce a clearing zone on either of the plates. Further studies are needed to determine the factors that trigger cellulose utilization by these strains and by Acidobacteria in general.

Degradation of pectin, starch, and chitin

Pectin cross-links cellulose and hemicellulose fibers and is primarily degraded by a battery of enzymes that include endo- and exopolygalacturonases represented by glycoside hydrolases of family GH28, pectate lyases and oligo-d-galactosiduronate lyases of the polysaccharide lyase (PL) family, and pectin esterases of the carbohydrate esterase (CE) families CE8 and CE12 (Jayani et al., 2005; Abbott & Boraston, 2008). The genomes of all three tundra soil strains contained a large number of CDSs of the GH28 family predicted to code for polygalacturonases. We identified CDSs encoding pectate lyases of families PL1 and PL10 in G. mallensis, PL10 in G. tundricola, and PL9 in T. saanensis. CDSs for pectin methyl esterases of family CE12 were identified in the genomes of all three tundra soil strains, and pectin acetyl esterases of family CE8 were identified in T. saanensis and G. tundricola. Homologues for CE8 and CE12 were not found in ‘_K. versatilis_’, but were present in A. capsulatum. CDSs encoding polysaccharide lyases that represent alginate lyases of family PL5 were identified in all three tundra soil strains, and of family PL7 in G. mallensis. However, homologues for alginate lyases were not identified in the genomes of A. capsulatum, ‘_K. versatilis_’, or ‘_S. usitatus_’.

Starch is hydrolyzed by the combined action of α-amylases, β-amylases, other exo-α-1,4-glucanases, and glucoamylase. In the genomes of G. mallensis, G. tundricola, and T. saanensis, we identified a large number of CDSs for glycoside hydrolases of families GH13 (n = 12, 16, 12, respectively), GH15 (n = 2, 2, 2), GH31 (n = 5, 3, 3), and GH57 (n = 2, 2, 2). Homologues of these glycoside hydrolases were also identified in A. capsulatum, ‘_K. versatilis_’, and ‘_S. usitatus_’. CDSs for the carbohydrate-binding module CBM48-GH13 were identified in all three tundra soil strains, while CBM41-CBM41-CBM48-GH13, involved in glycogen binding, was identified in the G. tundricola genome. All three strains utilized laminarin, starch, and pectin, as detected by both CO2 production and increase in turbidity (Table 2). In addition, G. tundricola strain MP5ACTX9 grew well on pullulan, and G. mallensis MP5ACTX8 grew well on lichenin, the major polysaccharide of lichens.

Chitin, a polymer of β(1-4)-linked _N_-acetyl-d-glucosamine (GlcNAc) found in arthropod exoskeletons, is the second most prominent biopolymer after cellulose. Chitinases (GH18 and GH19 families) hydrolyze the β(1-4) linkages; GH18 also includes endoglycosidases that hydrolyze the chitobiose core of N-linked glycoproteins. CDSs for GH18 chitinases were identified in the genomes of G. mallensis and T. saanensis, but not in G. tundricola. A GH19 chitinase was identified only in T. saanensis. We did not find any chitosanases (GH46) that hydrolyze chitosan. CDSs for predicted xylanase/chitin deacetylase and GH18-CE4-GT2 were identified in all three tundra soil strains. A large number of CDSs for the carbohydrate-binding modules CBM5, CBM12, CBM13, and CBM35 with chitin-binding function were identified in the G. mallensis genome. CBMs with cellulose-binding domains may also bind to chitin, given the similar structures of cellulose and chitin (Warren, 1996). Nonetheless, no chitinase activity was detected in any of the strains after a 10-day incubation with chitin azure (chitin labeled with Remazol Brilliant Violet). However, all strains were positive for chitobiase activity (Table 2) when assayed using 4-methylumbelliferyl-_N_-acetyl-β-d-glucosaminide as the substrate (O'Brien & Colwell, 1987).

Biosynthesis of extra-polymeric substances

A large number of gene modules representing GTs of families GT1 (n = 8, 1, 20), GT2 (n = 27, 32, 28), and GT4 (n = 22, 20, 22) were identified in genomes of all three tundra soil strains, G. mallensis, G. tundricola, and T. saanensis, respectively (Table S3). Predicted CDSs for biosynthesis of nucleotide sugars such as dTDP-l-rhamnose, GDP-mannose, cytidine 5′-monophospho-3-deoxy-d-_manno_-2-octulosonic acid (CMP-KDO), GDP-glucose, and other complex di-, oligo-, and polysaccharides were identified (Table S4). These include CDSs encoding for cellulose synthase (UDP-forming) (glycos_transf_2), α,α-trehalose phosphate synthase [UDP-forming] (glyco_transf_20), ADP-glucose: starch glucosyl transferase (glyco_transf_5), UDP-glucose: ceramide β-glucosyltransferase (glycos_transf_21) involved in biosynthesis of cellulose, trehalose, starch, hopanoid, and capsular/free exopolysaccharide (EPS).

Exopolysaccharide (EPS) biosynthesis

A gene cluster containing CDSs for capsular polysaccharide synthesis protein (COG3206), polysaccharide export protein (COG1596), and GTs of family GT1 was identified in the genomes of all three tundra soil strains. A gene cluster containing CDSs for EpsH (Exosortase_EpsH) (COG1368) (AciPR4_0588), EpsI (AciPR4_0589), and a tetratricopeptide (TPR_1) repeat-containing protein (AciPR4_0590) was identified in the T. saanensis genome. Three copies of the exopolysaccharide H (EpsH) gene were found in the genome of G. mallensis (AciX8_0208, AciX8_2453, and AciX8_4634), but none in G. tundricola. EPS includes capsular polysaccharide (CPS), as well as free extracellular polysaccharide (slime). EPS is thought to protect cells from desiccation and other environmental stresses and to serve as a cryoprotectant of enzymes from cold-adapted microorganisms, assisting in colonization of various ecological niches (Roberts, 1996; Nicolaus et al., 2010).

Cellulose biosynthesis

Cellulose biogenesis is reported for several bacteria, of which the most extensively studied is Gluconacetobacter xylinus (formerly Acetobacter xylinum) (Ross et al., 1991; Römling, 2002). Eight different proteins participate in the cellulose biosynthetic pathway and its regulation; these are UDP-glucose pyrophosphorylase, the cellulose synthase, diguanylate cyclase, phosphodiesterase PDE-A and PDE-B, and the recently discovered bacterial cellulose synthesis (bcs) operon that encodes four proteins, BcsA, BcsB, BcsZ, and BcsC. We identified predicted CDSs for UDP-forming GT of family GT2 (Table S3), which encodes the cellulose synthase of the cellulose biosynthesis pathway (MetaCyc: PWY-1001). In the genomes of all three tundra Acidobacteria, clusters of genes were identified in close neighborhood of the cellulose synthase gene (bcsAB: AciX8_2186, AciX9_2052, AciPR4_1394, AciPR4_3357), which included cellulase (endoglucanase Y) of family GH8 (bcsZ: AciX8_2185, AciX9_2051, and AciPR4_1395), cellulose synthase operon protein (bcsC: AciX8_2184, AciX9_2050, and AciPR4_1397), and a cellulose synthase operon protein (yhjQ: AciX8_2187, AciX9_2053, and AciPR4_1393) (Fig. S3). Among other Acidobacteria, a cluster of genes comprising yhjQ (ACP_0074), bcsB (ACP_0075), and a putative endoglucanase Y (ACP_0076) was present in A. capsulatum. However, no homologue for bcsC was found in the A. capsulatum genome, where an intergenic region spanned the corresponding position in the bcs operon. A PSI-BLAST search identified a tetratricopeptide repeat (TPR) motif, which is involved in the assembly of multiprotein complexes (D'Andrea & Regan, 2003), in all three bcsC genes. Homologues for yhjQ, bcsB, bcsC, or bcsZ were not identified in the genomes of ‘_K. versatilis_’ or ‘_S. usitatus_’.

The gene for endo-1,4-β-glucanase involved in cellulose biosynthesis (GH8) is part of the bacterial cellulose synthesis operon in the genomes of the three tundra Acidobacteria and is also observed in the cel operons (celABC and celDE) of the plant-colonizing bacteria Agrobacterium tumefaciens and Rhizobium leguminosarum, where cellulose is suggested to be an exopolysaccharide used for colonization of plant host cells (Matthysse et al., 1995; Ausmees et al., 1999).
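The gene neighborhood described in this section can be inspected directly from the locus tags listed above. The sketch below groups the tags by genome and splits them into runs of nearby locus numbers. It rests on an assumption that is not stated in the text, namely that consecutive locus-tag numbers reflect gene order on the chromosome; the `max_gap` threshold and the function names are likewise illustrative choices.

```python
from collections import defaultdict
from typing import Dict, List

# Locus tags as listed in the text for the bcs gene neighborhood.
bcs_loci = {
    "bcsAB": ["AciX8_2186", "AciX9_2052", "AciPR4_1394", "AciPR4_3357"],
    "bcsZ": ["AciX8_2185", "AciX9_2051", "AciPR4_1395"],
    "bcsC": ["AciX8_2184", "AciX9_2050", "AciPR4_1397"],
    "yhjQ": ["AciX8_2187", "AciX9_2053", "AciPR4_1393"],
}


def group_by_genome(loci: Dict[str, List[str]]) -> Dict[str, Dict[int, str]]:
    """Map genome prefix -> {locus number: gene name}."""
    positions: Dict[str, Dict[int, str]] = defaultdict(dict)
    for gene, tags in loci.items():
        for tag in tags:
            prefix, number = tag.rsplit("_", 1)
            positions[prefix][int(number)] = gene
    return dict(positions)


def runs(numbers: List[int], max_gap: int = 2) -> List[List[int]]:
    """Split locus numbers into runs whose neighbours differ by <= max_gap.

    Assumption: locus numbers follow gene order, so a small gap means
    the genes sit close together on the chromosome.
    """
    result: List[List[int]] = []
    for n in sorted(numbers):
        if result and n - result[-1][-1] <= max_gap:
            result[-1].append(n)
        else:
            result.append([n])
    return result


by_genome = group_by_genome(bcs_loci)
for prefix, genes in by_genome.items():
    for run in runs(list(genes)):
        print(prefix, [(n, genes[n]) for n in run])
```

Under this assumption, the four genes form a single contiguous run in the AciX8 and AciX9 genomes, while the AciPR4 genome shows one run of four genes plus a second, distant bcsAB copy (AciPR4_3357).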

Starch and trehalose biosynthesis

Glycogen and trehalose are both major bacterial storage carbohydrates used under conditions of limiting growth when an excess of carbon source is available and other nutrients are deficient (Wilson et al., 2010). We identified CDSs for ADP-glucose type glycosyl transferase (Glycos_transf_1, Glyco_transf_5), which encodes glycogen/starch synthase (GlgA), in the genomes of G. mallensis (AciX8_3243) and T. saanensis (AciPR4_2465), but not in G. tundricola. CDSs for glucan phosphorylase (GlgC) and phosphoglucomutase (PGM), which converts glucose-6-phosphate to glucose-1-phosphate, were present in the genomes of the three tundra soil strains. In the common GlgC-GlgA pathway (reviewed by Preiss, 2009), after chain elongation to generate the linear glucan, glycogen is formed by glycogen branching enzyme (GlgB). However, CDSs for glycogen branching enzyme were not identified in any of the three tundra soil strains, nor in the two other strains of subdivision 1 Acidobacteria.

Trehalose is reported as a stress protectant, helping bacteria to survive desiccation, cold, and osmotic stress (Freeman et al., 2010). A gene cluster involved in trehalose biosynthesis from maltose (TreS, MetaCyc: PWY-2622), which included CDSs encoding trehalose synthase (AciX8_1169, AciX9_2884, AciPR4_3489) and alpha amylase catalytic region (GH13) (AciX8_1170, AciX9_2883, AciPR4_3488), was identified in the genomes of the three tundra soil strains. A second pathway for trehalose biosynthesis from maltodextrins (TreYZ, MetaCyc: PWY-2661) was also identified, which included CDSs encoding malto-oligosyl trehalose trehalohydrolase, malto-oligosyl trehalose synthase, and glucoamylase/isoamylase of the GH15 family (AciX8_1171, AciX9_2882). A predicted CDS for α,α-trehalose phosphate synthase [UDP-forming] of GT family 20 was identified only in the genome of G. mallensis. In E. coli, trehalose-6-phosphate synthase (OtsA), trehalose-6-phosphate phosphatase (OtsB), and cold-shock proteins (Csps) are induced in response to cold shock. We identified CDSs predicted to code for the cold-shock DNA-binding domain protein, CspA, in the genomes of G. tundricola (n = 9), T. saanensis (n = 4), and G. mallensis (n = 3). CspA, the major cold-shock protein, belongs to a family of nine homologous proteins, CspA to CspI, in E. coli. In psychrophilic bacteria, a basal set of Csps exists, and additional Csps appear with more severe cold shocks (Hebraud & Potier, 1999). In addition to the Csps (now called cold-induced proteins, CIPs) observed in mesophiles, cold acclimation proteins (Caps) are identified in psychrophilic microorganisms, being constitutively rather than transiently expressed at low temperatures (D'Amico et al., 2006).

Conclusions

Molecular analyses suggest a tremendous diversity of Acidobacteria in tundra and other soil environments, but although they are ubiquitous, the ecological role of the Acidobacteria remains elusive. Our concerted efforts led to the cultivation of several new slow-growing and fastidious cold-adapted Acidobacteria from tundra soils of northern Finland (Männistö et al., 2011, 2012). The integrated study of the taxonomic, genetic, and functional diversity of Acidobacteria is providing an ecosystem-level understanding of the metabolic networks of Acidobacteria and other species consortia involved in biogeochemical activities in tundra soil environments. We hypothesize that the harsh and changing environmental conditions have selected for a stable bacterial community dominated by Acidobacteria that is only minimally affected by temperature fluctuation and freeze–thaw cycles. Comparative genomic and physiological analysis of these Terriglobus and Granulicella species is providing insights into their roles in organic carbon utilization in Arctic tundra soils and revealing mechanisms promoting their activity and dominance. The genomes of the three tundra soil Acidobacteria contained an abundance of conserved genes/gene clusters encoding gene modules of the carbohydrate-active enzyme (CAZyme) family. We infer that gene content and biochemical mechanisms encoded in the genomes of these Acidobacteria strains are shaped to allow for breakdown, utilization, and biosynthesis of diverse structural and storage polysaccharides and resilience to fluctuating temperatures and nutrient-deficient conditions in Arctic tundra soils. We conclude that Acidobacteria communities are central to carbon cycling in Arctic and boreal systems and play a significant role in degradation of accumulated biomass as polar temperatures increase.

Acknowledgements

This work was supported in part by the National Science Foundation (IPY 0732956), the Academy of Finland (Grant 123725), and the New Jersey Agricultural Experiment Station. We thank Tanya Woyke (Joint Genome Institute) and her project team for sequencing and assembly of the genomes. We are greatly thankful to Lynn Goodwin (Joint Genome Institute) for technical assistance. We thank Bernard Henrissat for updating CAZyme-related information for the three tundra soil strain genomes. The work conducted by the US Department of Energy Joint Genome Institute is supported by the US Department of Energy.

References

(2008) Structural biology of pectin degradation by Enterobacteriaceae. Microbiol Mol Biol Rev 72: 301–316.

(1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.

(1999) Structural and putative regulatory genes involved in cellulose synthesis in Rhizobium leguminosarum bv. trifolii. Microbiology 145: 3–1262.

(1999) Wide distribution and diversity of members of the bacterial kingdom Acidobacterium in the environment. Appl Environ Microbiol 65: 1731–1737.

(2007) Acidobacteria phylum sequences in uranium-contaminated subsurface sediments greatly expand the known diversity within the phylum. Appl Environ Microbiol 73: 3113–3116.

(2004) Solexa Ltd. Pharmacogenomics 5: 433–438.

(2004) Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem J 382: 769–781.

(2010) The effect of nutrient deposition on bacterial communities in Arctic tundra soil. Environ Microbiol 12: 1842–1854.

(2009) The Carbohydrate-Active EnZymes database (CAZy): an expert resource for glycogenomics. Nucleic Acids Res 37: D233–D238.

(2011) Biological consequences of ancient gene acquisition and duplication in the large genome of Candidatus Solibacter usitatus Ellin6076. PLoS ONE 6: e24882.

(2010) Soil bacterial diversity in the Arctic is not fundamentally different from that found in other biomes. Environ Microbiol 12: 2998–3006.

(2006) Psychrophilic microorganisms: challenges for life. EMBO Rep 7: 385–389.

(2003) TPR proteins: the versatile helix. Trends Biochem Sci 28: 655–662.

(2005) Effects of growth medium, inoculum size, and incubation time on culturability and isolation of soil bacteria. Appl Environ Microbiol 71: 826–834.

(2006) Phylogenetic analysis and in situ identification of bacteria community composition in an acidic Sphagnum peat bog. Appl Environ Microbiol 72: 2110–2117.

(1990) Isolation of plant DNA from fresh tissue. Focus 12: 13–15.

(2011) Influence of plant polymers on the distribution and cultivation of bacteria in the phylum Acidobacteria. Appl Environ Microbiol 77: 586–596.

(2009) Links between plant community composition, soil organic matter quality and microbial communities in contrasting tundra habitats. Oecologia 161: 113–123.

(1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 8: 175–185.

et al. (2010) Influence of soil characteristics on the diversity of bacteria in the Southern Brazilian Atlantic Forest. Appl Environ Microbiol 76: 4744–4749.

(2007) Toward an ecological classification of soil bacteria. Ecology 88: 1354–1364.

(2010) Identification of the trehalose biosynthetic loci of Pseudomonas syringae and their contribution to fitness in the phyllosphere. Environ Microbiol 12: 1486–1497.

(2011) The impact of different soil parameters on the community structure of dominant bacteria from nine different soils located on Livingston Island, South Shetland Archipelago, Antarctica. FEMS Microbiol Ecol 76: 476–491.

(1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.

et al. (1998) Sensitivity of boreal forest carbon balance to soil thaw. Science 279: 214–217.

(2006) Finishing repeat regions automatically with Dupfinisher. Proceedings of the 2006 international conference on bioinformatics and computational biology (Arabnia HR, Valafar H, eds), pp. 141–146. CSREA Press, Las Vegas, NV.

(1999) Cold shock response and low temperature adaptation in psychrotrophic bacteria. J Mol Microbiol Biotechnol 1: 211–219.

(2006) Identifying the dominant soil bacterial taxa in libraries of 16S rRNA and 16S rRNA genes. Appl Environ Microbiol 72: 1719–1728.

(2005) Microbial pectinolytic enzymes: a review. Process Biochem 40: 2931–2944.

(2009) A comprehensive survey of soil acidobacterial diversity using pyrosequencing and clone library analyses. ISME J 3: 442–453.

(2003) Laboratory cultivation of widespread and previously uncultured soil bacteria. Appl Environ Microbiol 69: 7210–7215.

(2008) A rapid and easy method for the detection of microbial cellulases on agar plates using Gram's iodine. Curr Microbiol 57: 503–507.

(2009) Phylogenetic diversity of Acidobacteria in a former agricultural soil. ISME J 3: 378–382.

(1991) Acidobacterium capsulatum gen. nov., sp. nov. an acidophilic chemoorganotrophic bacterium containing menaquinone from acidic mineral environment. Curr Microbiol 22: 1–7.

(2008) Diversity and in situ quantification of Acidobacteria subdivision 1 in an acidic mining lake. FEMS Microbiol Ecol 63: 107–117.

(2009) Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale. Appl Environ Microbiol 75: 5111–5120.

(2009) Distribution patterns of the members of phylum Acidobacteria in global soil samples. J Microbiol Biotechnol 19: 1281–1287.

(2008) Members of the phylum Acidobacteria are dominant and metabolically active in rhizosphere soil. FEMS Microbiol Lett 285: 263–269.

et al. (2009) HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res 37: D471–D478.

(2004) Global change: carbon conundrum on the tundra. Nature 431: 406–408.

(2007) Bacterial communities in Arctic fjelds of Finnish Lapland are stable but highly pH-dependent. FEMS Microbiol Ecol 59: 452–465.

(2009) Effect of freeze-thaw cycles on bacterial communities of arctic tundra soil. Microb Ecol 58: 621–631.

(2011) Terriglobus saanensis sp. nov., an Acidobacterium isolated from tundra soil. Int J Syst Evol Microbiol 61: 1823–1828.

(2012) Granulicella arctica sp. nov., Granulicella mallensis sp. nov., Granulicella sapmiensis sp. nov. and Granulicella tundricola sp. nov., novel Acidobacteria from tundra soil of Northern Finland. Int J Syst Evol Microbiol. doi: .

et al. (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376–380.

(1995) Genes required for cellulose synthesis in Agrobacterium tumefaciens. J Bacteriol 177: 1069–1075.

(2005) Unexpectedly high bacterial diversity in arctic tundra relative to boreal forest soils, revealed by serial analysis of ribosomal sequence tags. Appl Environ Microbiol 71: 5710–5718.

(2010) Exopolysaccharides from extremophiles: from fundamentals to biotechnology. Environ Technol 31: 1145–1158.

(1987) A rapid test for chitinase activity that uses 4-methylumbelliferyl-N-acetyl-beta-D-glucosaminide. Appl Environ Microbiol 53: 1718–1720.

(2008) Substrate-induced growth and isolation of Acidobacteria from acidic Sphagnum peat. ISME J 2: 551–560.

(2011) Bacterial populations and environmental factors controlling cellulose degradation in an acidic Sphagnum peat. Environ Microbiol 13: 1800–1814.

(2010) GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes. Nat Methods 7: 455–457.

(2009) Glycogen Biosynthesis. Elsevier, Oxford.

(1996) The biochemistry and genetics of capsular polysaccharide production in bacteria. Annu Rev Microbiol 50: 285–315.

(2002) Molecular biology of cellulose production in bacteria. Res Microbiol 153: 205–212.

(1991) Cellulose biosynthesis and function in bacteria. Microbiol Rev 55: 35–58.

(2002) Enzyme function less conserved than anticipated. J Mol Biol 318: 595–608.

(2002) Cultivation of globally distributed soil bacteria from phylogenetic lineages previously only detected in cultivation-independent surveys. Environ Microbiol 4: 654–666.

(1991) Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins 9: 56–68.

(2003) Microbial hemicellulases. Curr Opin Microbiol 6: 219–228.

(2011) Cellulases: ambiguous nonhomologous enzymes in a genomic perspective. Trends Biotechnol 29: 473–479.

(2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 28: 2731–2739.

(1997) A genomic perspective on protein families. Science 278: 631–637.

(1982) Use of Congo red-polysaccharide interactions in enumeration and characterization of cellulolytic bacteria from the bovine rumen. Appl Environ Microbiol 43: 777–780.

et al. (2009) Three genomes from the phylum Acidobacteria provide insight into the lifestyles of these microorganisms in soils. Appl Environ Microbiol 75: 2046–2056.

(1996) Microbial hydrolysis of polysaccharides. Annu Rev Microbiol 50: 183–212.

et al. (2010) Regulation of glycogen metabolism in yeast and bacteria. FEMS Microbiol Rev 34: 952–985.

(2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18: 821–829.

Supporting Information

Additional Supporting Information may be found in the online version of the article:

Fig. S1. Circular representation of five plasmids in G. tundricola MP5ACTX9, (a) pACIX901, (b) pACIX902, (c) pACIX903, (d) pACIX904, (e) pACIX905.

Fig. S2. Comparison of protein coding sequences (CDSs) encoded in the genomes of six strains of Acidobacteria. At each cutoff (range −5 to 45), we computed the similarity of all six genomes. ‘Random’ baseline curves are represented by Azotobacter vinelandii, Escherichia coli, and Vibrio fischeri (see Materials and methods).

Fig. S3. Gene neighborhood display for the bacterial cellulose biosynthesis (bcs) operon in the genomes of G. mallensis, G. tundricola, T. saanensis, and A. capsulatum. The gene clusters show genes and locus IDs encoding cellulose synthase (bcsAB), endoglucanase (bcsZ), cellulose synthase operon C domain protein (bcsC), and YhjQ protein (yhjQ). Genes are drawn to scale.

Table S1. Mobile genetic elements in the genomes of G. mallensis, G. tundricola, and T. saanensis.

Table S2. Gene pool shared only by genomes of G. mallensis, G. tundricola, and T. saanensis.

Table S3. Comparison of the number of predicted gene modules for the major carbohydrate-active enzyme (CAZyme) families, namely glycoside hydrolases (GHs), glycosyl transferases (GTs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), and carbohydrate-binding modules (CBMs), in the genomes of six strains of Acidobacteria.

Table S4. Predicted genes involved in sugar biosynthesis in the genomes of the three tundra soil strains G. mallensis, G. tundricola, and T. saanensis compared with A. capsulatum, ‘K. versatilis’, and ‘S. usitatus’.

Author notes

Editor: Dirk Wagner

© 2012 Federation of European Microbiological Societies