Scott Clingenpeel - Academia.edu
Papers by Scott Clingenpeel
The ISME Journal
Ammonia-oxidizing archaea (AOA) of the phylum Thaumarchaeota are widespread in marine and terrestrial habitats, playing a major role in the global nitrogen cycle. However, their evolutionary history remains unexplored, which limits our understanding of their adaptation mechanisms. Here, our comprehensive phylogenomic tree of Thaumarchaeota supports three sequential events: origin of AOA from terrestrial non-AOA ancestors, colonization of the shallow ocean, and expansion to the deep ocean. Careful molecular dating suggests that these events coincided with the Great Oxygenation Event around 2300 million years ago (Mya), and oxygenation of the shallow and deep ocean around 800 and 635-560 Mya, respectively. The first transition was likely enabled by the gain of an aerobic pathway for energy production by ammonia oxidation and biosynthetic pathways for cobalamin and biotin that act as cofactors in aerobic metabolism. The first transition was also accompanied by the loss of dissimilatory nitrate and sulfate reduction, loss of oxygen-sensitive pyruvate oxidoreductase, which reduces pyruvate to acetyl-CoA, and loss of the Wood-Ljungdahl pathway for anaerobic carbon fixation. The second transition involved gain of a K+ transporter and of the biosynthetic pathway for ectoine, which may function as an osmoprotectant. The third transition was accompanied by the loss of the uvr system for repairing ultraviolet light-induced DNA lesions. We conclude that oxygen availability drove the terrestrial origin of AOA and their expansion to the photic and dark oceans, and that the stressors encountered during these events were partially overcome by gene acquisitions from Euryarchaeota and Bacteria, among other sources.
Journal of Chemical Health and Safety
Nature genetics, 2018
Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe-microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. This work expands the genome-based...
Single cell genomics is a powerful and increasingly popular tool for studying the genetic make-up of uncultured microbes. A key challenge for successful single cell sequencing and analysis is the removal of exogenous DNA from whole genome amplification reagents. We found that UV irradiation of the multiple displacement amplification (MDA) reagents, including the Phi29 polymerase and random hexamer primers, effectively eliminates the amplification of contaminating DNA. The methodology is quick, simple, and highly effective, thus significantly improving whole genome amplification from single cells.
Frontiers in microbiology, 2016
Yellowstone Lake, the largest subalpine lake in the United States, harbors great novelty and diversity of Bacteria and Archaea. Size-fractionated water samples (0.1-0.8, 0.8-3.0, and 3.0-20 μm) were collected from surface photic zone, deep mixing zone, and vent fluids at different locations in the lake by using a remotely operated vehicle (ROV). Quantification with real-time PCR indicated that Bacteria dominated free-living microorganisms with Bacteria/Archaea ratios ranging from 4037:1 (surface water) to 25:1 (vent water). Microbial population structures (both Bacteria and Archaea) were assessed using 454-FLX sequencing with a total of 662,302 pyrosequencing reads for V1 and V2 regions of 16S rRNA genes. Non-metric multidimensional scaling (NMDS) analyses indicated that strong spatial distribution patterns existed from surface to deep vents for free-living Archaea and Bacteria in the lake. Along with pH, major vent-associated geochemical constituents including CH4, CO2, H2, DIC (di...
New Phytologist, 2015
Desert plants are hypothesized to survive the environmental stress inherent to these regions in part thanks to symbioses with microorganisms, and yet these microbial species, the communities they form, and the forces that influence them are poorly understood. Here we report the first comprehensive investigation of the microbial communities associated with species of Agave, which are native to semiarid and arid regions of Central and North America and are emerging as biofuel feedstocks. We examined prokaryotic and fungal communities in the rhizosphere, phyllosphere, leaf and root endosphere, as well as proximal and distal soil samples from cultivated and native agaves, through Illumina amplicon sequencing. Phylogenetic profiling revealed that the composition of prokaryotic communities was primarily determined by the plant compartment, whereas the composition of fungal communities was mainly influenced by the biogeography of the host species. Cultivated A. tequilana exhibited lower levels of prokaryotic diversity compared with native agaves, although no differences in microbial diversity were found in the endosphere. Agaves shared core prokaryotic and fungal taxa known to promote plant growth and confer tolerance to abiotic stress, which suggests common principles underpinning Agave-microbe interactions.
The ISME journal, Jan 9, 2015
Single amplified genomes and genomes assembled from metagenomes have enabled the exploration of uncultured microorganisms at an unprecedented scale. However, both these types of products are plagued by contamination. Since these genomes are now being generated in a high-throughput manner and sequences from them are propagating into public databases to drive novel scientific discoveries, rigorous quality controls and decontamination protocols are urgently needed. Here, we present ProDeGe (Protocol for fully automated Decontamination of Genomes), the first computational protocol for fully automated decontamination of draft genomes. ProDeGe classifies sequences into two classes, clean and contaminant, using a combination of homology and feature-based methodologies. On average, 84% of sequence from the non-target organism is removed from the data set (specificity) and 84% of the sequence from the target organism is retained (sensitivity). The procedure operates successfully at a rate of ~...
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013
Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing "microbial dark matter" that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license.
Journal of Computational Biology, 2013
Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing "microbial dark matter" that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license.
Frontiers in Microbiology, 2015
Yellowstone Lake (Yellowstone National Park, WY, USA) is a large high-altitude (2200 m), fresh-water lake, which straddles an extensive caldera and is the center of significant geothermal activity. The primary goal of this interdisciplinary study was to evaluate the microbial populations inhabiting thermal vent communities in Yellowstone Lake using 16S rRNA gene and random metagenome sequencing, and to determine how geochemical attributes of vent waters influence the distribution of specific microorganisms and their metabolic potential. Thermal vent waters and associated microbial biomass were sampled during two field seasons (2007–2008) using a remotely operated vehicle (ROV). Sublacustrine thermal vent waters (circa 50–90°C) contained elevated concentrations of numerous constituents associated with geothermal activity including dissolved hydrogen, sulfide, methane and carbon dioxide. Microorganisms associated with sulfur-rich filamentous “streamer” communities of Inflated Plain an...
Single cell genomics, the amplification and sequencing of genomes from single cells, can provide a glimpse into the genetic make-up and thus life style of the vast majority of uncultured microbial cells, making it an immensely powerful and increasingly popular tool. This is accomplished by use of multiple displacement amplification (MDA), which can generate billions of copies of a single
Yellowstone Lake (Yellowstone National Park, WY, USA) is a large, high-altitude, fresh-water lake that straddles the most recent Yellowstone caldera, and is situated on top of significant hydrothermal activity. An interdisciplinary study is underway to evaluate the geochemical and geomicrobiological characteristics of several hydrothermal vent environments sampled using a remotely operated vehicle, and to determine the degree to which these
Frontiers in Microbiology, 2013
Considerable Nanoarchaeota novelty and diversity were encountered in Yellowstone Lake, Yellowstone National Park (YNP), where sampling targeted lake floor hydrothermal vent fluids, streamers and sediments associated with these vents, and planktonic photic zones in three different regions of the lake. Significant homonucleotide repeats (HR) were observed in pyrosequence reads and in near full-length Sanger sequences, averaging 112 HR per 1349 bp clone; these could confound diversity estimates derived from pyrosequencing, resulting in false nucleotide insertions or deletions (indels). However, Sanger sequencing of two different sets of PCR clones (110 bp, 1349 bp) demonstrated that at least some of these indels are real. The majority of the Nanoarchaeota PCR amplicons were vent associated; however, curiously, one relatively small Nanoarchaeota OTU (71 pyrosequencing reads) was only found in photic zone water samples obtained from a region of the lake furthest removed from the hydrothermal regions of the lake. Extensive pyrosequencing failed to demonstrate the presence of an Ignicoccus lineage in this lake, suggesting the Nanoarchaeota in this environment are associated with novel Archaea hosts. Defined phylogroups based on near full-length PCR clones document the significant Nanoarchaeota 16S rRNA gene diversity in this lake and firmly establish a terrestrial clade distinct from the marine Nanoarchaeota as well as from other geographical locations.
Frontiers in Microbiology, 2012
One difficulty in using bioremediation at a contaminated site is demonstrating that biodegradation is actually occurring in situ. The stable isotope composition of contaminants may help with this, since it can serve as an indicator of biological activity. To use this approach it is necessary to establish how a particular biodegradation pathway affects the isotopic composition of a contaminant. This study examined bacterial strains expressing three aerobic enzymes for their effect on the ¹³C/¹²C ratio when degrading both trichloroethene (TCE) and cis-1,2-dichloroethene (c-DCE): toluene 3-monooxygenase, toluene 4-monooxygenase, and toluene 2,3-dioxygenase. We found no significant differences in fractionation among the three enzymes for either compound. Aerobic degradation of c-DCE occurred with low fractionation producing δ¹³C enrichment factors of −0.9 ± 0.5 to −1.2 ± 0.5, in contrast to reported anaerobic degradation δ¹³C enrichment factors of −14.1 to −20.4‰. Aerobic degradation of TCE resulted in δ¹³C enrichment factors of −11.6 ± 4.1 to −14.7 ± 3.0‰ which overlap reported δ¹³C enrichment factors for anaerobic TCE degradation of −2.5 to −13.8‰. The data from this study suggest that stable isotopes could serve as a diagnostic for detecting aerobic biodegradation of TCE by toluene oxygenases at contaminated sites.
Frontiers in microbiology, 2014
As the vast majority of microorganisms have yet to be cultivated in a laboratory setting, access to their genetic makeup has largely been limited to cultivation-independent methods. These methods, namely metagenomics and more recently single-cell genomics, have become cornerstones for microbial ecology and environmental microbiology. One ultimate goal is the recovery of genome sequences from each cell within an environment to move toward a better understanding of community metabolic potential and to provide substrate for experimental work. As single-cell sequencing has the ability to decipher all sequence information contained in an individual cell, this method holds great promise in tackling this challenge. Methodological limitations and inherent biases, however, do exist, which will be discussed here based on environmental and benchmark data, to assess how far we are from reaching this goal.
The ISME Journal, 2011
The Yellowstone geothermal complex has yielded foundational discoveries that have significantly enhanced our understanding of the Archaea. This study continues on this theme, examining Yellowstone Lake and its lake floor hydrothermal vents. Significant Archaea novelty and diversity were found associated with two near-surface photic zone environments and two vents that varied in their depth, temperature and geochemical profile. Phylogenetic diversity was assessed using 454-FLX sequencing (~51,000 pyrosequencing reads; V1 and V2 regions) and Sanger sequencing of 200 near-full-length polymerase chain reaction (PCR) clones. Automated classifiers (Ribosomal Database Project (RDP) and Greengenes) were problematic for the 454-FLX reads (wrong domain or phylum), although BLAST analysis of the 454-FLX reads against the phylogenetically placed full-length Sanger sequenced PCR clones proved reliable. Most of the archaeal diversity was associated with vents, and as expected there were differences between the vents and the near-surface photic zone samples. Thaumarchaeota dominated all samples: vent-associated organisms corresponded to the largely uncharacterized Marine Group I, and in surface waters, ~69-84% of the 454-FLX reads matched archaeal clones representing organisms that are Nitrosopumilus maritimus-like (96-97% identity). Importance of the lake nitrogen cycling was also suggested by 45% of the alkaline vent phylotypes being closely related to the nitrifier Candidatus Nitrosocaldus yellowstonii. The Euryarchaeota were primarily related to the uncharacterized environmental clones that make up the Deep Sea Euryarchaeal Group or Deep Sea Hydrothermal Vent Group-6. The phylogenetic parallels of Yellowstone Lake archaea to marine microorganisms provide opportunities to examine interesting evolutionary tracks between freshwater and marine lineages.
PLoS ONE, 2011
Single cell genomics is a powerful and increasingly popular tool for studying the genetic make-up of uncultured microbes. A key challenge for successful single cell sequencing and analysis is the removal of exogenous DNA from whole genome amplification reagents. We found that UV irradiation of the multiple displacement amplification (MDA) reagents, including the Phi29 polymerase and random hexamer primers, effectively eliminates the amplification of contaminating DNA. The methodology is quick, simple, and highly effective, thus significantly improving whole genome amplification from single cells.
Journal of Microbiological Methods, 2003
3-Hydroxyphenylacetylene (3-HPA) served as a novel, activity-dependent, fluorogenic and chromogenic probe for bacterial enzymes known to degrade toluene via meta ring fission of the intermediate, 3-methylcatechol. By this direct physiological analysis, cells grown with an aromatic substrate to induce the synthesis of toluene-degrading enzymes were fluorescently labeled.
Journal of Microbiological Methods, 2001
In whole-cell studies, two alkynes, 1-pentyne and phenylacetylene, were selective, irreversible inhibitors of monooxygenase enzymes in catabolic pathways that permit growth of bacteria on toluene. 1-Pentyne selectively inhibited growth of Burkholderia cepacia G4 (toluene 2-monooxygenase [T2MO] pathway) and B. pickettii PKO1 (toluene 3-monooxygenase...
Journal of microbiological methods, 2005
3-Ethynylbenzoate (3EB) functions as a novel, activity-dependent, fluorogenic, and chromogenic probe for bacterial strains expressing the TOL pathway, which degrade toluene via conversion to benzoate, followed by meta ring fission of the intermediate catechol. This direct physiological analysis allows the fluorescent labeling of cells whose toluene-degrading enzymes have been induced by an aromatic substrate.
The ISME Journal
Ammonia-oxidizing archaea (AOA) of the phylum Thaumarchaeota are widespread in marine and terrest... more Ammonia-oxidizing archaea (AOA) of the phylum Thaumarchaeota are widespread in marine and terrestrial habitats, playing a major role in the global nitrogen cycle. However, their evolutionary history remains unexplored, which limits our understanding of their adaptation mechanisms. Here, our comprehensive phylogenomic tree of Thaumarchaeota supports three sequential events: origin of AOA from terrestrial non-AOA ancestors, colonization of the shallow ocean, and expansion to the deep ocean. Careful molecular dating suggests that these events coincided with the Great Oxygenation Event around 2300 million years ago (Mya), and oxygenation of the shallow and deep ocean around 800 and 635-560 Mya, respectively. The first transition was likely enabled by the gain of an aerobic pathway for energy production by ammonia oxidation and biosynthetic pathways for cobalamin and biotin that act as cofactors in aerobic metabolism. The first transition was also accompanied by the loss of dissimilatory nitrate and sulfate reduction, loss of oxygen-sensitive pyruvate oxidoreductase, which reduces pyruvate to acetyl-CoA, and loss of the Wood-Ljungdahl pathway for anaerobic carbon fixation. The second transition involved gain of a K + transporter and of the biosynthetic pathway for ectoine, which may function as an osmoprotectant. The third transition was accompanied by the loss of the uvr system for repairing ultraviolet light-induced DNA lesions. We conclude that oxygen availability drove the terrestrial origin of AOA and their expansion to the photic and dark oceans, and that the stressors encountered during these events were partially overcome by gene acquisitions from Euryarchaeota and Bacteria, among other sources.
Journal of Chemical Health and Safety
Nature genetics, 2018
Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evol... more Plants intimately associate with diverse bacteria. Plant-associated bacteria have ostensibly evolved genes that enable them to adapt to plant environments. However, the identities of such genes are mostly unknown, and their functions are poorly characterized. We sequenced 484 genomes of bacterial isolates from roots of Brassicaceae, poplar, and maize. We then compared 3,837 bacterial genomes to identify thousands of plant-associated gene clusters. Genomes of plant-associated bacteria encode more carbohydrate metabolism functions and fewer mobile elements than related non-plant-associated genomes do. We experimentally validated candidates from two sets of plant-associated genes: one involved in plant colonization, and the other serving in microbe-microbe competition between plant-associated bacteria. We also identified 64 plant-associated protein domains that potentially mimic plant domains; some are shared with plant-associated fungi and oomycetes. This work expands the genome-based...
Single cell genomics is a powerful and increasingly popular tool for studying the genetic make-up... more Single cell genomics is a powerful and increasingly popular tool for studying the genetic make-up of uncultured microbes. A key challenge for successful single cell sequencing and analysis is the removal of exogenous DNA from whole genome amplification reagents. We found that UV irradiation of the multiple displacement amplification (MDA) reagents, including the Phi29 polymerase and random hexamer primers, effectively eliminates the amplification of contaminating DNA. The methodology is quick, simple, and highly effective, thus significantly improving whole genome amplification from single cells.
Frontiers in microbiology, 2016
Yellowstone Lake, the largest subalpine lake in the United States, harbors great novelty and dive... more Yellowstone Lake, the largest subalpine lake in the United States, harbors great novelty and diversity of Bacteria and Archaea. Size-fractionated water samples (0.1-0.8, 0.8-3.0, and 3.0-20 μm) were collected from surface photic zone, deep mixing zone, and vent fluids at different locations in the lake by using a remotely operated vehicle (ROV). Quantification with real-time PCR indicated that Bacteria dominated free-living microorganisms with Bacteria/Archaea ratios ranging from 4037:1 (surface water) to 25:1 (vent water). Microbial population structures (both Bacteria and Archaea) were assessed using 454-FLX sequencing with a total of 662,302 pyrosequencing reads for V1 and V2 regions of 16S rRNA genes. Non-metric multidimensional scaling (NMDS) analyses indicated that strong spatial distribution patterns existed from surface to deep vents for free-living Archaea and Bacteria in the lake. Along with pH, major vent-associated geochemical constituents including CH4, CO2, H2, DIC (di...
New Phytologist, 2015
Desert plants are hypothesized to survive the environmental stress inherent to these regions in p... more Desert plants are hypothesized to survive the environmental stress inherent to these regions in part thanks to symbioses with microorganisms, and yet these microbial species, the communities they form, and the forces that influence them are poorly understood. Here we report the first comprehensive investigation of the microbial communities associated with species of Agave, which are native to semiarid and arid regions of Central and North America and are emerging as biofuel feedstocks. We examined prokaryotic and fungal communities in the rhizosphere, phyllosphere, leaf and root endosphere, as well as proximal and distal soil samples from cultivated and native agaves, through Illumina amplicon sequencing. Phylogenetic profiling revealed that the composition of prokaryotic communities was primarily determined by the plant compartment, whereas the composition of fungal communities was mainly influenced by the biogeography of the host species. Cultivated A. tequilana exhibited lower levels of prokaryotic diversity compared with native agaves, although no differences in microbial diversity were found in the endosphere. Agaves shared core prokaryotic and fungal taxa known to promote plant growth and confer tolerance to abiotic stress, which suggests common principles underpinning Agavemicrobe interactions.
The ISME journal, Jan 9, 2015
Single amplified genomes and genomes assembled from metagenomes have enabled the exploration of u... more Single amplified genomes and genomes assembled from metagenomes have enabled the exploration of uncultured microorganisms at an unprecedented scale. However, both these types of products are plagued by contamination. Since these genomes are now being generated in a high-throughput manner and sequences from them are propagating into public databases to drive novel scientific discoveries, rigorous quality controls and decontamination protocols are urgently needed. Here, we present ProDeGe (Protocol for fully automated Decontamination of Genomes), the first computational protocol for fully automated decontamination of draft genomes. ProDeGe classifies sequences into two classes-clean and contaminant-using a combination of homology and feature-based methodologies. On average, 84% of sequence from the non-target organism is removed from the data set (specificity) and 84% of the sequence from the target organism is retained (sensitivity). The procedure operates successfully at a rate of ~...
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013
Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomi... more Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing ''microbial dark matter'' that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E + V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license.
Journal of Computational Biology, 2013
Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomi... more Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing ''microbial dark matter'' that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E + V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license.
Frontiers in Microbiology, 2015
Yellowstone Lake (Yellowstone National Park, WY, USA) is a large high-altitude (2200 m), fresh-water lake, which straddles an extensive caldera and is the center of significant geothermal activity. The primary goal of this interdisciplinary study was to evaluate the microbial populations inhabiting thermal vent communities in Yellowstone Lake using 16S rRNA gene and random metagenome sequencing, and to determine how geochemical attributes of vent waters influence the distribution of specific microorganisms and their metabolic potential. Thermal vent waters and associated microbial biomass were sampled during two field seasons (2007–2008) using a remotely operated vehicle (ROV). Sublacustrine thermal vent waters (circa 50–90°C) contained elevated concentrations of numerous constituents associated with geothermal activity including dissolved hydrogen, sulfide, methane and carbon dioxide. Microorganisms associated with sulfur-rich filamentous “streamer” communities of Inflated Plain an...
Single cell genomics, the amplification and sequencing of genomes from single cells, can provide a glimpse into the genetic make-up and thus life style of the vast majority of uncultured microbial cells, making it an immensely powerful and increasingly popular tool. This is accomplished by use of multiple displacement amplification (MDA), which can generate billions of copies of a single
Yellowstone Lake (Yellowstone National Park, WY, USA) is a large, high-altitude, fresh-water lake that straddles the most recent Yellowstone caldera, and is situated on top of significant hydrothermal activity. An interdisciplinary study is underway to evaluate the geochemical and geomicrobiological characteristics of several hydrothermal vent environments sampled using a remotely operated vehicle, and to determine the degree to which these
Frontiers in Microbiology, 2013
Considerable Nanoarchaeota novelty and diversity were encountered in Yellowstone Lake, Yellowstone National Park (YNP), where sampling targeted lake floor hydrothermal vent fluids, streamers and sediments associated with these vents, and planktonic photic zones in three different regions of the lake. Significant homonucleotide repeats (HR) were observed in pyrosequence reads and in near full-length Sanger sequences, averaging 112 HR per 1349 bp clone, and could confound diversity estimates derived from pyrosequencing, resulting in false nucleotide insertions or deletions (indels). However, Sanger sequencing of two different sets of PCR clones (110 bp, 1349 bp) demonstrated that at least some of these indels are real. The majority of the Nanoarchaeota PCR amplicons were vent associated; however, curiously, one relatively small Nanoarchaeota OTU (71 pyrosequencing reads) was only found in photic zone water samples obtained from a region of the lake furthest removed from the hydrothermal regions of the lake. Extensive pyrosequencing failed to demonstrate the presence of an Ignicoccus lineage in this lake, suggesting the Nanoarchaeota in this environment are associated with novel Archaea hosts. Defined phylogroups based on near full-length PCR clones document the significant Nanoarchaeota 16S rRNA gene diversity in this lake and firmly establish a terrestrial clade distinct from the marine Nanoarchaeota as well as from other geographical locations.
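To make the homonucleotide repeat (HR) count in the abstract above concrete, here is a minimal sketch of how such runs could be tallied in a sequence. The abstract does not say what minimum run length the authors counted as an HR, so the `min_len` parameter, the function name `count_homonucleotide_repeats`, and the example sequence are all assumptions for illustration.

```python
from itertools import groupby


def count_homonucleotide_repeats(seq: str, min_len: int = 2) -> int:
    """Count maximal runs of a single repeated base with length >= min_len."""
    # groupby yields one group per maximal run of identical characters.
    return sum(
        1 for _base, run in groupby(seq.upper()) if len(list(run)) >= min_len
    )


# Invented example sequence with runs GGG, TT, AA and CCCC.
example = "ACGGGTTAACCCCGT"
runs_of_two_or_more = count_homonucleotide_repeats(example)
runs_of_three_or_more = count_homonucleotide_repeats(example, min_len=3)
```

Longer runs are the ones pyrosequencing tends to miscall, so raising `min_len` narrows the count to the repeats most likely to produce false indels.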
Frontiers in Microbiology, 2012
One difficulty in using bioremediation at a contaminated site is demonstrating that biodegradation is actually occurring in situ. The stable isotope composition of contaminants may help with this, since it can serve as an indicator of biological activity. To use this approach it is necessary to establish how a particular biodegradation pathway affects the isotopic composition of a contaminant. This study examined bacterial strains expressing three aerobic enzymes for their effect on the ¹³C/¹²C ratio when degrading both trichloroethene (TCE) and cis-1,2-dichloroethene (c-DCE): toluene 3-monooxygenase, toluene 4-monooxygenase, and toluene 2,3-dioxygenase. We found no significant differences in fractionation among the three enzymes for either compound. Aerobic degradation of c-DCE occurred with low fractionation producing δ¹³C enrichment factors of −0.9 ± 0.5 to −1.2 ± 0.5, in contrast to reported anaerobic degradation δ¹³C enrichment factors of −14.1 to −20.4‰. Aerobic degradation of TCE resulted in δ¹³C enrichment factors of −11.6 ± 4.1 to −14.7 ± 3.0‰ which overlap reported δ¹³C enrichment factors for anaerobic TCE degradation of −2.5 to −13.8‰. The data from this study suggest that stable isotopes could serve as a diagnostic for detecting aerobic biodegradation of TCE by toluene oxygenases at contaminated sites.
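For readers unfamiliar with enrichment factors, the relation below is the standard Rayleigh model commonly used to link an enrichment factor to the shift in isotope ratio as a contaminant is degraded. The abstract does not state which model the authors fitted, so treating these values as Rayleigh enrichment factors is an assumption, and the symbols ($R$, $f$, $\varepsilon$) are introduced here only for illustration.

```latex
% R_t, R_0 : 13C/12C ratio of the remaining contaminant at time t and at the start
% f        : fraction of the contaminant remaining (0 < f <= 1)
% \varepsilon : enrichment factor in per mil
\ln\frac{R_t}{R_0} = \frac{\varepsilon}{1000}\,\ln f,
\qquad
\frac{R_t}{R_0} = \frac{\delta^{13}\mathrm{C}_t + 1000}{\delta^{13}\mathrm{C}_0 + 1000}

% For small shifts this reduces to the approximation
\delta^{13}\mathrm{C}_t \approx \delta^{13}\mathrm{C}_0 + \varepsilon \ln f
```

Under this approximation, with 10% of the contaminant remaining ($\ln f \approx -2.30$), an enrichment factor of −11.6‰ would shift the residual TCE by roughly +26.7‰, whereas −0.9‰ would shift residual c-DCE by only about +2.1‰. A shift that small would be much harder to detect against field variability.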
Frontiers in Microbiology, 2014
As the vast majority of microorganisms have yet to be cultivated in a laboratory setting, access to their genetic makeup has largely been limited to cultivation-independent methods. These methods, namely metagenomics and more recently single-cell genomics, have become cornerstones for microbial ecology and environmental microbiology. One ultimate goal is the recovery of genome sequences from each cell within an environment to move toward a better understanding of community metabolic potential and to provide substrate for experimental work. As single-cell sequencing has the ability to decipher all sequence information contained in an individual cell, this method holds great promise in tackling this challenge. Methodological limitations and inherent biases do exist, however, and are discussed here based on environmental and benchmark data, to assess how far we are from reaching this goal.
The ISME Journal, 2011
The Yellowstone geothermal complex has yielded foundational discoveries that have significantly enhanced our understanding of the Archaea. This study continues on this theme, examining Yellowstone Lake and its lake floor hydrothermal vents. Significant Archaea novelty and diversity were found associated with two near-surface photic zone environments and two vents that varied in their depth, temperature and geochemical profile. Phylogenetic diversity was assessed using 454-FLX sequencing (~51,000 pyrosequencing reads; V1 and V2 regions) and Sanger sequencing of 200 near-full-length polymerase chain reaction (PCR) clones. Automated classifiers (Ribosomal Database Project (RDP) and Greengenes) were problematic for the 454-FLX reads (wrong domain or phylum), although BLAST analysis of the 454-FLX reads against the phylogenetically placed full-length Sanger sequenced PCR clones proved reliable. Most of the archaeal diversity was associated with vents, and as expected there were differences between the vents and the near-surface photic zone samples. Thaumarchaeota dominated all samples: vent-associated organisms corresponded to the largely uncharacterized Marine Group I, and in surface waters, ~69-84% of the 454-FLX reads matched archaeal clones representing organisms that are Nitrosopumilus maritimus-like (96-97% identity). Importance of the lake nitrogen cycling was also suggested by 45% of the alkaline vent phylotypes being closely related to the nitrifier Candidatus Nitrosocaldus yellowstonii. The Euryarchaeota were primarily related to the uncharacterized environmental clones that make up the Deep Sea Euryarchaeal Group or Deep Sea Hydrothermal Vent Group-6. The phylogenetic parallels of Yellowstone Lake archaea to marine microorganisms provide opportunities to examine interesting evolutionary tracks between freshwater and marine lineages.
PLoS ONE, 2011
Single cell genomics is a powerful and increasingly popular tool for studying the genetic make-up of uncultured microbes. A key challenge for successful single cell sequencing and analysis is the removal of exogenous DNA from whole genome amplification reagents. We found that UV irradiation of the multiple displacement amplification (MDA) reagents, including the Phi29 polymerase and random hexamer primers, effectively eliminates the amplification of contaminating DNA. The methodology is quick, simple, and highly effective, thus significantly improving whole genome amplification from single cells.
Journal of Microbiological Methods, 2003
3-Hydroxyphenylacetylene (3-HPA) served as a novel, activity-dependent, fluorogenic and chromogenic probe for bacterial enzymes known to degrade toluene via meta ring fission of the intermediate, 3-methylcatechol. By this direct physiological analysis, cells grown with an aromatic substrate to induce the synthesis of toluene-degrading enzymes were fluorescently labeled. D
Journal of Microbiological Methods, 2001
In whole-cell studies, two alkynes, 1-pentyne and phenylacetylene, were selective, irreversible inhibitors of monooxygenase enzymes in catabolic pathways that permit growth of bacteria on toluene. 1-Pentyne selectively inhibited growth of Burkholderia cepacia G4 (toluene 2-monooxygenase [T2MO] pathway) and B. pickettii PKO1 (toluene 3-monooxygenase [
Journal of Microbiological Methods, 2005
3-Ethynylbenzoate (3EB) functions as a novel, activity-dependent, fluorogenic, and chromogenic probe for bacterial strains expressing the TOL pathway, which degrade toluene via conversion to benzoate, followed by meta ring fission of the intermediate catechol. This direct physiological analysis allows the fluorescent labeling of cells whose toluene-degrading enzymes have been induced by an aromatic substrate.