Hecatomb: An End-to-End Research Platform for Viral Metagenomics
Related papers
Expanding the Marine Virosphere Using Metagenomics
PLoS Genetics, 2013
Viruses infecting prokaryotic cells (phages) are the most abundant entities of the biosphere and contain a largely uncharted wealth of genomic diversity. They play a critical role in the biology of their hosts and in ecosystem functioning at large. The classical approaches studying phages require isolation from a pure culture of the host. Direct sequencing approaches have been hampered by the small amounts of phage DNA present in most natural habitats and the difficulty in applying metaomic approaches, such as annotation of small reads and assembly. Serendipitously, it has been discovered that cellular metagenomes of highly productive ocean waters (the deep chlorophyll maximum) contain significant amounts of viral DNA derived from cells undergoing the lytic cycle. We have taken advantage of this phenomenon to retrieve metagenomic fosmids containing viral DNA from a Mediterranean deep chlorophyll maximum sample. This method allowed description of complete genomes of 208 new marine phages. The diversity of these genomes was remarkable, contributing 21 genomic groups of tailed bacteriophages of which 10 are completely new. Sequence based methods have allowed host assignment to many of them. These predicted hosts represent a wide variety of important marine prokaryotic microbes like members of SAR11 and SAR116 clades, Cyanobacteria and also the newly described low GC Actinobacteria. A metavirome constructed from the same habitat showed that many of the new phage genomes were abundantly represented. Furthermore, other available metaviromes also indicated that some of the new phages are globally distributed in low to medium latitude ocean waters. The availability of many genomes from the same sample allows a direct approach to viral population genomics confirming the remarkable mosaicism of phage genomes.
Viral Metagenomic Content Reflects Seawater Ecological Quality in the Coastal Zone
Viruses
Viruses interfere with their host’s metabolism through the expression of auxiliary metabolic genes (AMGs) that have, until now, mostly been studied under large physicochemical gradients. Here, we focus on coastal marine ecosystems and we sequence the viral metagenome (virome) of samples with discrete levels of human-driven disturbances. We aim to describe the relevance of viromics with respect to ecological quality status, defined by the classic seawater trophic index (TRIX). Neither viral (family level) nor bacterial (family level, based on 16S rRNA sequencing) community structure correlated with TRIX. AMGs involved in the Calvin and tricarboxylic acid cycles were found at stations with poor ecological quality, supporting viral lysis by modifying the host’s energy supply. AMGs involved in “non-traditional” energy-production pathways (3HP, sulfur oxidation) were found irrespective of ecological quality, highlighting the importance of recognizing the prevalent metabolic paths and their int...
Comparative metagenomics of microbial traits within oceanic viral communities
The ISME Journal, 2011
Viral genomes often contain genes recently acquired from microbes. In some cases (for example, psbA) the proteins encoded by these genes have been shown to be important for viral replication. In this study, using a unique search strategy on the Global Ocean Survey (GOS) metagenomes in combination with marine virome and microbiome pyrosequencing-based datasets, we characterize previously undetected microbial metabolic capabilities concealed within the genomes of uncultured marine viral communities. A total of 34 microbial gene families were detected on 452 viral GOS scaffolds. The majority of auxiliary metabolic genes found on these scaffolds have never been reported in phages. Host genes detected in viruses were mainly divided between genes encoding different energy metabolism pathways, such as electron transport and newly identified photosystem genes, and genes related to translation and post-translation mechanisms. Our findings suggest previously undetected ways in which marine phages adapt to their hosts and improve their fitness, including translation and post-translation level control over the host rather than the already known transcription level control.
ViromeScan: a new tool for metagenomic viral community profiling
BMC Genomics, 2016
Background: Bioinformatics tools available for metagenomic sequencing analysis are principally devoted to the identification of microorganisms populating an ecological niche, but they usually do not consider viruses. Only a few software tools have been designed to profile viral sequences, and they are not efficient at characterizing viruses in the context of complex communities, like the intestinal microbiota, containing bacteria, archaebacteria, eukaryotic microorganisms and viruses. In any case, a comprehensive description of the host-microbiota interactions cannot ignore the profile of eukaryotic viruses within the virome, as viruses are definitely critical for the regulation of the host immunophenotype. Results: ViromeScan is an innovative metagenomic analysis tool that characterizes the taxonomy of the virome directly from raw data of next-generation sequencing. The tool uses hierarchical databases for eukaryotic viruses to unambiguously assign reads to viral species more accurately and >1000 fold faster than other existing approaches. We validated ViromeScan on synthetic microbial communities and applied it on metagenomic samples of the Human Microbiome Project, providing a sensitive eukaryotic virome profiling of different human body sites. Conclusions: ViromeScan allows the user to explore and taxonomically characterize the virome from metagenomic reads, efficiently denoising samples from reads of other microorganisms. This implies that users can fully characterize the microbiome, including bacteria and viruses, by shotgun metagenomic sequencing followed by different bioinformatic pipelines.
Environmental Microbiology, 2015
Reef-building corals form close associations with organisms from all three domains of life and therefore have many potential viral hosts. Yet viral communities associated with corals remain barely explored. This complexity presents a number of challenges in terms of the metagenomic assessments of coral viral communities and requires specialized methods for purification and amplification of viral nucleic acids, as well as virome annotation. In this minireview, we conduct a meta-analysis of the limited number of existing coral virome studies, as well as available coral transcriptome and metagenome data, to identify trends and potential complications inherent in different methods. The analysis shows that the method used for viral nucleic acid isolation drastically affects the observed viral assemblage and interpretation of the results. Further, the small number of viral reference genomes available, coupled with short sequence read lengths, might cause errors in virus identification. Despite these limitations and potential biases, the data show that viral communities associated with corals are diverse, with double- and single-stranded DNA and RNA viruses. The identified viruses are dominated by double-stranded DNA-tailed bacteriophages, but there are also viruses that infect eukaryote hosts, likely the endosymbiotic dinoflagellates, Symbiodinium spp., host coral and other eukaryotes in close association.
Environmental Microbiology, 2013
Viruses have global impact through mortality, nutrient cycling and horizontal gene transfer, yet their study is limited by complex methodologies with little validation. Here, we use triplicate metagenomes to compare common aquatic viral concentration and purification methods across four combinations as follows: (i) tangential flow filtration (TFF) and DNase + CsCl, (ii) FeCl3 precipitation and DNase, (iii) FeCl3 precipitation and DNase + CsCl and (iv) FeCl3 precipitation and DNase + sucrose. Taxonomic data (30% of reads) suggested that purification methods were statistically indistinguishable at any taxonomic level while concentration methods were significantly different at family and genus levels. Specifically, TFF-concentrated viral metagenomes had significantly fewer abundant viral types (Podoviridae and Phycodnaviridae) and more variability among Myoviridae than FeCl3-precipitated viral metagenomes. More comprehensive analyses using protein clusters (66% of reads) and k-mers (100% of reads) showed 50-53% of these data were common to all four methods, and revealed trace bacterial DNA contamination in TFF-concentrated metagenomes and one of three replicates concentrated using FeCl3 and purified by DNase alone. Shared k-mer analyses also revealed that polymerases used in amplification impact the resulting metagenomes, with TaKaRa enriching for 'rare' reads relative to PfuTurbo. Together these results provide empirical data for making experimental design decisions in culture-independent viral ecology studies.
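The shared k-mer comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`kmers`, `sample_kmers`, `shared_fraction`), the choice of k, and the definition of "shared" as the fraction of distinct k-mers in the union that occur in every sample are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set

VALID_BASES = set("ACGT")


def kmers(seq: str, k: int) -> Set[str]:
    """Return the distinct k-mers of one read, skipping any with ambiguous bases."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= VALID_BASES:
            found.add(kmer)
    return found


def sample_kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Pool the distinct k-mers of all reads in one sample (hypothetical helper)."""
    pooled: Set[str] = set()
    for read in reads:
        pooled |= kmers(read, k)
    return pooled


def shared_fraction(samples: Dict[str, Iterable[str]], k: int) -> float:
    """Fraction of distinct k-mers (over the union) that occur in every sample.

    This definition is an assumption; abundance-weighted measures are also possible.
    """
    sets = [sample_kmers(reads, k) for reads in samples.values()]
    if not sets:
        return 0.0
    union = set().union(*sets)
    if not union:
        return 0.0
    shared = set.intersection(*sets)
    return len(shared) / len(union)


if __name__ == "__main__":
    # Toy reads standing in for two concentration methods.
    toy = {"method_a": ["ACGTAC"], "method_b": ["ACGTTT"]}
    print(shared_fraction(toy, k=3))
```

Real viral metagenomes hold far too many k-mers for in-memory Python sets, so a dedicated k-mer counter would normally replace the pooling step; the set logic stays the same.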
Challenges in the analysis of viral metagenomes
Genome sequencing technologies continue to develop with remarkable pace, yet analytical approaches for reconstructing and classifying viral genomes from mixed samples remain limited in their performance and usability. Existing solutions generally target expert users and often have unclear scope, making it challenging to critically evaluate their performance. There is a growing need for intuitive analytical tooling for researchers lacking specialist computing expertise and that is applicable in diverse experimental circumstances. Notable technical challenges have impeded progress; for example, fragments of viral genomes are typically orders of magnitude less abundant than those of host, bacteria, and/or other organisms in clinical and environmental metagenomes; observed viral genomes often deviate considerably from reference genomes demanding use of exhaustive alignment approaches; high intrapopulation viral diversity can lead to ambiguous sequence reconstruction; and finally, the relatively few documented viral reference genomes compared to the estimated number of distinct viral taxa renders classification problematic. Various software tools have been developed to accommodate the unique challenges and use cases associated with characterizing viral sequences; however, the quality of these tools varies, and their use often necessitates computing expertise or access to powerful computers, thus limiting their usefulness to many researchers. In this review, we consider the general and application-specific challenges posed by viral sequencing and analysis, outline the landscape of available tools and methodologies, and propose ways of overcoming the current barriers to effective analysis.
Scientific Reports, 2017
Viruses are the most abundant and, likely, one of the most diverse biological components in the oceans. By infecting their hosts, they play key roles in biogeochemical cycles and ecosystem functioning at a global scale. The ocean interior hosts most of the microbial life, and, although deep-sea sediments represent the main repository of this component and the largest biome on Earth, viral diversity in these ecosystems remains almost completely unknown. We compared a physical-chemical procedure and a previously published sediment washing-based procedure for isolating viruses from benthic deep-sea ecosystems to generate viromes through high-throughput sequencing. The procedure based on a physical-chemical dislodgment of viral particles from the sediments, followed by vacuum filtration, was much more efficient, allowing us to recover >85% of the extractable viruses. By using this procedure, a high fraction of viral DNA was recovered and new viromes from different benthic deep-sea sites...