Clinical metagenomics - PubMed
Review
Clinical metagenomics
Charles Y Chiu et al. Nat Rev Genet. 2019 Jun.
Abstract
Clinical metagenomic next-generation sequencing (mNGS), the comprehensive analysis of microbial and host genetic material (DNA and RNA) in samples from patients, is rapidly moving from research to clinical laboratories. This emerging approach is changing how physicians diagnose and treat infectious disease, with applications spanning a wide range of areas, including antimicrobial resistance, the microbiome, human host gene expression (transcriptomics) and oncology. Here, we focus on the challenges of implementing mNGS in the clinical laboratory and address potential solutions for maximizing its impact on patient care and public health.
Conflict of interest statement
C.Y.C. is the director of the UCSF–Abbott Viral Diagnostics and Discovery Center (VDDC) and receives research support from Abbott Laboratories. C.Y.C. and S.A.M. are inventors on a patent application on algorithms related to SURPI+ software titled ‘Pathogen Detection using Next-Generation Sequencing’ (PCT/US/16/52912).
Figures
Fig. 1. Clinical applications of metagenomic sequencing.
A | Applications in infectious disease diagnostics include direct identification of microorganisms from primary clinical samples (part Aa); antimicrobial resistance prediction by characterization of resistance genes (part Ab); detection of species-level or strain-level virulence determinants, such as secretion of specific endotoxins or exotoxins (part Ac); and antiviral resistance prediction (part Ad). As shown for HIV-1, recovery of the complete viral genome from a patient sample by metagenomic next-generation sequencing (mNGS) (part Ad, graph) facilitates sequence analysis to predict susceptibility or resistance to antiretroviral drugs (part Ad, bar plot); the susceptibility profile for the analysed strain (black bars) predicts resistance to the non-nucleoside reverse transcriptase inhibitor (NNRTI) class of drugs (denoted by an asterisk), as opposed to nucleoside reverse transcriptase inhibitors (NRTIs) or protease inhibitors (PIs). B | Microbiome analyses can inform disease prognosis in acute and chronic disease states and underlie the development of probiotic therapies. Coloured bars represent individual microbiota species. A reduction in species diversity is seen in dysbiosis (an unhealthy state), such as present in patients with _Clostridium difficile_-associated disease. Stool from healthy individuals can be harvested to treat patients with C. difficile infection by faecal stool transplantation or as orally administered encapsulated faecal pills. Alternatively, synthetic stool generated from microbiota species observed in healthy individuals can be used as probiotics to treat patients. In addition to C. difficile infection, chronic diseases such as obesity, inflammatory bowel disease and diabetes mellitus are potential targets for probiotic therapy. C | RNA-sequencing-based transcriptomics can improve the diagnosis of infectious and non-infectious conditions on the basis of the human host response. 
Host transcriptomic profiling by NGS can enable the construction of a classifier metric to discriminate patients with infection (red bars) from uninfected patients (blue bars) with high accuracy (part Ca). Metric scores above the dotted line indicate infection, whereas scores below the dotted line indicate absence of infection; the overall accuracy of the classifier metric shown is 83%. Cluster heat map analysis identifies individual, differentially expressed host genes associated with infection (genes A–F) versus those associated with no infection (genes G–L) (part Cb). D | Sequencing of viral tumours or liquid biopsy analyses in oncology can be used for simultaneous pathogen detection and characterization of host genetic mutations. mNGS can be used to detect Merkel cell polyomavirus, the virus associated with the development of Merkel cell carcinoma. Simultaneous sequencing of host DNA can identify mutations that arise from integration of the viral genome containing the full-length large T antigen (LT) followed by subsequent truncation of the LT antigen (part Da) or truncation of the LT antigen before viral genome integration (part Db). Both mutations lead to cellular transformation that drives tumour proliferation. Although promising, many of these sequencing-based applications have yet to be incorporated into routine clinical practice. Part C is adapted from ref., CC BY-NC-ND 4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/). Part D is adapted from ref., CC BY 3.0 (https://creativecommons.org/licenses/by/3.0/).
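The classifier metric described for part Ca can be illustrated with a minimal sketch: each patient receives a score, scores above a cut-off (the dotted line) are called as infection, and accuracy is the fraction of calls that agree with the true status. The scores, labels and the 0.5 threshold below are invented for illustration and are not taken from the figure or the underlying study; the function names are likewise hypothetical.

```python
from typing import List


def call_infection(scores: List[float], threshold: float) -> List[bool]:
    """Call a patient 'infected' when the metric score lies above the threshold."""
    return [score > threshold for score in scores]


def accuracy(predicted: List[bool], truth: List[bool]) -> float:
    """Fraction of patients whose call matches the known infection status."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("predicted and truth must be non-empty and equally long")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


# Invented example data: eight patients, four truly infected, four not.
toy_scores = [0.91, 0.74, 0.62, 0.38, 0.55, 0.30, 0.22, 0.10]
toy_truth = [True, True, True, True, False, False, False, False]
toy_threshold = 0.5  # assumed cut-off, standing in for the dotted line

toy_calls = call_infection(toy_scores, toy_threshold)
toy_accuracy = accuracy(toy_calls, toy_truth)
print(f"accuracy on toy data: {toy_accuracy:.2f}")
```

In this toy data set one infected patient scores below the cut-off and one uninfected patient scores above it, so six of eight calls are correct. A real classifier would be built from host gene expression values and validated on independent samples.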
Fig. 2. Targeted versus untargeted shotgun metagenomic next-generation sequencing approaches.
A variety of patient samples, as well as cultured microbial colonies, can be analysed using targeted or untargeted metagenomic next-generation sequencing (mNGS) methods for pathogen identification, microbiome analyses and/or host transcriptome profiling. Universal PCR (left) is a targeted mNGS approach that uses primers designed from conserved regions such as the ribosomal RNA (rRNA) genes that are universally conserved among bacteria (16S or 23S rRNA) or fungi and parasites (18S rRNA, 28S rRNA or internal transcribed spacer (ITS)). Other sets of primers can be designed to target a defined set of pathogens and/or genes and used for multiplex reverse transcription PCR or PCR (multiplexed amplicon PCR). NGS library preparation and sequencing of the resultant amplicons enable pathogen identification down to the genus or species level. Metagenomic sequencing (right) entails unbiased shotgun sequencing of all microbial and host nucleic acids present in a clinical sample. Separate DNA and RNA libraries are constructed; the DNA library is used for identification of bacteria, fungi, DNA viruses and parasites, whereas the RNA library is used for identification of RNA viruses and RNA sequencing-based human host transcriptome profiling (heat map, bottom right). As no primers or probes are used in unbiased mNGS, the vast majority of reads corresponds to the human host and, thus, detection of pathogens from metagenomic libraries is a ‘needle-in-a-haystack’ endeavour. An optional capture probe enrichment step using magnetic beads enables targeted mNGS of pathogens and/or genes from metagenomic libraries. All these methods are compatible with sequencing on traditional benchtop instruments such as the Illumina HiSeq and portable nanopore sequencers such as the Oxford Nanopore Technologies MinION.
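The idea behind universal PCR, placing primers in regions that are conserved across organisms so that the variable sequence between them can distinguish taxa, can be sketched as a search for fully conserved windows in a multiple sequence alignment. The three aligned sequences below are invented and are not real rRNA gene sequences; the function name and the 10-base window width are assumptions made for illustration, and real primer design also weighs melting temperature, degeneracy and specificity.

```python
from typing import List, Tuple


def conserved_windows(aligned: List[str], width: int) -> List[Tuple[int, str]]:
    """Return (start, sequence) for every window in which all aligned
    sequences are identical and gap-free at every column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # A column is conserved when every sequence carries the same non-gap base.
    conserved = [
        len({seq[i] for seq in aligned}) == 1 and aligned[0][i] != "-"
        for i in range(length)
    ]
    windows = []
    for start in range(length - width + 1):
        if all(conserved[start:start + width]):
            windows.append((start, aligned[0][start:start + width]))
    return windows


# Invented toy alignment: conserved flanks around a 4-base variable region.
toy_alignment = [
    "ACGTACGGTT" + "AAGC" + "GGCTAACTCC",
    "ACGTACGGTT" + "CTGA" + "GGCTAACTCC",
    "ACGTACGGTT" + "GGAT" + "GGCTAACTCC",
]

candidate_sites = conserved_windows(toy_alignment, width=10)
for start, site in candidate_sites:
    print(start, site)
```

The two windows found flank the variable region, which is the part of the amplicon that would carry the genus-level or species-level signal after sequencing.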
Fig. 3. Challenges to routine deployment of metagenomic sequencing in the clinical setting.
At each step in the process, multiple factors (bullet points) must be taken into account when implementing a clinical metagenomic pipeline for diagnosis of infections to maximize accuracy and clinical relevance. In particular, it is often useful to interpret and discuss the results of metagenomic next-generation sequencing (mNGS) testing in a clinical context as part of a clinical microbial sequencing board, akin to a tumour board in oncology. EMR, electronic medical record.
Fig. 4. A typical metagenomic next-generation sequencing bioinformatics pipeline.
A next-generation sequencing (NGS) data set, generally in FASTQ or sequence alignment map (SAM) format, is analysed on a computational server, portable laptop or desktop computer or on the cloud. An initial preprocessing step consists of low-quality filtering, low-complexity filtering and adaptor trimming. Computational host subtraction is performed by mapping reads to the host (for example, human) genome and setting aside host reads for subsequent transcriptome (RNA) or genome (DNA) analysis. The remaining unmapped reads are directly aligned to large reference databases, such as the National Center for Biotechnology Information (NCBI) GenBank database or microbial reference sequence or genome collections, or are first assembled de novo into longer contiguous sequences (contigs) followed by alignment to reference databases. After taxonomic classification, in which individual reads or contigs are assigned into specific taxa (for example, species, genus and family), the data can be analysed and visualized in a number of different formats. These include coverage map and pairwise identity plots to determine how much of the microbial genome has been recovered and its similarity to reference genomes in the database; Krona plots to visualize taxonomic diversity in the metagenomic library; phylogenetic analysis to compare assembled genes, gene regions or genomes to reference sequences; and heat maps to show microorganisms that were detected in the clinical samples. OTU, operational taxonomic unit.
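The first two stages of the pipeline above (preprocessing, then host subtraction) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the method of any particular pipeline: the thresholds, the dinucleotide-entropy complexity measure, the adaptor string and the k-mer lookup that stands in for mapping reads to a host genome are all choices made here for brevity, and the reads and "host" reference are invented. Production pipelines use dedicated trimmers and aligners instead.

```python
import math
from collections import Counter

ADAPTOR = "AGATCGGAAGAGC"  # assumed adaptor sequence; depends on the library kit


def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in quality) / len(quality)


def trim_adaptor(seq, qual, adaptor=ADAPTOR):
    """Cut the read at the first exact adaptor match, if any."""
    idx = seq.find(adaptor)
    return (seq, qual) if idx == -1 else (seq[:idx], qual[:idx])


def dinucleotide_entropy(seq):
    """Shannon entropy (bits) of dinucleotide frequencies; low values flag repeats."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return 0.0
    counts = Counter(pairs)
    return -sum((n / len(pairs)) * math.log2(n / len(pairs)) for n in counts.values())


def preprocess(reads, min_q=20, min_len=20, min_entropy=2.0):
    """Adaptor trimming, then length, quality and complexity filters."""
    kept = []
    for name, seq, qual in reads:
        seq, qual = trim_adaptor(seq, qual)
        if len(seq) < min_len or mean_phred(qual) < min_q:
            continue
        if dinucleotide_entropy(seq) < min_entropy:
            continue
        kept.append((name, seq, qual))
    return kept


def subtract_host(reads, host_kmers, k=12, max_host_fraction=0.5):
    """Toy stand-in for host mapping: a read is 'host' when most k-mers hit the host set."""
    host, non_host = [], []
    for read in reads:
        kmers = [read[1][i:i + k] for i in range(len(read[1]) - k + 1)]
        hits = sum(kmer in host_kmers for kmer in kmers)
        is_host = bool(kmers) and hits / len(kmers) > max_host_fraction
        (host if is_host else non_host).append(read)
    return host, non_host


# Invented demo data: a 40-base "host genome" and five reads.
host_ref = "GGATCCTTAGCGTACGATTGCAAGGCTTACCGATAGGCTA"
host_kmers = {host_ref[i:i + 12] for i in range(len(host_ref) - 11)}
good = "ACGTTGCAAGCTTAGGCATCGATCCGTAAC"
with_adaptor = "ACGTTGCAAGCTTAGGCATCGATCC" + ADAPTOR + "TTTT"
raw_reads = [
    ("good", good, "I" * len(good)),
    ("low_quality", good, "#" * len(good)),
    ("low_complexity", "A" * 30, "I" * 30),
    ("adaptor", with_adaptor, "I" * len(with_adaptor)),
    ("host", host_ref[5:35], "I" * 30),
]

clean_reads = preprocess(raw_reads)
host_reads, non_host_reads = subtract_host(clean_reads, host_kmers)
print([r[0] for r in host_reads], [r[0] for r in non_host_reads])
```

Host reads are returned rather than discarded, mirroring the caption's point that they are set aside for later transcriptome or genome analysis, while the non-host reads would go on to alignment or de novo assembly.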
Comment in
- A genomic approach to microbiology.
[No authors listed] Nat Rev Genet. 2019 Jun;20(6):311. doi: 10.1038/s41576-019-0131-5. PMID: 31101903. No abstract available.