Piphillin: Improved Prediction of Metagenomic Content by Direct Inference from Human Microbiomes

Shoko Iwai et al. PLoS One. 2016.

Abstract

Functional analysis of a clinical microbiome facilitates the elucidation of mechanisms by which microbiome perturbation can cause a phenotypic change in the patient. The direct approach for analyzing the functional capacity of the microbiome is shotgun metagenomics. An inexpensive way to estimate the functional capacity of a microbial community is to collect 16S rRNA gene profiles and then indirectly infer the abundance of functional genes. This inference approach has been implemented in the PICRUSt and Tax4Fun software tools. However, those tools have important limitations: they rely on outdated functional databases and uncertain phylogenetic trees, and they require very specific data pre-processing protocols. Here we introduce Piphillin, a straightforward algorithm that is independent of any proposed phylogenetic tree, leverages contemporary functional databases and is not tied to any single data pre-processing protocol. When all three inference tools were evaluated against actual shotgun metagenomics, Piphillin was superior to both PICRUSt and Tax4Fun in predicting gene composition in human clinical samples (p<0.01 and p<0.001, respectively), and Piphillin's ability to predict disease associations with specific gene orthologs exhibited a 15% increase in balanced accuracy compared to PICRUSt. For laboratory animal samples, no tool showed a performance advantage over the others, and for environmental samples all tools produced unsatisfactory predictions. Our results demonstrate that functional inference using the direct method implemented in Piphillin is preferable for clinical biospecimens. Piphillin is publicly available for academic use at http://secondgenome.com/Piphillin.

Conflict of interest statement

This work was supported in part by Second Genome Inc. and Allergan PLC. Neil Poloso is employed by Allergan PLC and holds stock options. Shoko Iwai, Thomas Weinmaier, Karim Dabbagh, and Todd DeSantis are employed by Second Genome Inc. and hold stock options. Both Allergan PLC and Second Genome Inc. are independent therapeutics companies with products in development to treat gastrointestinal disorders and other human diseases. A publication announcing the availability of Piphillin analysis for academic use will not affect the value of our therapeutic products. There are no Piphillin patents, products in development or marketed products to declare. Second Genome Inc. provides a commercial microbiome profiling service using software with demonstrable accuracy such as Piphillin. This does not alter our adherence to all the PLOS ONE policies on sharing data and materials, as detailed online in the guide for authors.

Figures

Fig 1. Piphillin algorithm.

The representative sequence of each OTU in the sample is first searched against 16S rRNA sequences in the genome database to obtain inferred genome(s). Then the OTU abundance table is converted to a genome abundance table. The resulting table is normalized by the 16S rRNA copy number of each genome and a metagenome is inferred using the gene contents (copy number of each gene) of each genome in the database.
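The steps in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the published implementation. It assumes the sequence search has already been run and that each OTU has a single best-hit genome with a known identity. The function name, the input dictionaries and the default cutoff of 0.97 are all hypothetical.

```python
from collections import defaultdict


def infer_metagenome(otu_counts, otu_best_hit, copy_16s, gene_content,
                     identity_cutoff=0.97):
    """Infer gene abundances for one sample from OTU counts.

    otu_counts:   {otu_id: read count}
    otu_best_hit: {otu_id: (genome_id, identity)} from a prior sequence search
    copy_16s:     {genome_id: number of 16S rRNA gene copies}
    gene_content: {genome_id: {gene_id: copy number}}
    identity_cutoff is a hypothetical default, not a value from the paper.
    """
    # Step 1: convert the OTU abundance table to a genome abundance table,
    # dropping OTUs with no hit or a hit below the identity cutoff.
    genome_abundance = defaultdict(float)
    for otu, count in otu_counts.items():
        hit = otu_best_hit.get(otu)
        if hit is None:
            continue
        genome, identity = hit
        if identity >= identity_cutoff:
            genome_abundance[genome] += count

    # Step 2: normalize by 16S copy number, then weight each genome's
    # gene copy numbers by the normalized genome abundance.
    gene_abundance = defaultdict(float)
    for genome, abundance in genome_abundance.items():
        normalized = abundance / copy_16s[genome]
        for gene, copies in gene_content[genome].items():
            gene_abundance[gene] += normalized * copies
    return dict(gene_abundance)


# Toy data (invented for illustration only).
example = infer_metagenome(
    otu_counts={"otu1": 40, "otu2": 10, "otu3": 5},
    otu_best_hit={"otu1": ("gA", 0.99), "otu2": ("gB", 0.98), "otu3": ("gB", 0.80)},
    copy_16s={"gA": 4, "gB": 2},
    gene_content={"gA": {"K00001": 1, "K00002": 2}, "gB": {"K00002": 1}},
)
print(example)
```

In the toy data, otu3 falls below the cutoff and is discarded, which is why the choice of identity cutoff directly changes how much of the sample contributes to the inferred metagenome.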

Fig 2. 16S rRNA gene amplicon sequences passing the identity threshold to the reference genomes.

Percentage of amplicon sequences in each dataset passing identity cutoffs from 0.75 to 1.00 against 16S rRNA gene sequences in the genome database. Green line, human feces dataset; blue line, human oral biopsy dataset; pink line, rat feces dataset; gray line, hypersaline microbial mat dataset.
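A curve of this kind can be computed as follows. This is a minimal sketch that assumes one best-hit identity per amplicon sequence and treats a sequence as passing when its identity is greater than or equal to the cutoff. The function name and the example values are hypothetical.

```python
def percent_passing(identities, cutoffs):
    """Return {cutoff: percentage of sequences with identity >= cutoff}."""
    total = len(identities)
    return {
        cutoff: 100.0 * sum(1 for identity in identities if identity >= cutoff) / total
        for cutoff in cutoffs
    }


# Invented identities for four sequences, evaluated at three example cutoffs.
curve = percent_passing([1.0, 0.98, 0.90, 0.70], [0.75, 0.95, 1.0])
print(curve)
```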

Fig 3. Spearman’s correlation coefficient between Piphillin results and shotgun metagenomics at ten different identity cutoffs tested in Piphillin.

Spearman’s correlation coefficient was calculated for each sample; the mean and the 1st and 3rd quartiles are depicted by the boxes. Whiskers extend to the furthest points within 150% of the interquartile range. Green, human feces dataset; blue, human oral biopsy dataset; pink, rat feces dataset; gray, hypersaline microbial mat dataset.
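For readers who want to reproduce a per-sample comparison of this kind, the sketch below computes Spearman's correlation in plain Python as the Pearson correlation of average ranks. It assumes that predicted and observed gene abundances are compared over the union of gene identifiers, with missing genes counted as zero. That choice and all names are assumptions, not details taken from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = sum((a - mean_x) ** 2 for a in rank_x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in rank_y) ** 0.5
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined when one profile is constant
    return covariance / (spread_x * spread_y)


def sample_correlation(predicted, observed):
    """Compare two {gene_id: abundance} profiles for a single sample."""
    genes = sorted(set(predicted) | set(observed))  # assumption: union of genes
    return spearman([predicted.get(g, 0.0) for g in genes],
                    [observed.get(g, 0.0) for g in genes])
```

Calling `sample_correlation` once per sample yields the distribution of coefficients that a box plot would summarize.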

Fig 4. Sensitivity and specificity in identifying differentially abundant KOs from Piphillin against corresponding metagenomics.

(A) True positive rate and false positive rate of detecting significantly differentially abundant KOs in human oral biopsy samples. Numbers next to each point represent the identity cutoff used for Piphillin. (B) Balanced accuracy of Piphillin at each identity cutoff.
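The metrics named in this caption can be sketched as below. The sketch uses the standard definitions (true positive rate, false positive rate, and balanced accuracy as the mean of sensitivity and specificity) and treats the KOs called significant by shotgun metagenomics as the reference set. The function name and the example sets are hypothetical.

```python
def detection_metrics(all_kos, reference_significant, predicted_significant):
    """Return (TPR, FPR, balanced accuracy) for predicted significant KOs.

    reference_significant: KOs called significant by shotgun metagenomics.
    predicted_significant: KOs called significant from the inferred profile.
    Assumes the reference contains at least one positive and one negative KO.
    """
    reference = set(reference_significant)
    predicted = set(predicted_significant)
    negatives = set(all_kos) - reference

    true_pos = len(predicted & reference)
    false_neg = len(reference - predicted)
    false_pos = len(predicted & negatives)
    true_neg = len(negatives - predicted)

    tpr = true_pos / (true_pos + false_neg)   # sensitivity
    fpr = false_pos / (false_pos + true_neg)  # 1 - specificity
    balanced_accuracy = (tpr + (1 - fpr)) / 2
    return tpr, fpr, balanced_accuracy


# Invented example: 10 KOs, 4 truly significant, 4 predicted significant.
kos = [f"K{i}" for i in range(1, 11)]
tpr, fpr, bal_acc = detection_metrics(
    kos, {"K1", "K2", "K3", "K4"}, {"K1", "K2", "K3", "K5"})
print(tpr, fpr, bal_acc)
```

Balanced accuracy is useful here because differentially abundant KOs are usually a minority of all KOs, so plain accuracy would be dominated by the true negatives.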

Fig 5. Comparison between Piphillin, PICRUSt and Tax4Fun.

(A) Spearman’s correlation coefficients against the corresponding shotgun metagenomics results were compared. The coefficient was calculated for each sample and ranges are depicted as box and whisker plots as described in Fig 3. Green, human feces dataset; blue, human oral biopsy dataset; pink, rat feces dataset; gray, hypersaline microbial mat dataset. (B) False positive rate, true positive rate and balanced accuracy of detecting significant differences between cancer and healthy human oral biopsy samples were compared.

Fig 6. Comparison between Piphillin with 0.9 identity cutoff and two other approaches.

Spearman’s correlation coefficient against shotgun metagenomics results was calculated for the hypersaline microbial mat dataset. Ranges are depicted as box and whisker plots as described in Fig 3.
