GitHub - MicrobialGenomics-IrsicaixaOrg/dar: dar: runs multiple differential abundance analysis methods and through a consensus strategy returns a set of differentially abundant features. (original) (raw)
dar 
Introduction
Differential abundance testing in microbiome data challenges both parametric and non-parametric statistical methods, due to its sparsity, high variability and compositional nature. Microbiome-specific statistical methods often assume classical distribution models or take into account compositional specifics. These produce results that range within the specificity vs sensitivity space in such a way that type I and type II error are difficult to ascertain in real microbiome data when a single method is used. Recently, a consensus approach based on multiple differential abundance (DA) methods was recently suggested in order to increase robustness.
With dar, you can use dplyr-like pipeable sequences of DA methods and then apply different consensus strategies. In this way we can obtain more reliable results in a fast, consistent and reproducible way.
Installation
You can install the development version of dar fromGitHub with:
install.packages("devtools")
devtools::install_github("MicrobialGenomics-IrsicaixaOrg/dar")
Usage
library(dar)
#> Registered S3 methods overwritten by 'vegan':
#> method from
#> reorder.hclust seriation
#> rev.hclust dendextend
data("metaHIV_phy")
Define recipe
rec <- recipe(metaHIV_phy, var_info = "RiskGroup2", tax_info = "Species") %>% step_subset_taxa(expr = 'Kingdom %in% c("Bacteria", "Archaea")') %>% step_filter_taxa(.f = "function(x) sum(x > 0) >= (0.03 * length(x))") %>% step_metagenomeseq(rm_zeros = 0.01) %>% step_maaslin()
rec #> ── DAR Recipe ────────────────────────────────────────────────────────────────── #> Inputs: #> #> ℹ phyloseq object with 451 taxa and 156 samples #> ℹ variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) #> ℹ taxonomic level Species #> #> Preporcessing steps: #> #> ◉ step_subset_taxa() id = subset_taxa__Suncake #> ◉ step_filter_taxa() id = filter_taxa__Hot_water_crust_pastry #> #> DA steps: #> #> ◉ step_metagenomeseq() id = metagenomeseq__Crocetta_of_Caltanissetta #> ◉ step_maaslin() id = maaslin__Tortita_negra
Prep recipe
da_results <- prep(rec, parallel = TRUE) da_results #> ── DAR Results ───────────────────────────────────────────────────────────────── #> Inputs: #> #> ℹ phyloseq object with 278 taxa and 156 samples #> ℹ variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) #> ℹ taxonomic level Species #> #> Results: #> #> ✔ metagenomeseq__Crocetta_of_Caltanissetta diff_taxa = 236 #> ✔ maaslin__Tortita_negra diff_taxa = 146 #> #> ℹ 124 taxa are present in all tested methods
Consensus strategy
n_methods <- 2 da_results <- bake(da_results, count_cutoff = n_methods) da_results #> ── DAR Results ───────────────────────────────────────────────────────────────── #> Inputs: #> #> ℹ phyloseq object with 278 taxa and 156 samples #> ℹ variable of interes RiskGroup2 (class: character, levels: hts, msm, pwid) #> ℹ taxonomic level Species #> #> Results: #> #> ✔ metagenomeseq__Crocetta_of_Caltanissetta diff_taxa = 236 #> ✔ maaslin__Tortita_negra diff_taxa = 146 #> #> ℹ 124 taxa are present in all tested methods #> #> Bakes: #> #> ◉ 1 -> count_cutoff: 2, weights: NULL, exclude: NULL, id: bake__Kürtőskalács
Results
cool(da_results)
#> ℹ Bake for count_cutoff = 2
#> # A tibble: 124 × 2
#> taxa_id taxa
#>
#> 1 Otu_63 Bacteroides_plebeius
#> 2 Otu_216 Clostridium_sp_CAG_632
#> 3 Otu_441 Brachyspira_sp_CAG_700
#> 4 Otu_108 Prevotella_sp_CAG_520
#> 5 Otu_257 Butyrivibrio_sp_CAG_318
#> 6 Otu_104 Prevotella_sp_CAG_1092
#> 7 Otu_69 Bacteroides_sp_CAG_530
#> 8 Otu_102 Prevotella_sp_AM42_24
#> 9 Otu_159 Lactobacillus_ruminis
#> 10 Otu_117 Alistipes_inops
#> # ℹ 114 more rows
Contributing
- If you think you have encountered a bug, please submit an issue.
- Either way, learn how to create and share areprex(a minimal, reproducible example), to clearly communicate about your code.
- Working on your first Pull Request? You can learn how from this _free_series How to Contribute to an Open Source Project on GitHub
Code of Conduct
Please note that the dar project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.