Simulation A in "Waste Not, Want Not" · Issue #299 · joey711/phyloseq

Hi Joey,

With my dataset, I'm most interested in whether my samples cluster together according to different factors (Simulation A in your paper), rather than in which specific OTUs are differentially abundant between groups (Simulation B).

Based on Figure 3, it looks like using proportions of reads gives higher accuracy for Bray-Curtis and weighted UniFrac than the other methods. In your paper you explain very clearly why using proportions is essentially throwing out data, but the figure suggests that, especially for small effect sizes, proportions may be more accurate. Can you clarify these results?
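For concreteness, here is a minimal sketch of what "proportions followed by Bray-Curtis" amounts to. It is plain Python rather than phyloseq code, the function names `to_proportions` and `bray_curtis` are hypothetical, and the OTU counts are made up purely for illustration.

```python
def to_proportions(counts):
    """Divide each OTU count by the sample's library size (total reads)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Toy OTU counts (made up): sample_b has the same composition as sample_a
# but ten times the sequencing depth.
sample_a = [10, 30, 60]
sample_b = [100, 300, 600]

# Distance on raw counts is driven by the difference in depth.
raw_distance = bray_curtis(sample_a, sample_b)

# Distance on proportions compares composition only.
prop_distance = bray_curtis(to_proportions(sample_a), to_proportions(sample_b))

print(raw_distance, prop_distance)
```

In this toy case the raw-count distance is large only because of the depth difference, while the proportion-based distance is zero. The sketch shows the mechanics only; it does not speak to the variance concerns raised in the paper.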

Also, I've played around with DESeq2 using your phyloseq_to_deseq2 function, but I'm confused about how you would use the variance stabilization implemented in that package to make, for instance, an NMDS ordination. Can you provide an example?

Thanks,
Michelle