Visualization and statistical comparisons of microbial communities using R packages on Phylochip data

Susan Holmes et al. Pac Symp Biocomput. 2011.

Abstract

This article explains the statistical and computational methodology used to analyze species abundances collected using the LBNL Phylochip in a study of Irritable Bowel Syndrome (IBS) in rats. Some tools already available for the analysis of ordinary microarray data are useful in this type of statistical analysis. For instance, to correct for multiple testing we use family-wise error rate control and step-down tests (available in the multtest package). Once the most significant species are chosen, we use the hypergeometric tests familiar from testing GO categories to test specific phyla and families. We provide examples of normalization, multivariate projections, batch effect detection and integration of phylogenetic covariation, as well as tree equalization and robustification methods.
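
As an illustration of the two testing steps named in the abstract, the following is a minimal sketch in Python using only the standard library. It is not the authors' code and does not call the multtest package. It assumes Holm's procedure as one example of a step-down method that controls the family-wise error rate (the abstract does not say which step-down test was used), and it assumes a one-sided enrichment question for the hypergeometric test. The function names holm_adjust and hypergeom_upper_tail are hypothetical.

```python
from math import comb


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values (controls the family-wise error rate).

    Assumption: Holm's method stands in for "step-down tests"; the
    source does not name the specific procedure.
    """
    m = len(pvalues)
    # Visit hypotheses from the smallest raw p-value to the largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The multiplier steps down from m to 1 as the rank increases.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def hypergeom_upper_tail(k, population, group_size, selected):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: total number of species tested
    group_size: species in the phylum or family of interest
    selected:   species declared significant
    k:          significant species that fall in the group
    """
    total = comb(population, selected)
    upper = min(group_size, selected)
    hits = sum(
        comb(group_size, i) * comb(population - group_size, selected - i)
        for i in range(k, upper + 1)
    )
    return hits / total


# Toy numbers, invented for illustration only.
adjusted = holm_adjust([0.01, 0.04, 0.03])
enrichment_p = hypergeom_upper_tail(2, 10, 4, 3)
```

A small value of enrichment_p would suggest that the group contains more significant species than expected by chance, which is the same logic used for GO category testing.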

Figures

Fig. 1

Tools were transposed from the standard microarray analyses

Fig. 2

On the left, the first plane of the PCA shows the first set of data with two batches; on the right, the third set of arrays has been added.

Fig. 3

On the left is the tree of all operational taxonomic units (OTUs) present on the Phylochip; the distance to the root varies across many of the OTUs, indicating a heterogeneous degree of resolution. The two trees on the right are filtered trees representing only the 400 most abundant species. The blue tree on the right was computed using the collapsing algorithm presented in this section; the long right clade at the bottom of the middle tree has disappeared.

Fig. 4

The left tree shows the complete tree on all species in black, with the subtree of the most abundant species in red; this subtree is the one plotted in the next panel. Abundance values in CTL and IBS rats are plotted in the next two columns. The pink/blue scaled variables are the truncated rank differences between the two groups.

Fig. 5

Principal component analysis of the most abundant and variable species. Mucosal location explains the first component: all the mucosal samples have negative loadings on this factor.
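
To make the idea of a first principal component concrete, here is a minimal standard-library Python sketch that extracts the leading axis of a small samples-by-species matrix by power iteration on the covariance matrix. It is an illustration under stated assumptions, not the analysis behind the figure: the function name first_principal_component is hypothetical, the toy matrix is invented, and power iteration is swapped in for whatever decomposition the R packages use. The sign of a principal axis is arbitrary, so only the separation of groups along it is meaningful.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (axis, scores) for the leading principal component.

    rows: list of samples, each a list of per-species values.
    axis: unit vector of variable weights on the first component.
    scores: coordinate of each sample along that axis.
    """
    n = len(rows)
    p = len(rows[0])
    # Center each variable (column) at zero.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Start from the unit vector of the highest-variance variable.
    start = max(range(p), key=lambda j: cov[j][j])
    axis = [0.0] * p
    axis[start] = 1.0
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize.
        w = [sum(cov[i][j] * axis[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance at all; keep the starting axis
        axis = [x / norm for x in w]
    scores = [sum(c[j] * axis[j] for j in range(p)) for c in centered]
    return axis, scores


# Invented toy data: two samples from each of two hypothetical locations.
toy = [[1.0, 0.1], [1.2, -0.1], [5.0, 0.0], [5.2, 0.1]]
axis, scores = first_principal_component(toy)
```

In this toy matrix the first two samples land on one side of zero along the first component and the last two on the other, which is the kind of separation the caption attributes to mucosal location.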

Fig. 6

Complete tree with the subtree of the most variable among the consistently abundant species, and the loadings on the first two principal components.

Fig. 7

The left tree shows the complete tree on all species in black, with the subtree of the species that show the most significant differences between CTL and IBS shown in red in the second panel. Abundance values in CTL and IBS rats are plotted in the next two columns. The next column shows -log(p-value), so the largest bars represent the most significantly different species.
