imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel

Dmitry Grapov et al. Bioinformatics. 2012.

Abstract

Interactive modules for Data Exploration and Visualization (imDEV) is an application embedded in a Microsoft Excel spreadsheet that provides an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enable interactive and dynamic analyses of large datasets by interfacing R's multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. The tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal and independent component analyses, and partial least squares regression and discriminant analysis, through an intuitive interface for creating high-quality two- and three-dimensional visualizations, including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks.

Availability and implementation: Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010).
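The abstract mentions multiple comparisons with false discovery correction but does not name the procedure used. As a minimal sketch, assuming the widely used Benjamini-Hochberg adjustment (an assumption, not confirmed by the abstract), the correction can be illustrated as follows. The function name `benjamini_hochberg` and the example P-values are hypothetical and are not part of imDEV.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract only says "false discovery correction";
    BH is used here purely as an illustrative choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four independent tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Each adjusted value is the raw P-value scaled by the number of tests divided by its rank, then capped so that the adjusted values stay monotone. A variable would be reported as significant when its adjusted value falls below the chosen false discovery rate.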

Figures

Fig. 1

Analysis of the relationship between circulating metabolite levels and systolic blood pressure (sBP). Exploratory PCA was used to identify gender-specific differences in metabolite concentrations and sBP. A PLS model was developed for the prediction of sBP given gender-adjusted metabolite concentrations. Network analysis of the PLS model parameters highlighted a previously known relationship between sBP and GGT, and identified a novel group of related metabolites (Group 2) that are negatively correlated with both sBP and GGT. (A) PCA scores and loadings trellis plots. In the scores plot (bottom left), point size indicates sBP, while color and shape indicate gender (pink squares, female; blue diamonds, male); an outlier is highlighted in red. In the loadings plot (top right), point size indicates the P-value from a Mann–Whitney U-test for gender-specific differences in metabolite concentrations. (B) Variable distribution and scatter plot matrix displaying the effect of covariate adjustment for gender on a representative variable. (C) Comparison of the gender-adjusted sBP PLS model Q2 (C1) and RMSEP (C2) statistic distributions to their respective permuted null distributions. (D) A multi-dimensionally scaled PLS model correlation network visualizing correlations (Spearman's rho, P < 0.05), displayed as colored edges (orange, positive; blue, negative), between sBP (red diamond) and model parameters (triangles). Triangle (i.e. network vertex) characteristics encode PLS coefficient magnitude (size) and sign (upward, positive; inverted, negative). Major groups of correlated variables, defined by hierarchical cluster analysis, are displayed as ellipses, and biologically related classes of metabolites are shown using similar vertex colors and polygons.
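The caption refers to Spearman's rho and to comparing model statistics against permuted null distributions. The following is a minimal sketch of that general idea, not the imDEV implementation: it swaps in Spearman's rho as the test statistic (the figure permutes PLS Q2 and RMSEP, which would require a full PLS fit), and all function names and data are hypothetical.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_permutations=999, seed=0):
    """Two-sided empirical p-value for rho against a permuted null."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real association with x.
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: a metabolite level that rises monotonically with sBP.
metabolite = [1, 2, 3, 4, 5, 6, 7, 8]
sbp = [2, 4, 5, 7, 11, 13, 17, 19]
print(spearman_rho(metabolite, sbp))
print(permutation_p_value(metabolite, sbp))
```

The same pattern applies to any model statistic: compute it on the observed data, recompute it many times after shuffling the response, and report how often the permuted value is at least as extreme as the observed one.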
