Review

Gene set enrichment analysis: performance evaluation and usage guidelines

Jui-Hung Hung et al. Brief Bioinform. 2012 May.

Abstract

A central goal of biology is understanding and describing the molecular basis of plasticity: the sets of genes that are combinatorially selected by exogenous and endogenous environmental changes, and the relations among the genes. The most viable current approach to this problem consists of determining whether sets of genes are connected by some common theme, e.g. genes from the same pathway are overrepresented among those whose differential expression in response to a perturbation is most pronounced. There are many approaches to this problem, and the results they produce show a fair amount of dispersion, but they all fall within a common framework consisting of a few basic components. We critically review these components, suggest best practices for carrying out each step, and propose a voting method for meeting the challenge of assessing different methods on a large number of experimental data sets in the absence of a gold standard.
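The overrepresentation idea described above can be illustrated with a minimal sketch. It assumes the simplest possible setting: a fixed list of "selected" (e.g. most differentially expressed) genes and a one-sided hypergeometric test. This is only one of many gene-set statistics and is not presented as the method the review recommends; the function name and the toy numbers are hypothetical. Note also that Figure 2 reports biased _P_-value distributions for analytical backgrounds, so a result like this is best treated as a rough first look.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided overrepresentation P-value (hypothetical helper).

    Returns P(X >= n_overlap), where X is the number of gene-set members
    found among n_selected genes drawn without replacement from a universe
    of n_universe genes that contains n_in_set gene-set members.
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Toy example (made-up numbers): 20 genes measured, 5 belong to the pathway,
# 4 genes were selected as most differentially expressed, 3 of them are in
# the pathway.
p_value = hypergeom_enrichment_pvalue(20, 5, 4, 3)
print(f"enrichment P-value: {p_value:.4f}")
```

A small P-value here says only that the overlap is larger than expected if the selected genes were drawn at random from the universe; it ignores gene-gene correlation, which is one reason simulated backgrounds are often preferred.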

Figures

Figure 1: Key components of performing gene set enrichment analysis.

Figure 2: _P_-value distributions under the null obtained with (A) a simulated background and (B) an analytical background. Analytical backgrounds give biased _P_-value distributions. WKS (i.e. GSEA) is not shown in (B) because WKS does not follow an analytical background.

Figure 3: The Pearson correlation coefficient between all 10 gene-set statistics. The ‘W’ prefix indicates a TIF-weighted statistic.
