Adaptive gene- and pathway-trait association testing with GWAS summary statistics

Il-Youp Kwak et al. Bioinformatics. 2016.

Abstract

Background: Gene- and pathway-based analyses offer a useful alternative and complement to the usual single SNP-based analysis for GWAS. On the other hand, most existing gene- and pathway-based tests are not highly adaptive, and/or require the availability of individual-level genotype and phenotype data. It would be desirable to have highly adaptive tests applicable to summary statistics for single SNPs. This has become increasingly important given the popularity of large-scale meta-analyses of multiple GWASs and the practical availability of either single GWAS or meta-analyzed GWAS summary statistics for single SNPs.

Results: We extend two adaptive tests for gene- and pathway-level association with a univariate trait to the case with GWAS summary statistics without individual-level genotype and phenotype data. We use the WTCCC GWAS data to evaluate and compare the proposed methods and several existing methods. We further illustrate their applications to a meta-analyzed dataset to identify genes and pathways associated with blood pressure, demonstrating the potential usefulness of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available.
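The abstract does not spell out the test statistic, so the following is only a minimal sketch of how an adaptive sum-of-powered-score test might be run on summary statistics alone. It assumes the usual setup for such tests: a vector of single-SNP Z-scores for one gene, a SNP correlation matrix estimated from a reference panel, and a null distribution simulated by drawing Z-scores from a multivariate normal with that correlation matrix. The function names (`cholesky`, `spu`, `aspu_summary`), the set of powers, and the add-one Monte Carlo P-value convention are illustrative assumptions, not the API or exact procedure of the aSPU R package.

```python
import math
import random


def cholesky(R):
    """Lower-triangular L with L L^T = R (R must be positive definite)."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L


def spu(z, gamma):
    """Sum of powered Z-scores; gamma = inf is taken as max |z_j|."""
    if math.isinf(gamma):
        return max(abs(v) for v in z)
    return sum(v ** gamma for v in z)


def aspu_summary(z, R, gammas=(1, 2, 4, 8, math.inf), B=2000, seed=0):
    """Return ({gamma: P-value}, adaptive P-value) from Z-scores only.

    Hypothetical helper: z is a list of Z-scores, R their correlation
    matrix (list of lists), B the number of null simulations.
    """
    rng = random.Random(seed)
    L = cholesky(R)
    n = len(z)
    obs = [abs(spu(z, g)) for g in gammas]

    # Simulate null Z-scores ~ N(0, R) and record |SPU(gamma)| for each draw.
    null = []
    for _ in range(B):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        zb = [sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        null.append([abs(spu(zb, g)) for g in gammas])

    # Monte Carlo P-value of the observed statistic for each power.
    p_obs = [
        (1 + sum(null[b][k] >= obs[k] for b in range(B))) / (B + 1)
        for k in range(len(gammas))
    ]

    # P-value of each null draw within the null sample, minimized over powers,
    # so the same B simulations calibrate the minimum P-value.
    min_p_null = [1.0] * B
    for k in range(len(gammas)):
        order = sorted(range(B), key=lambda b: null[b][k], reverse=True)
        for rank, b in enumerate(order):
            min_p_null[b] = min(min_p_null[b], (rank + 1) / B)

    min_p_obs = min(p_obs)
    p_adaptive = (1 + sum(mp <= min_p_obs for mp in min_p_null)) / (B + 1)
    return dict(zip(gammas, p_obs)), p_adaptive


if __name__ == "__main__":
    # Toy input: three correlated SNPs in one gene (made-up numbers).
    R = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.3], [0.2, 0.3, 1.0]]
    per_power, p = aspu_summary([2.1, 1.4, -0.3], R, B=2000, seed=1)
    print(per_power, p)
```

Because the smallest attainable P-value is 1/(B + 1), the simulation number B has to be large relative to the significance threshold being targeted, which is why a genome-wide analysis would need far more simulations than this toy default.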

Availability and implementation: https://cran.r-project.org/web/packages/aSPU/

Contact: weip@biostat.umn.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Figures

Fig. 1.

QQ plots of aSPUs _P_-values from a Control–Control analysis (top rows) and a Case–Control analysis (bottom row) of the WTCCC CD data. The correlation matrices were estimated using the 90 HapMap CEU samples, 100 WTCCC controls, and all 2938 WTCCC controls, respectively.

Fig. 2.

Comparison of −log10 transformed _P_-values of aSPUs with correlation matrices estimated from the HapMap panel versus from all WTCCC controls. The dotted lines indicate the significance threshold. The Pearson correlation coefficients in the two panels are both 0.98.

Fig. 3.

Comparison of −log10 transformed _P_-values of aSPUs with the simulation number B = 10^5 versus B = 10^6. The dotted lines indicate the Bonferroni-corrected significance threshold.

Fig. 4.

Comparison of −log10 transformed _P_-values with the HapMap samples versus all WTCCC controls for three pathway-level tests. The dotted lines indicate the Bonferroni-corrected significance threshold. _P_-values below 0.00001 are plotted as 0.00001 for easier comparison among plots.

Fig. 5.

Venn diagram for the significant genes identified by aSPUs, VEGAS, GATES and MAGMA, for trait DBP. The genes with a star (*) are BP-related genes in Table 1 of Ehret et al. (2011)

Fig. 6.

The significant gene ULK4 that was uniquely identified by aSPUs. The locus was also one of 28 loci identified by Ehret et al. (2011) by single SNP analysis with a much larger sample size

Fig. 7.

Venn diagram for the significant KEGG pathways identified by aSPUsPath, GATES-Simes, HYST and MAGMA, for trait DBP

References

    1. The Wellcome Trust Case Control Consortium (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 447, 661–678.
    2. de Bakker P. et al. (2008) Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum. Mol. Genet., 17, R122–R128.
    3. de Leeuw C.A. et al. (2015) MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol., 11, e1004219.
    4. Ehret G.B. et al. (2011) Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature, 478, 103–109.
    5. Fan R. et al. (2015) Gene level meta-analysis of quantitative traits by functional linear models. Genetics, 200, 1089–1104.
