Combining family- and population-based imputation data for association analysis of rare and common variants in large pedigrees
Mohamad Saad et al. Genet Epidemiol. 2014 Nov.
Abstract
In the last two decades, complex traits have become the main focus of genetic studies. The hypothesis that both rare and common variants are associated with complex traits is increasingly being discussed. Family-based association studies using relatively large pedigrees are suitable for identifying both rare and common variants. Because of the high cost of sequencing technologies, imputation methods are important for increasing the amount of information at low cost. A recent family-based imputation method, Genotype Imputation Given Inheritance (GIGI), is able to handle large pedigrees and accurately impute rare variants, but does less well for common variants, where population-based methods perform better. Here, we propose a flexible approach to combine imputation data from both family- and population-based methods. We also extend the Sequence Kernel Association Test for Rare and Common variants (SKAT-RC), originally proposed for data from unrelated subjects, to family data in order to make use of such imputed data. We call this extension "famSKAT-RC." We compare the performance of famSKAT-RC with that of several existing burden and kernel association tests. In simulated pedigree sequence data, our results show that the combining approach increases imputation accuracy. They also show that it increases the power of the association tests over the use of either family- or population-based imputation methods alone, in the context of rare and common variants. Moreover, famSKAT-RC performs better than the other tests considered in most of the scenarios investigated here.
Keywords: MCMC; association analysis; burden test; inheritance vectors; kernel statistic; mixed linear model; sequence data; variance components.
© 2014 WILEY PERIODICALS, INC.
Figures
Figure 1
Joint probabilities of possible genotypes (AA, Aa, aa) and their variances.
Figure 2
Correlation between allelic dosages obtained by GIGI and the true genotypes (x-axis) versus correlation between allelic dosages obtained by BEAGLE and the true genotypes (y-axis), for different bins of MAFs: A) LowLD pattern, B) HighLD pattern.
Figure 3
Correlation between allelic dosages obtained by GIGI+BEAGLE and the true genotypes (x-axes) versus correlation with the true genotypes (y-axes) of allelic dosages obtained by: BEAGLE (first row), GIGI (second row), and the maximum (MAX) of the GIGI and BEAGLE correlations (third row). A) LowLD pattern, B) HighLD pattern. Within each LD pattern, the left column shows MAF > 0.01 and the right column shows MAF <= 0.01.
Figure 4
Power of famSKAT, famSKAT-B, famSKAT-RC, and famCMWS in the sequence data, under the LowLD pattern, for the different settings of number of associated and non-associated SNPs and the proportion of common SNPs among them. A) For a model with associated SNPs only: A=10, fc=0.3; A=10, fc=0.5; A=20, fc=0.3; and A=20, fc=0.5. B) For a model with associated and non-associated SNPs: A=10, U=20, fc=0.3; A=10, U=20, fc=0.5; A=20, U=40, fc=0.3; and A=20, U=40, fc=0.5, where fc is the proportion of common SNPs.
Figure 5
Power of famCMWS for the different imputed data sets and the sequence data, for a model with associated SNPs only, for the different settings of number of associated SNPs and the proportion of common SNPs among them: A=10, fc=0.3; A=10, fc=0.5; A=20, fc=0.3; and A=20, fc=0.5, where fc is the proportion of common associated SNPs. A) LowLD pattern; B) HighLD pattern.
Figure 6
Power of famCMWS for the different combined imputation data (GIGI+BEAGLE, G+B+T, and G_S+B), under the LowLD pattern, for a model with associated SNPs only, for the different settings of number of associated SNPs and the proportion of common SNPs among them: A=10, fc=0.3; A=10, fc=0.5; A=20, fc=0.3; and A=20, fc=0.5, where fc is the proportion of common associated SNPs.
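The accuracy measure named in the captions of Figures 2 and 3, the correlation between imputed allelic dosages and true genotypes, can be illustrated with a minimal sketch. It assumes the dosage is the expected count of the "a" allele computed from genotype probabilities, and that the correlation is Pearson's r; the function names and the toy data are hypothetical and are not taken from GIGI, BEAGLE, or the paper.

```python
from math import sqrt


def allelic_dosage(p_het, p_hom_alt):
    """Expected count of the 'a' allele: P(Aa) + 2 * P(aa).

    Assumption: 'a' is the counted allele; the source does not say which is.
    """
    return p_het + 2.0 * p_hom_alt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


# Toy data for one variant: per-subject (P(AA), P(Aa), P(aa)) from an imputation
# method, plus the true genotypes coded as the number of 'a' alleles.
genotype_probs = [(0.9, 0.1, 0.0), (0.2, 0.7, 0.1), (0.0, 0.1, 0.9), (0.8, 0.2, 0.0)]
true_genotypes = [0, 1, 2, 0]

dosages = [allelic_dosage(p_het, p_hom) for _, p_het, p_hom in genotype_probs]
r = pearson(dosages, true_genotypes)
print(f"dosage-truth correlation: {r:.3f}")
```

Computing this value per variant for each imputation method, then grouping variants by MAF bin, would give the kind of comparison the figure captions describe.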
Similar articles
- Power of family-based association designs to detect rare variants in large pedigrees using imputed genotypes.
Saad M, Wijsman EM. Genet Epidemiol. 2014 Jan;38(1):1-9. doi: 10.1002/gepi.21776. Epub 2013 Nov 15. PMID: 24243664 Free PMC article.
- GIGI: an approach to effective imputation of dense genotypes on large pedigrees.
Cheung CY, Thompson EA, Wijsman EM. Am J Hum Genet. 2013 Apr 4;92(4):504-16. doi: 10.1016/j.ajhg.2013.02.011. PMID: 23561844 Free PMC article.
- A Comparison Study of Fixed and Mixed Effect Models for Gene Level Association Studies of Complex Traits.
Fan R, Chiu CY, Jung J, Weeks DE, Wilson AF, Bailey-Wilson JE, Amos CI, Chen Z, Mills JL, Xiong M. Genet Epidemiol. 2016 Dec;40(8):702-721. doi: 10.1002/gepi.21984. Epub 2016 Jul 4. PMID: 27374056 Free PMC article.
- Testing genetic association with rare and common variants in family data.
Chen H, Malzahn D, Balliu B, Li C, Bailey JN. Genet Epidemiol. 2014 Sep;38 Suppl 1(0 1):S37-43. doi: 10.1002/gepi.21823. PMID: 25112186 Free PMC article.
- Linkage analysis with sequential imputation.
Skrivanek Z, Lin S, Irwin M. Genet Epidemiol. 2003 Jul;25(1):25-35. doi: 10.1002/gepi.10249. PMID: 12813724 Review.
Cited by
- How local reference panels improve imputation in French populations.
Herzig AF, Velo-Suárez L; FrEx Consortium; FranceGenRef Consortium; Dina C, Redon R, Deleuze JF, Génin E. Sci Rep. 2024 Jan 3;14(1):370. doi: 10.1038/s41598-023-49931-3. PMID: 38172507 Free PMC article.
- Excalibur: A new ensemble method based on an optimal combination of aggregation tests for rare-variant association testing for sequencing data.
Boutry S, Helaers R, Lenaerts T, Vikkula M. PLoS Comput Biol. 2023 Sep 14;19(9):e1011488. doi: 10.1371/journal.pcbi.1011488. eCollection 2023 Sep. PMID: 37708232 Free PMC article.
- A joint use of pooling and imputation for genotyping SNPs.
Clouard C, Ausmees K, Nettelblad C. BMC Bioinformatics. 2022 Oct 13;23(1):421. doi: 10.1186/s12859-022-04974-7. PMID: 36229780 Free PMC article.
- Burden of Type 2 Diabetes and Associated Cardiometabolic Traits and Their Heritability Estimates in Endogamous Ethnic Groups of India: Findings From the INDIGENIUS Consortium.
Venkatesan V, Lopez-Alvarenga JC, Arya R, Ramu D, Koshy T, Ravichandran U, Ponnala AR, Sharma SK, Lodha S, Sharma KK, Shaik MV, Resendez RG, Venugopal P, R P, Saju N, Ezeilo JA, Bejar C, Wander GS, Ralhan S, Singh JR, Mehra NK, Vadlamudi RR, Almeida M, Mummidi S, Natesan C, Blangero J, Medicherla KM, Thanikachalam S, Panchatcharam TS, Kandregula DK, Gupta R, Sanghera DK, Duggirala R, Paul SFD. Front Endocrinol (Lausanne). 2022 Apr 14;13:847692. doi: 10.3389/fendo.2022.847692. eCollection 2022. PMID: 35498404 Free PMC article.
- Alternative Applications of Genotyping Array Data Using Multivariant Methods.
Samuels DC, Below JE, Ness S, Yu H, Leng S, Guo Y. Trends Genet. 2020 Nov;36(11):857-867. doi: 10.1016/j.tig.2020.07.006. Epub 2020 Aug 6. PMID: 32773169 Free PMC article. Review.
Grants and funding
- R01 HD054562/HD/NICHD NIH HHS/United States
- P50 AG005136/AG/NIA NIH HHS/United States
- R01 MH094293/MH/NIMH NIH HHS/United States
- R01 AG039700/AG/NIA NIH HHS/United States
- R01 MH092367/MH/NIMH NIH HHS/United States
- R37 GM046255/GM/NIGMS NIH HHS/United States