TEAM: efficient two-locus epistasis tests in human genome-wide association study
Xiang Zhang et al. Bioinformatics. 2010.
Abstract
As a promising tool for identifying genetic markers underlying phenotypic differences, the genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable to single-locus analysis, since many diseases are known to be complex traits. A brute-force search is infeasible for epistasis detection at the genome-wide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for datasets consisting of homozygous markers and small sample sizes. In human studies, however, the genotype may be heterozygous, and the number of individuals can be in the thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large-sample studies. It supports any statistical test that is based on contingency tables, and enables control of both the family-wise error rate and the false discovery rate. Extensive experiments show that our algorithm needs to examine only a small portion of the individuals to update the contingency tables, and that it achieves at least an order of magnitude speedup over the brute-force approach.
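The incremental-update idea in the abstract can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the TEAM implementation: genotypes are assumed to be coded 0/1/2, the phenotype is assumed binary, the tree edge weight is assumed to be the number of individuals whose genotypes differ between two SNPs, and all function names (`hamming`, `minimum_spanning_tree`, `full_table`, `updated_table`) are hypothetical. The sketch builds a minimum spanning tree over SNPs with Prim's algorithm, then derives each SNP's contingency table against a fixed second SNP `y` from its tree parent's table by touching only the individuals whose genotypes differ.

```python
from collections import Counter


def hamming(a, b):
    """Number of individuals whose genotypes differ between two SNPs."""
    return sum(1 for x, y in zip(a, b) if x != y)


def minimum_spanning_tree(snps):
    """Prim's algorithm rooted at SNP 0; returns (parent, child) edges
    in the order the children are attached to the tree."""
    # best[v] = (cheapest known distance to the tree, tree node achieving it)
    best = {v: (hamming(snps[0], snps[v]), 0) for v in range(1, len(snps))}
    edges = []
    while best:
        child = min(best, key=lambda v: best[v][0])
        _, parent = best.pop(child)
        edges.append((parent, child))
        for v in best:
            d = hamming(snps[child], snps[v])
            if d < best[v][0]:
                best[v] = (d, child)
    return edges


def full_table(x, y, pheno):
    """Contingency table built by scanning every individual."""
    return Counter(zip(x, y, pheno))


def updated_table(table, old_x, new_x, y, pheno):
    """Derive the table for (new_x, y) from the table for (old_x, y),
    visiting only individuals where old_x and new_x differ."""
    new = Counter(table)
    for k, (a, b) in enumerate(zip(old_x, new_x)):
        if a != b:
            new[(a, y[k], pheno[k])] -= 1
            new[(b, y[k], pheno[k])] += 1
    return +new  # unary plus drops cells whose count fell to zero


# Toy data (made up for illustration): 3 SNPs, 6 individuals.
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 1],
    [2, 1, 0, 0, 1, 1],
]
y = [0, 0, 1, 1, 2, 2]        # the second locus of each tested pair
pheno = [1, 0, 1, 0, 1, 0]    # case/control labels

edges = minimum_spanning_tree(snps)
tables = {0: full_table(snps[0], y, pheno)}  # one full scan at the root
for parent, child in edges:
    tables[child] = updated_table(
        tables[parent], snps[parent], snps[child], y, pheno
    )
```

In this sketch only the root SNP requires a full scan; every other table is obtained along a tree edge, so the work per table is proportional to the number of differing individuals rather than to the sample size. A real implementation would additionally need to skip the comparison scan itself (for example by storing the differing positions per edge) and to handle phenotype permutations.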
Figures
Fig. 1.
The minimum spanning tree built on the SNPs in the example dataset shown in Table 1.
Fig. 2.
Comparison between TEAM and the brute force approach on human datasets under various experimental settings: varying the number of SNPs (a), individuals (b), permutations (c) and varying the case/control ratio (d).
Fig. 3.
Comparison between TEAM, COE and the brute force approach on mouse datasets under various experimental settings: (a) varying the number of SNPs and (b) varying the number of individuals.
Similar articles
- Prioritizing tests of epistasis through hierarchical representation of genomic redundancies.
Cowman T, Koyutürk M. Nucleic Acids Res. 2017 Aug 21;45(14):e131. doi: 10.1093/nar/gkx505. PMID: 28605458. Free PMC article.
- Cloud computing for detecting high-order genome-wide epistatic interaction via dynamic clustering.
Guo X, Meng Y, Yu N, Pan Y. BMC Bioinformatics. 2014 Apr 10;15:102. doi: 10.1186/1471-2105-15-102. PMID: 24717145. Free PMC article.
- A method for detecting epistasis in genome-wide studies using case-control multi-locus association analysis.
Gayán J, González-Pérez A, Bermudo F, Sáez ME, Royo JL, Quintas A, Galan JJ, Morón FJ, Ramirez-Lorca R, Real LM, Ruiz A. BMC Genomics. 2008 Jul 31;9:360. doi: 10.1186/1471-2164-9-360. PMID: 18667089. Free PMC article.
- Finding the epistasis needles in the genome-wide haystack.
Ritchie MD. Methods Mol Biol. 2015;1253:19-33. doi: 10.1007/978-1-4939-2155-3_2. PMID: 25403525. Review.
- Detecting epistasis in human complex traits.
Wei WH, Hemani G, Haley CS. Nat Rev Genet. 2014 Nov;15(11):722-33. doi: 10.1038/nrg3747. Epub 2014 Sep 9. PMID: 25200660. Review.
Cited by
- An evolutionary perspective on epistasis and the missing heritability.
Hemani G, Knott S, Haley C. PLoS Genet. 2013 Feb;9(2):e1003295. doi: 10.1371/journal.pgen.1003295. Epub 2013 Feb 28. PMID: 23509438. Free PMC article.
- Chapter 10: Mining genome-wide genetic markers.
Zhang X, Huang S, Zhang Z, Wang W. PLoS Comput Biol. 2012;8(12):e1002828. doi: 10.1371/journal.pcbi.1002828. Epub 2012 Dec 27. PMID: 23300418. Free PMC article.
- Large scale association analysis identifies three susceptibility loci for coronary artery disease.
Saade S, Cazier JB, Ghassibe-Sabbagh M, Youhanna S, Badro DA, Kamatani Y, Hager J, Yeretzian JS, El-Khazen G, Haber M, Salloum AK, Douaihy B, Othman R, Shasha N, Kabbani S, Bayeh HE, Chammas E, Farrall M, Gauguier D, Platt DE, Zalloua PA; FGENTCARD consortium. PLoS One. 2011;6(12):e29427. doi: 10.1371/journal.pone.0029427. Epub 2011 Dec 27. PMID: 22216278. Free PMC article.
- A fast algorithm for detecting gene-gene interactions in genome-wide association studies.
Li J, Zhong W, Li R, Wu R. Ann Appl Stat. 2014;8(4):2292-2318. doi: 10.1214/14-aoas771. PMID: 26457126. Free PMC article.
- Practical aspects of genome-wide association interaction analysis.
Gusareva ES, Van Steen K. Hum Genet. 2014 Nov;133(11):1343-58. doi: 10.1007/s00439-014-1480-y. Epub 2014 Aug 28. PMID: 25164382. Review.