Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations
Abstract
Population stratification may confound the results of genetic association studies among unrelated individuals from admixed populations. Several methods have been proposed to estimate individual ancestry in admixed populations and to use those estimates to adjust for population stratification in genetic association tests. We evaluate the performance of three such methods (maximum likelihood estimation, ADMIXMAP and Structure) using a range of simulated data sets and real data from Latino subjects participating in a genetic study of asthma. All three methods yield ancestry estimates of similar accuracy and control the type I error rate to a similar degree. The most important factor in determining the accuracy of the ancestry estimate and in minimizing the type I error rate is the number of markers used to estimate ancestry. We demonstrate that approximately 100 ancestry informative markers (AIMs) are required to obtain ancestry estimates whose correlation with the true individual ancestral proportions exceeds 0.9. In addition, after accounting for ancestry in association tests, the excess type I error rate is controlled at the 5% level when 100 markers are used to estimate ancestry. However, because the effect of admixture on the type I error rate worsens with sample size, the accuracy of ancestry estimates must also increase with sample size for the correction to remain adequate. Using data from the Latino subjects, we also apply these methods to an association study between body mass index and 44 AIMs. These simulations are meant to provide practical guidelines for investigators conducting association studies in admixed populations.
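The relationship between marker count and the accuracy of ancestry estimates can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not reproduce ADMIXMAP or Structure. It assumes a simplified two-population admixture model, ancestral allele frequencies that are known exactly, unlinked markers, and a plain grid search for the maximum likelihood estimate. All function names and parameter values (such as the allele-frequency ranges) are hypothetical choices made for illustration.

```python
import math
import random


def simulate_genotypes(q, p1, p2, rng):
    """Draw genotypes (0, 1 or 2 copies of an allele) for one admixed individual.

    q is the individual's ancestry proportion from population 1 (assumed model);
    p1 and p2 are the allele frequencies in the two ancestral populations.
    """
    genotypes = []
    for a, b in zip(p1, p2):
        p = q * a + (1.0 - q) * b  # expected allele frequency given ancestry q
        genotypes.append((rng.random() < p) + (rng.random() < p))
    return genotypes


def mle_ancestry(genotypes, p1, p2, grid_size=101):
    """Grid-search the ancestry proportion that maximises the binomial log-likelihood."""
    best_q, best_ll = 0.0, -math.inf
    for k in range(grid_size):
        q = k / (grid_size - 1)
        ll = 0.0
        for g, a, b in zip(genotypes, p1, p2):
            p = q * a + (1.0 - q) * b
            ll += g * math.log(p) + (2 - g) * math.log(1.0 - p)
        if ll > best_ll:
            best_q, best_ll = q, ll
    return best_q


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ancestry_correlation(n_individuals, n_markers, seed=0):
    """Simulate a sample and return corr(true ancestry, estimated ancestry)."""
    rng = random.Random(seed)
    # Hypothetical AIM-like markers: large frequency difference between populations.
    p1 = [rng.uniform(0.70, 0.95) for _ in range(n_markers)]
    p2 = [rng.uniform(0.05, 0.30) for _ in range(n_markers)]
    true_q = [rng.uniform(0.1, 0.9) for _ in range(n_individuals)]
    est_q = [mle_ancestry(simulate_genotypes(q, p1, p2, rng), p1, p2)
             for q in true_q]
    return pearson(true_q, est_q)


if __name__ == "__main__":
    for m in (10, 50, 100):
        print(m, "markers: r =", round(ancestry_correlation(100, m), 3))
```

Under these assumptions the correlation between true and estimated ancestry should rise as markers are added. The exact values depend on how informative the simulated markers are, so they should not be read as a reproduction of the 100-marker threshold reported above.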
Acknowledgements
Financial support was received from HL07185, GM61390, American Lung Association of California, RWJ Amos Medical Faculty Development Award, NCMHD Health Disparities Scholar, Extramural Clinical Research Loan Repayment Program for Individuals from Disadvantaged Backgrounds, 2001–2003, to Esteban González Burchard; K22CA109351 from the NIH, CRTG 02-0841-CCE from the American Cancer Society, and BCRP030551 from the Department of Defense to Elad Ziv; U19AG23122 from NIH to Steven Cummings; HL51823, HL074204, 3M01RR000083-38S30488, HL56443 and HL51831 to the Asthma Clinical Research Network; U01-CA86117, SFGH General Clinical Research Center M01RR00083-41, U01-HL 65899, UCSF-Children’s Hospital of Oakland Pediatric Clinical Research Center (M01 RR01271), Oakland, CA, Sandler Center for Basic Research in Asthma and the Sandler Family Supporting Foundation. The authors would like to acknowledge the families and the patients for their participation. The authors would also like to thank the numerous health care providers and community clinics for their support and participation in the GALA Study. In addition to the primary clinical centers of the investigators, participating community clinics and hospitals include: La Clinica de La Raza, Oakland, CA; UCSF-Children’s Hospital of Oakland Pediatric Clinical Research Center, Oakland, CA; General Clinical Research Center, SFGH, San Francisco, CA; Alliance Medical Center, Healdsburg, CA; Santa Clara Valley Medical Center, San José, CA; Fair Oaks Family Health Center, Redwood City, CA; Clinica de Salud del Valle de Salinas, Salinas, CA; Natividad Medical Center, Salinas, CA; Asthma Education and Management Program, Community Medical Centers, Fresno, CA; and Diagnostic Health Centers of Corozal, Naranjito, Catano, Orocovis, Barranquitas and San Antonio Hospital of Mayaguez.
The authors would also like to acknowledge Monica Toscano, MariaElena Alioto, Ivan Gomez, Henry Matallana, Carmen Jimenez, Yannett Marcano, Pedro Yapor, Alma Ortiz, Lisandra Perez and Sheila Gonzalez for their assistance with recruitment and study organization. The authors would like to especially thank Dr. Jeffrey M. Drazen, Dr. Ed Silverman, Dr. Homer A. Boushey, Dr. Dean Sheppard, Dr. Sylvette Nazario, Dr. Jesus Casal, Dr. Alfonso Torres, Dr. Jose Rodriguez-Santana, Dr. Rocio Chapela, Dr. Scott Weiss, and Dr. Jean G. Ford for all of their effort towards the creation of the GALA Study, and Dr. Mark D. Shriver for assistance in development of the AIMs and for providing ancestral DNA.
Author information
Authors and Affiliations
- Department of Medicine, University of California, Box 0833, San Francisco, CA, 94143-0833, USA
  Hui-Ju Tsai, Shweta Choudhry, Mariam Naqvi, Esteban González Burchard & Elad Ziv
- Lung Biology Center, San Francisco General Hospital, San Francisco, CA, USA
  Hui-Ju Tsai, Shweta Choudhry, Mariam Naqvi & Esteban González Burchard
- Center for Human Genetics, University of California, San Francisco, CA, USA
  Esteban González Burchard & Elad Ziv
- San Juan VAMC, University of Puerto Rico School of Medicine, San Juan, PR, USA
  William Rodriguez-Cintron
Authors
- Hui-Ju Tsai
- Shweta Choudhry
- Mariam Naqvi
- William Rodriguez-Cintron
- Esteban González Burchard
- Elad Ziv
Corresponding author
Correspondence to Hui-Ju Tsai.
Additional information
Esteban González Burchard and Elad Ziv contributed equally to this manuscript.
About this article
Cite this article
Tsai, HJ., Choudhry, S., Naqvi, M. et al. Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations. Hum Genet 118, 424–433 (2005). https://doi.org/10.1007/s00439-005-0067-z
- Received: 01 June 2005
- Accepted: 10 August 2005
- Published: 06 October 2005
- Issue Date: December 2005
- DOI: https://doi.org/10.1007/s00439-005-0067-z