Genome-wide Association Studies in Ancestrally Diverse Populations: Opportunities, Methods, Pitfalls, and Recommendations
Related papers
Accounting for ancestry: population substructure and genome-wide association studies
Human Molecular Genetics, 2008
Accounting for the genetic substructure of human populations has become a major practical issue for studying complex genetic disorders. Allele frequency differences among ethnic groups and subgroups and admixture between different ethnic groups can result in frequent false-positive results or reduced power in genetic studies. Here, we review the problems and progress in defining population differences and the application of statistical methods to improve association studies. It is now possible to take into account the confounding effects of population stratification using thousands of unselected genome-wide single-nucleotide polymorphisms or, alternatively, selected panels of ancestry informative markers. These methods do not require any demographic information and therefore can be widely applied to genotypes available from multiple sources. We further suggest that it will be important to explore results in homogeneous population subsets as we seek to define the extent to which genomic variation influences complex phenotypes.
Consistency of genome-wide associations across major ancestral groups
Human genetics, 2012
It is not well known whether genetic markers identified through genome-wide association studies (GWAS) confer similar or different risks across people of different ancestry. We screened a regularly updated catalog of all published GWAS curated at the NHGRI website for GWAS-identified associations that had reached genome-wide significance (p ≤ 5 × 10⁻⁸) in at least one major ancestry group (European, Asian, African) and for which replication data were available for comparison in at least two different major ancestry groups. These groups were compared for the correlation between and differences in risk allele frequencies and genetic effects’ estimates. Data on 108 eligible GWAS-identified associations with a total of 900 datasets (European, n = 624; Asian, n = 217; African, n = 60) were analyzed. Risk-allele frequencies were modestly correlated between ancestry groups, with >10% absolute differences in 75–89% of the three pairwise comparisons of ancestry groups. Genetic effect (odds ratio) point estimates between ancestry groups correlated modestly (pairwise comparisons’ correlation coefficients: 0.20–0.33) and point estimates of risks were opposite in direction or differed more than twofold in 57%, 79%, and 89% of the European versus Asian, European versus African, and Asian versus African comparisons, respectively. The modest correlations, differing risk estimates, and considerable between-association heterogeneity suggest that differential ancestral effects can be anticipated and genomic risk markers may need separate further evaluation in different ancestry groups.
To investigate cross-ancestry genetics of complex traits, we conducted a phenome-wide analysis of loci with heterogeneous effects across African, Admixed-American, Central/South Asian, East Asian, European and Middle Eastern participants of the UK Biobank (N = 441 331). Testing 843 phenotypes, we identified 82 independent genomic regions mapping variants showing genome-wide significant (GWS) associations (P < 5 × 10⁻⁸) in the trans-ancestry meta-analysis and GWS heterogeneity among the ancestry-specific effects. These included (i) loci with GWS association in one ancestry and concordant but heterogeneous effects among the other ancestries and (ii) loci with a GWS association in one ancestry group and an experiment-wide significant discordant effect (P < 6.1 × 10⁻⁴) in at least another ancestry. Since the trans-ancestry GWS associations were mostly driven by the European ancestry sample size, we investigated the differences of the allele frequency (AF) and linkage disequilibrium regulome tagging (LD) between European populations and the other ancestries. Within loci with concordant effects, the degree of heterogeneity was associated with European-Middle Eastern AF (P = 9.04 × 10⁻⁶) and LD of European populations with respect to African, Admixed-American and Central/South Asian groups (P = 8.21 × 10⁻⁴, P = 7.17 × 10⁻⁴ and P = 2.16 × 10⁻³, respectively). Within loci with discordant effects, AF and LD of European populations with respect to African and Central/South Asian ancestries were associated with the degree of heterogeneity (AF: P = 7.69 × 10⁻³ and P = 5.31 × 10⁻³, LD: P = 0.016 and P = 2.65 × 10⁻⁴, respectively). Considering the traits associated with cross-ancestry heterogeneous loci, we observed enrichments for blood biomarkers (P = 5.7 × 10⁻³⁵) and physical appearance (P = 1.38 × 10⁻⁴). This suggests that these specific phenotypic classes may present considerable cross-ancestry heterogeneity owing to large allele frequency and LD variation among worldwide populations.
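The abstract above does not say which statistic was used to test heterogeneity among ancestry-specific effects. As an illustration only, the sketch below computes Cochran's Q, a standard heterogeneity measure in fixed-effect meta-analysis; the function name `cochran_q` and the effect sizes are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def cochran_q(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float]:
    """Return (pooled effect, Cochran's Q) for per-ancestry effect estimates.

    betas: effect estimates for one variant, one per ancestry group.
    ses:   their standard errors (all must be positive).
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching lists with at least two groups")
    # Inverse-variance weights: more precise estimates count for more.
    weights = [1.0 / (se ** 2) for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, q


# Hypothetical effects of one variant in two ancestry groups.
pooled, q = cochran_q(betas=[0.1, 0.3], ses=[0.1, 0.1])
print(f"pooled effect = {pooled:.3f}, Q = {q:.3f}")
```

Under the null hypothesis of a shared effect, Q is conventionally compared against a chi-square distribution with (number of groups − 1) degrees of freedom; a large Q relative to that distribution indicates that the effects differ between groups.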
Joint Genotype- and Ancestry-based Genome-wide Association Studies in Admixed Populations
2016
In Genome-Wide Association Studies (GWAS), genetic loci that influence complex traits are localized by inspecting associations between genotypes of genetic markers and the values of the trait of interest. On the other hand, Admixture Mapping, which is performed in the case of populations consisting of a recent mix of two ancestral groups, relies on the ancestry information at each locus (locus-specific ancestry). Recently it has been proposed to jointly model genotype and locus-specific ancestry within the framework of single-marker tests. Here we extend this approach for population-based GWAS in the direction of multi-marker models. A modified version of the Bayesian Information Criterion is developed for building a multi-locus model, which accounts for the differential correlation structure due to linkage disequilibrium and admixture linkage disequilibrium. Simulation studies and a real data example illustrate the advantages of this new approach compared to single-marker analysis and mod...
A tutorial on conducting genome-wide association studies: Quality control and statistical analysis
International Journal of Methods in Psychiatric Research, 2018
Objectives: Genome-wide association studies (GWAS) have become increasingly popular to identify associations between single nucleotide polymorphisms (SNPs) and phenotypic traits. The GWAS method is commonly applied within the social sciences. However, statistical analyses will need to be carefully conducted and the use of dedicated genetics software will be required. This tutorial aims to provide a guideline for conducting genetic analyses. Methods: We discuss and explain key concepts and illustrate how to conduct GWAS using example scripts provided through GitHub (https://github.com/MareesAT/GWA_tutorial/). In addition to the illustration of standard GWAS, we will also show how to apply polygenic risk score (PRS) analysis. PRS does not aim to identify individual SNPs but aggregates information from SNPs across the genome in order to provide individual-level scores of genetic risk. Results: The simulated data and scripts that will be illustrated in the current tutorial provide hands-on practice with genetic analyses. The scripts are based on PLINK, PRSice, and R, which are commonly used, freely available software tools that are accessible for novice users. Conclusions: By providing theoretical background and hands-on experience, we aim to make GWAS more accessible to researchers without formal training in the field.
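To make the idea of aggregating SNP information into an individual-level score concrete, here is a minimal sketch of one common formulation: a weighted sum of effect-allele counts, with weights taken from a discovery GWAS. This is an assumption for illustration, not the tutorial's own scripts (which are based on PLINK, PRSice, and R); the function name `polygenic_score`, the weights, and the genotypes are all hypothetical.

```python
from typing import Dict, List, Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-SNP effect sizes (e.g. log odds ratios from a discovery GWAS).
weights: List[float] = [0.5, -0.25, 0.125]

# Hypothetical genotypes: one list of effect-allele counts per individual,
# in the same SNP order as `weights`.
genotypes: Dict[str, List[int]] = {
    "person_a": [2, 0, 1],
    "person_b": [0, 2, 0],
}

scores = {name: polygenic_score(g, weights) for name, g in genotypes.items()}
for name, score in scores.items():
    print(name, score)
```

Real analyses typically add steps this sketch omits, such as selecting SNPs by p-value threshold and accounting for linkage disequilibrium between them.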
Genome-Wide Association Studies and Beyond
Genome-wide association studies (GWAS) provide an important avenue for undertaking an agnostic evaluation of the association between common genetic variants and risk of disease. Recent advances in our understanding of human genetic variation and the technology to measure such variation have made GWAS feasible. Over the past few years a multitude of GWAS have identified and replicated many associated variants. These findings are enriching our knowledge about the genetic basis of disease and leading some to advocate using GWA study results for genetic testing. For many of the GWA study results, however, the underlying mechanisms remain unclear and the findings explain only a limited amount of heritability. These issues may be clarified by more detailed investigations, including analyses of less common variants, sequence-level data, and environmental exposures. Such studies should help clarify the potential value of genetic testing to the public's health.
A short review on Genome-Wide Association Studies
Bioinformation
Ancestry-specific association mapping in admixed populations
2015
During the last decade, genome-wide association studies have proven to be a powerful approach to identifying disease-causing variants. However, for admixed populations, most current methods for association testing are based on the assumption that the effect of a genetic variant is the same regardless of its ancestry. This is a reasonable assumption for a causal variant, but may not hold for the genetic variants that are tested in genome-wide association studies, which are usually not causal. The effects of non-causal genetic variants depend on how strongly their presence correlates with the presence of the causal variant, which may vary between ancestral populations because of different linkage disequilibrium patterns and allele frequencies. Motivated by this, we here introduce a new statistical method for association testing in recently admixed populations, where the effect size is allowed to depend on the ancestry of a given allele. Our method does not rely on accurate inference of l...
Genome-wide association studies: a primer
Psychological Medicine, 2009
There have been nearly 400 genome-wide association studies (GWAS) published since 2005. The GWAS approach has been exceptionally successful in identifying common genetic variants that predispose to a variety of complex human diseases and biochemical and anthropometric traits. Although this approach is relatively new, there are many excellent reviews of different aspects of the GWAS method. Here, we provide a primer, an annotated overview of the GWAS method with particular reference to psychiatric genetics. We dissect the GWAS methodology into its components and provide a brief description with citations and links to reviews that cover the topic in detail.