Synthetic associations created by rare variants do not explain most GWAS results - PubMed

Comment

Synthetic associations created by rare variants do not explain most GWAS results

Naomi R Wray et al. PLoS Biol. 2011.

No abstract available

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. LD between causal and genotyped SNPs and synthetic association.

SNPs 1–10 are independent SNPs in a short chromosomal region, with population frequencies indicated by the values in the box. Rare mutations tend to be younger than common mutations. A mutation event in the region creates causal variant C1. C1 has a higher probability of arising on the major allele (dark) of any SNP than the minor allele (light). However, in the absence of recombination, the most highly associated SNP will be the one where C1 is coupled (see Box 2) with the SNP allele of lowest frequency, SNP 3; recombination between the SNP and the causal variant could break down this synthetic association. An independent mutation event in the region gives rise to a second causal SNP, C2. Again, C2 has a higher probability of arising on the major allele of each SNP. If C2 had been the only mutation in the region, then SNP 10 would be the most highly associated, as the coupled allele has the lowest frequency. However, when both events arise in the same region, the associations at SNPs 3 and 10 are partially masked, as they carry risk variants on both their alleles. C1 and C2 arise on the same background allele for many SNPs, but SNP 8 has the allele of lowest frequency that harbours both risk alleles. In the absence of recombination, and depending on effect size, the highest association might be with SNP 8, rather than SNPs 3 or 10. Individuals are very unlikely to carry both C1 and C2. As more causal variants arise in the region, the most associated SNP will be the one with a detectable difference in the contribution to risk from the risk alleles harboured on each allele. Other representations of synthetic association could be viewed in parallel with this representation.
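
The claim that the strongest signal falls on the SNP whose coupled allele is rarest can be checked with the standard two-locus LD algebra. The sketch below is not taken from the paper's Box 2 (which is not reproduced on this page). The symbols p (frequency of C1) and q (frequency of the SNP allele on which C1 arose, with p ≤ q) are assumptions introduced for illustration, as are the numerical frequencies.

```latex
% Assumptions: no recombination, so every copy of C1 sits on the
% SNP allele of frequency q; hence the coupled haplotype has frequency p.
\begin{align}
  D   &= p - pq = p(1 - q), \\
  r^2 &= \frac{D^2}{p(1-p)\,q(1-q)} = \frac{p\,(1-q)}{q\,(1-p)} .
\end{align}
% For fixed p, r^2 decreases as q increases, so the SNP with the
% rarest coupled allele is the best proxy. Hypothetical example, p = 0.01:
%   q = 0.1 gives r^2 = 0.009 / 0.099 \approx 0.091
%   q = 0.5 gives r^2 = 0.005 / 0.495 \approx 0.010
```

Under these assumptions the ordering of SNPs by association strength depends only on q, which is why the caption singles out the SNP with the lowest-frequency coupled allele.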

Figure 2. Frequency distributions of a) the risk allele frequency of the most associated SNPs listed in the GWAS Catalog for the diseases in Table 3.

b) MAF of all SNPs simulated under the coalescence model, c) MAF of SNPs used in analyses to be representative of SNPs included in GWAS. d–f) Coupled allele of most associated SNP from simulations of 1, 9, or 36 causal variants in a 100 kb region.

Figure 3. Minimum fold increase in genetic variance at single rare causal locus given the frequency of the risk allele at the genotyped associated locus.

The minimum fold increase is calculated as 1/r², with r² calculated as the maximum r² given the frequency of the trait-increasing allele at the genotyped SNP and the frequency of the causal allele (see Box 2).
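
As a rough illustration of the quantity plotted in Figure 3, the following Python sketch computes the maximum r² for two coupled alleles and the corresponding minimum fold increase 1/r². The function names and the example frequencies are hypothetical. The closed form used for the maximum r² is the standard bound for two biallelic loci in coupling and is assumed here to match Box 2, which is not reproduced on this page.

```python
def max_r2(p_causal, q_marker):
    """Maximum r^2 between a causal allele (frequency p_causal) and the
    marker allele it is coupled with (frequency q_marker).

    Assumption: the standard coupling-phase bound,
    r^2_max = lo * (1 - hi) / (hi * (1 - lo)), with lo <= hi.
    """
    for freq in (p_causal, q_marker):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    lo, hi = sorted((p_causal, q_marker))
    return lo * (1.0 - hi) / (hi * (1.0 - lo))


def min_fold_increase(p_causal, q_marker):
    """Minimum fold increase in genetic variance at the causal locus,
    relative to the variance attributed to the genotyped SNP: 1 / r^2_max."""
    return 1.0 / max_r2(p_causal, q_marker)


if __name__ == "__main__":
    # Hypothetical causal allele frequency of 1% against markers of
    # increasing risk-allele frequency.
    for q in (0.05, 0.2, 0.5):
        print(q, round(min_fold_increase(0.01, q), 1))
```

The fold increase grows quickly as the marker allele becomes more common than the causal allele, which is the pattern the figure describes.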

Figure 4. Polygenic analyses following the International Schizophrenia Consortium.

a) The original results for polygenic score analysis in the ISC, when stratified by quintile of risk-increasing allele frequency (Q1 being the lowest risk-increasing allele frequency, Q5 the most common; the range is between 0.02 and 0.98). b) We repeated these analyses on simulated data, generated under a “rare variant only” model and using the same simulation procedure as Dickson et al. (assuming that risk loci harbor 9 causal variants, GRR = 4, MAF 0.005–0.02). The pile-up of signal in the lower quintiles, which is expected under Dickson et al.'s model, is clearly not consistent with the observed ISC results. In the simulations, SNPs are generated through a coalescent process; a subset of SNPs is selected as “genotyped” to represent the marker density, frequency distribution, and LD profile observed in the original ISC study (which has properties that are typical of most GWAS, including the under-representation of low-frequency variants). The y axis is the −log10(P) from the logistic regression of case-control status on profile score in an independent “target” case-control sample. The score is calculated as the number of alleles identified as associated (with p-value less than a threshold pT) in the discovery case-control sample association analysis. Values are scaled within each figure so that the maximum value observed for the five significance thresholds (pT = 0.1, 0.2, 0.3, 0.4, and 0.5, plotted left to right in each quintile) is scaled to 1 and the minimum is scaled to zero.
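
The scoring and scaling steps described in the Figure 4 caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and toy inputs are hypothetical, the score is the unweighted allele count the caption describes, and the logistic regression step is omitted.

```python
import math


def profile_score(risk_allele_counts, discovery_p_values, p_threshold):
    """Count the risk alleles one individual carries at SNPs whose
    discovery-sample p-value is below p_threshold (pT in the caption).

    risk_allele_counts: 0, 1, or 2 per SNP (hypothetical input format).
    """
    return sum(
        count
        for count, p in zip(risk_allele_counts, discovery_p_values)
        if p < p_threshold
    )


def neg_log10(p_value):
    """Transform a regression p-value to the -log10 scale of the y axis."""
    return -math.log10(p_value)


def scale_to_unit_range(values):
    """Min-max scale so that the largest value maps to 1 and the smallest to 0."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


if __name__ == "__main__":
    counts = [2, 1, 0, 1]               # toy genotypes for one individual
    p_vals = [0.05, 0.3, 0.01, 0.6]     # toy discovery p-values
    for p_t in (0.1, 0.2, 0.3, 0.4, 0.5):
        print(p_t, profile_score(counts, p_vals, p_t))
```

In a full analysis, each target individual's score would feed a logistic regression on case-control status, and the resulting −log10(P) values across quintiles and thresholds would be passed through the scaling step before plotting.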

Comment in

The importance of synthetic associations will only be resolved empirically.
Goldstein DB. PLoS Biol. 2011 Jan 18;9(1):e1001008. doi: 10.1371/journal.pbio.1001008. PMID: 21267066. Free PMC article. No abstract available.

Comment on

Rare variants create synthetic genome-wide associations.
Dickson SP, Wang K, Krantz I, Hakonarson H, Goldstein DB. PLoS Biol. 2010 Jan 26;8(1):e1000294. doi: 10.1371/journal.pbio.1000294. PMID: 20126254. Free PMC article.

Cited by

A systematic evaluation of the performance and properties of the UK Biobank Polygenic Risk Score (PRS) Release.
Thompson DJ, Wells D, Selzam S, Peneva I, Moore R, Sharp K, Tarran WA, Beard EJ, Riveros-Mckay F, Giner-Delgado C, Palmer D, Seth P, Harrison J, Futema M; Genomics England Research Consortium; McVean G, Plagnol V, Donnelly P, Weale ME. PLoS One. 2024 Sep 18;19(9):e0307270. doi: 10.1371/journal.pone.0307270. eCollection 2024. PMID: 39292644. Free PMC article.
Inferring causal direction between two traits using R2 with application to transcriptome-wide association studies.
Liao H, Xue H, Pan W. Am J Hum Genet. 2024 Aug 8;111(8):1782-1795. doi: 10.1016/j.ajhg.2024.06.013. Epub 2024 Jul 24. PMID: 39053457.
Genetic analysis of cassava brown streak disease root necrosis using image analysis and genome-wide association studies.
Nandudu L, Strock C, Ogbonna A, Kawuki R, Jannink JL. Front Plant Sci. 2024 Mar 18;15:1360729. doi: 10.3389/fpls.2024.1360729. eCollection 2024. PMID: 38562560. Free PMC article.
kGWASflow: a modular, flexible, and reproducible Snakemake workflow for k-mers-based GWAS.
Corut AK, Wallace JG. G3 (Bethesda). 2023 Dec 29;14(1):jkad246. doi: 10.1093/g3journal/jkad246. PMID: 37976215. Free PMC article.
Refining the genetic risk of breast cancer with rare haplotypes and pattern mining.
Letsou W, Wang F, Moon W, Im C, Sapkota Y, Robison LL, Yasui Y. Life Sci Alliance. 2023 Aug 4;6(10):e202302183. doi: 10.26508/lsa.202302183. Print 2023 Oct. PMID: 37541849. Free PMC article.

References

1. Hindorff L. A, Junkins H. A, Hall P. N, Mehta J. P, Manolio T. A. 2009. A catalog of published genome-wide association studies. Available: http://www.genome.gov/gwastudies. Accessed 7 December 2010.
2. Maher B. Personal genomes: the case of the missing heritability. Nature. 2008;456:18–21.
3. Manolio T. A, Collins F. S, Cox N. J, Goldstein D. B, Hindorff L. A, et al. Finding the missing heritability of complex diseases. Nature. 2009;461:747–753.
4. Dickson S. P, Wang K, Krantz I, Hakonarson H, Goldstein D. B. Rare variants create synthetic genome-wide associations. PLoS Biol. 2010;8:e1000294. doi: 10.1371/journal.pbio.1000294.
5. Wang K, Dickson S. P, Stolle C. A, Krantz I. D, Goldstein D. B, et al. Interpretation of association signals and identification of causal variants from genome-wide association studies. Am J Hum Genet. 2010;86:730–742.
