Genetic variation influencing DNA methylation provides insights into molecular mechanisms regulating genomic function
Data availability
Summary statistics for the 11.2 million SNP–CpG pairs reaching genome-wide significance are available at https://zenodo.org/record/5196216#.YRZ3TfJxeUk. ChIP–seq data for ZNF333 are available through the NCBI SRA (accession code SRP284104). Raw genotype, methylation and expression data can be made available from the authors upon reasonable request. Controlled access to data from the KORA cohort can be obtained through https://epi.helmholtz-muenchen.de. The web links for the publicly available datasets used in the study are as follows: PhenoScanner version 2 (http://www.phenoscanner.medschl.cam.ac.uk), GWAS catalog (https://www.ebi.ac.uk/gwas/docs/file-downloads), meQTL and eQTM data from Bonder et al. 2017 (ref. 14) (https://molgenis26.gcc.rug.nl/downloads/biosqtlbrowser/2015_09_02_Primary_cis_meQTLsFDR0.05-ProbeLevel.zip, https://molgenis26.gcc.rug.nl/downloads/biosqtlbrowser/2015_09_02_trans_meQTLsFDR0.05-CpGLevel.txt, https://molgenis26.gcc.rug.nl/downloads/biosqtlbrowser/2015_09_02_cis_eQTMsFDR0.05-CpGLevel.txt), GTEx version 6 eQTL results (https://storage.googleapis.com/gtex_analysis_v6/single_tissue_eqtl_data/GTEx_Analysis_V6_eQTLs.tar.gz), eQTLGen cis eQTL results (https://molgenis26.gcc.rug.nl/downloads/eqtlgen/cis-eqtl/cis-eQTLs_full_20180905.txt.gz), TWAS hub (http://twas-hub.org/genes/UBASH3B/), GWAS summary statistics of 114 traits for colocalization analysis (https://zenodo.org/record/3629742), ChIP–seq binding sites (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRegTfbsClustered/wgEncodeRegTfbsClusteredWithCellsV3.bed.gz, http://tagc.univ-mrs.fr/remap/download/All/filPeaks_public.bed.gz), chromHMM states (http://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/coreMarks/jointModel/final/all.mnemonics.bedFiles.tgz), Hi-C data (EGAD00001003106) and PPIs (http://string90.embl.de/newstring_download/protein.links.detailed.v9.0.txt.gz). Source data are provided with this paper.
Code availability
References
- Bird, A. Perceptions of epigenetics. Nature 447, 396–398 (2007).
- Schubeler, D. Function and information content of DNA methylation. Nature 517, 321–326 (2015).
- Parry, A., Rulands, S. & Reik, W. Active turnover of DNA methylation during cell fate decisions. Nat. Rev. Genet. 22, 59–66 (2021).
- Jaenisch, R. & Bird, A. Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat. Genet. 33, 245–254 (2003).
- Chambers, J. C. et al. Epigenome-wide association of DNA methylation markers in peripheral blood from Indian Asians and Europeans with incident type 2 diabetes: a nested case–control study. Lancet Diabetes Endocrinol. 3, 526–534 (2015).
- Marioni, R. E. et al. DNA methylation age of blood predicts all-cause mortality in later life. Genome Biol. 16, 25 (2015).
- van der Harst, P., de Windt, L. J. & Chambers, J. C. Translational perspective on epigenetics in cardiovascular disease. J. Am. Coll. Cardiol. 70, 590–606 (2017).
- Wahl, S. et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 541, 81–86 (2017).
- Zhang, Y. et al. DNA methylation signatures in peripheral blood strongly predict all-cause mortality. Nat. Commun. 8, 14617 (2017).
- Sugiura, M. et al. Epigenetic modifications in prostate cancer. Int. J. Urol. 28, 140–149 (2020).
- Blokhin, I. O., Khorkova, O., Saveanu, R. V. & Wahlestedt, C. Molecular mechanisms of psychiatric diseases. Neurobiol. Dis. 146, 105136 (2020).
- Darwiche, N. Epigenetic mechanisms and the hallmarks of cancer: an intimate affair. Am. J. Cancer Res. 10, 1954–1978 (2020).
- Bonder, M. J. et al. Genetic and epigenetic regulation of gene expression in fetal and adult human livers. BMC Genomics 15, 860 (2014).
- Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
- Gibbs, J. R. et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet. 6, e1000952 (2010).
- Grundberg, E. et al. Global analysis of DNA methylation variation in adipose tissue from twins reveals links to disease-associated variants in distal regulatory elements. Am. J. Hum. Genet. 93, 876–890 (2013).
- Gutierrez-Arcelus, M. et al. Passive and active DNA methylation and the interplay with genetic variation in gene regulation. eLife 2, e00523 (2013).
- Lemire, M. et al. Long-range epigenetic regulation is conferred by genetic variation located at thousands of independent loci. Nat. Commun. 6, 6326 (2015).
- Huan, T. et al. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat. Commun. 10, 4267 (2019).
- Hannon, E. et al. Leveraging DNA-methylation quantitative-trait loci to characterize the relationship between methylomic variation, gene expression, and complex traits. Am. J. Hum. Genet. 103, 654–665 (2018).
- Gaunt, T. R. et al. Systematic identification of genetic influences on methylation across the human life course. Genome Biol. 17, 61 (2016).
- McRae, A. F. et al. Identification of 55,000 replicated DNA methylation QTL. Sci. Rep. 8, 17605 (2018).
- Hop, P. J. et al. Genome-wide identification of genes regulating DNA methylation using genetic anchors for causal inference. Genome Biol. 21, 220 (2020).
- Peterson, R. E. et al. Genome-wide association studies in ancestrally diverse populations: opportunities, methods, pitfalls, and recommendations. Cell 179, 589–603 (2019).
- Bell, C. G. et al. Obligatory and facilitative allelic variation in the DNA methylome within common disease-associated loci. Nat. Commun. 9, 8 (2018).
- Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
- Brenner, C. et al. Myc represses transcription through recruitment of DNA methyltransferase corepressor. EMBO J. 24, 336–346 (2005).
- Esteve, P. O., Chin, H. G. & Pradhan, S. Human maintenance DNA (cytosine-5)-methyltransferase and p53 modulate expression of p53-repressed promoters. Proc. Natl Acad. Sci. USA 102, 1000–1005 (2005).
- Shlyueva, D., Stampfel, G. & Stark, A. Transcriptional enhancers: from properties to genome-wide predictions. Nat. Rev. Genet. 15, 272–286 (2014).
- Visel, A., Rubin, E. M. & Pennacchio, L. A. Genomic views of distant-acting enhancers. Nature 461, 199–205 (2009).
- Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384 (2016).
- Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
- Liu, Y., Toh, H., Sasaki, H., Zhang, X. & Cheng, X. An atomic model of Zfp57 recognition of CpG methylation within a specific DNA sequence. Genes Dev. 26, 2374–2379 (2012).
- Shi, H. et al. ZFP57 regulation of transposable elements and gene expression within and beyond imprinted domains. Epigenetics Chromatin 12, 49 (2019).
- Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in approximately 700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).
- Lee, S. T. et al. Protein tyrosine phosphatase UBASH3B is overexpressed in triple-negative breast cancer and promotes invasion and metastasis. Proc. Natl Acad. Sci. USA 110, 11121–11126 (2013).
- Pulit, S. L. et al. Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry. Hum. Mol. Genet. 28, 166–174 (2019).
- Kichaev, G. et al. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 104, 65–75 (2019).
- Zhu, Z. et al. Shared genetic and experimental links between obesity-related traits and asthma subtypes in UK Biobank. J. Allergy Clin. Immunol. 145, 537–549 (2020).
- Richardson, T. G. et al. Evaluating the relationship between circulating lipoprotein lipids and apolipoproteins with risk of coronary heart disease: a multivariable Mendelian randomisation analysis. PLoS Med. 17, e1003062 (2020).
- Konieczna, J., Sanchez, J., Palou, M., Pico, C. & Palou, A. Blood cell transcriptomic-based early biomarkers of adverse programming effects of gestational calorie restriction and their reversibility by leptin supplementation. Sci. Rep. 5, 9088 (2015).
- Mancuso, N. et al. Integrating gene expression with summary association statistics to identify genes associated with 30 complex traits. Am. J. Hum. Genet. 100, 473–487 (2017).
- Okada, Y. et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature 506, 376–381 (2014).
- Emery, P. et al. IL-6 receptor inhibition with tocilizumab improves treatment outcomes in patients with rheumatoid arthritis refractory to anti-tumour necrosis factor biologicals: results from a 24-week multicentre randomised placebo-controlled trial. Ann. Rheum. Dis. 67, 1516–1523 (2008).
- Navarro-Millan, I., Singh, J. A. & Curtis, J. R. Systematic review of tocilizumab for rheumatoid arthritis: a new biologic agent targeting the interleukin-6 receptor. Clin. Ther. 34, 788–802 (2012).
- Wen, X., Pique-Regi, R. & Luca, F. Integrating molecular QTL data into genome-wide genetic association analysis: probabilistic assessment of enrichment and colocalization. PLoS Genet. 13, e1006646 (2017).
- Burnichon, N. et al. MAX mutations cause hereditary and sporadic pheochromocytoma and paraganglioma. Clin. Cancer Res. 18, 2828–2837 (2012).
- Li, H. et al. Novel treatment of hypertension by specifically targeting E2F for restoration of endothelial dihydrofolate reductase and eNOS function under oxidative stress. Hypertension 73, 179–189 (2019).
- Burstein, E. et al. COMMD proteins, a novel family of structural and functional homologs of MURR1. J. Biol. Chem. 280, 22222–22232 (2005).
- Astle, W. J. et al. The allelic landscape of human blood cell trait variation and links to common complex disease. Cell 167, 1415–1429 (2016).
- Suhail, A. et al. DeSUMOylase SENP7-mediated epithelial signaling triggers intestinal inflammation via expansion of γδ T cells. Cell Rep. 29, 3522–3538 (2019).
- Jing, Z., Liu, Y., Dong, M., Hu, S. & Huang, S. Identification of the DNA binding element of the human ZNF333 protein. J. Biochem. Mol. Biol. 37, 663–670 (2004).
- Chen, M. H. et al. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell 182, 1198–1213 (2020).
- Nedelec, Y. et al. Genetic ancestry and natural selection drive population differences in immune responses to pathogens. Cell 167, 657–669 (2016).
- Joehanes, R. et al. Epigenetic signatures of cigarette smoking. Circ. Cardiovasc. Genet. 9, 436–447 (2016).
- Singmann, P. et al. Characterization of whole-genome autosomal differences of DNA methylation between men and women. Epigenetics Chromatin 8, 43 (2015).
- Zeilinger, S. et al. Tobacco smoking leads to extensive genome-wide changes in DNA methylation. PLoS ONE 8, e63812 (2013).
- Giri, A. K. et al. DNA methylation profiling reveals the presence of population-specific signatures correlating with phenotypic characteristics. Mol. Genet. Genomics 292, 655–662 (2017).
- Breeze, C. E. et al. eFORGE: a tool for identifying cell type-specific signal in epigenomic data. Cell Rep. 17, 2137–2150 (2016).
- Westra, H. J. et al. Cell specific eQTL analysis without sorting cells. PLoS Genet. 11, e1005223 (2015).
- Guan, W. et al. Genome-wide association study of plasma _N_6 polyunsaturated fatty acids within the cohorts for heart and aging research in genomic epidemiology consortium. Circ. Cardiovasc. Genet. 7, 321–331 (2014).
- Shin, S. Y. et al. An atlas of genetic influences on human blood metabolites. Nat. Genet. 46, 543–550 (2014).
- Kamat, M. A. et al. PhenoScanner V2: an expanded tool for searching human genotype–phenotype associations. Bioinformatics 35, 4851–4853 (2019).
- Gelfand, E. W. & Dakhama, A. CD8+ T lymphocytes and leukotriene B4: novel interactions in the persistence and progression of asthma. J. Allergy Clin. Immunol. 117, 577–582 (2006).
- Cho, S. H., Stanciu, L. A., Holgate, S. T. & Johnston, S. L. Increased interleukin-4, interleukin-5, and interferon-γ in airway CD4+ and CD8+ T cells in atopic asthma. Am. J. Respir. Crit. Care Med. 171, 224–230 (2005).
- Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
- Ernst, J. & Kellis, M. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28, 817–825 (2010).
- Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
- Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).
- GTEx Consortium et al. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
- Kim, K. A. et al. Environmental risk factors and comorbidities of primary biliary cholangitis in Korea: a case–control study. Korean J. Intern. Med. 36, 313–321 (2020).
- GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
- Staley, J. R. et al. PhenoScanner: a database of human genotype–phenotype associations. Bioinformatics 32, 3207–3209 (2016).
- Griffon, A. et al. Integrative analysis of public ChIP–seq experiments reveals a complex multi-cell regulatory landscape. Nucleic Acids Res. 43, e27 (2015).
- Franceschini, A. et al. STRING v9.1: protein–protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 41, D808–D815 (2013).
- Suthram, S., Beyer, A., Karp, R. M., Eldar, Y. & Ideker, T. eQED: an efficient method for interpreting eQTL associations using protein networks. Mol. Syst. Biol. 4, 162 (2008).
- Tu, Z., Wang, L., Arbeitman, M. N., Chen, T. & Sun, F. An integrative approach for causal gene identification and gene regulatory pathway inference. Bioinformatics 22, e489–96 (2006).
- Haghverdi, L., Buettner, F. & Theis, F. J. Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 31, 2989–2998 (2015).
- Haghverdi, L., Buttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
- Schramm, K. et al. Mapping the genetic architecture of gene regulation in whole blood. PLoS ONE 9, e93844 (2014).
- Benjamini, Y., Drai, D., Elmer, G., Kafkafi, N. & Golani, I. Controlling the false discovery rate in behavior genetics research. Behav. Brain Res. 125, 279–284 (2001).
- Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
- Taylor-Weiner, A. et al. Scaling computational genomics to millions of individuals with GPUs. Genome Biol. 20, 228 (2019).
- Hawe, J. S., Heinig, M. & Loh, M. Code for the analyses described in Hawe et al. Nature Genetics. Zenodo https://doi.org/10.5281/zenodo.5529828 (2021).
Acknowledgements
The KORA study was initiated and financed by the Helmholtz Zentrum München (German Research Center for Environmental Health), which is funded by the German Federal Ministry of Education and Research (BMBF) and by the state of Bavaria. KORA research was supported within the Munich Center of Health Sciences (MC-Health), Ludwig-Maximilians-Universität, as part of LMUinnovativ. The work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the EU Joint Programming Initiative ‘a Healthy Diet for a Healthy Life’ (DIMENSION grant number 01EA1902A). The work was further supported by the Bavarian State Ministry of Health and Care through the research project DigiMed Bayern (https://www.digimed-bayern.de/). The German Diabetes Center (DDZ) is supported by the Ministry of Culture and Science of the State of North Rhine–Westphalia and the German Federal Ministry of Health. This study was supported in part by a grant from the German Federal Ministry of Education and Research to the German Center for Diabetes Research (DZD). The LOLIPOP study is supported by the National Institute for Health Research (NIHR) Comprehensive Biomedical Research Centre Imperial College Healthcare NHS Trust, the British Heart Foundation (SP/04/002), the Medical Research Council (G0601966, G0700931), the Wellcome Trust (084723/Z/08/Z), the NIHR (RP-PG-0407-10371), European Union FP7 (EpiMigrant, 279143) and European Union Horizon 2020 (iHealth-T2D, 643774). B.C.L. is supported by the Imperial College Junior Research Fellowship scheme as well as an Academy of Medical Sciences Springboard award. J.C.C. is also supported by the Singapore NMRC (NMRC/STaR/0028/2017). We thank the participants and research staff who made the study possible. For the Northern Finnish Birth Cohort studies, M. Wielscher was supported by the European Union’s Horizon 2020 research and innovation program (grant 633212). 
NFBC1966 received financial support from the Academy of Finland (grants 104781, 120315, 129269, 1114194 and 24300796, Center of Excellence in Complex Disease Genetics and SALVE), University Hospital Oulu, Biocenter, University of Oulu, Finland (75617), NHLBI grant 5R01HL087679-02 through the STAMPEED program (1RL1MH083268-01), the NIH–NIMH (5R01MH63706:02), the ENGAGE project and grant agreement HEALTH-F4-2007-201413, EU FP7 EurHEALTHAgeing (277849), the Medical Research Council, UK (G0500539, G0600705, G1002319, PrevMetSyn/SALVE) and an MRC Centenary Early Career Award. NFBC1986 received financial support from EU QLG1-CT-2000-01643 (EUROBLCS) grant E51560, NorFA grant nos. 731, 20056 and 30167 and USA/NIHH 2000 G DF682 grant 50945. The NFBC programs are also funded by the H2020-633595 DynaHEALTH action, the Academy of Finland Exposomic, Genomic and Epigenomic Approach to Prediction of Metabolic and Cardiorespiratory Function and Ill-Health project (285547) and the EU H2020 ALEC project (grant agreement 633212). The MuTHER study was funded by the WT (081917/Z/07/Z). TwinsUK was funded by the WT and the European Community’s Seventh Framework Programme (FP7/2007-2013). The study also received support from the NIHR Clinical Research Facility at Guy’s and St. Thomas’ and King’s College London. Analysis was funded by British Heart Foundation grant RG/14/5/30893 to P.D. and forms part of the research themes contributing to the translational research portfolio of the Barts Cardiovascular Biomedical Research Unit, which is funded by the NIHR. The Saguenay Youth Study has been funded by the Canadian Institutes of Health Research (T.P., Z.P.), the Heart and Stroke Foundation of Canada (Z.P.) and the Canadian Foundation for Innovation (Z.P.). We acknowledge G. Möller and J. Adamski (Helmholtz Center Munich) for their support in the IP–MS transfection experiment. 
We used data generated by the PCHI-C Consortium (ref. 31), funded by the UK NIHR, the Medical Research Council (MR/L007150/1) and the Biotechnology and Biological Sciences Research Council (BB/J004480/1).
Author information
Author notes
- These authors contributed equally: Johann S. Hawe, Rory Wilson, Katharina Schmid.
- These authors jointly supervised this work: Jaspal S. Kooner, Marie Loh, Matthias Heinig, Christian Gieger, Melanie Waldenberger, John C. Chambers.
Authors and Affiliations
- Institute of Computational Biology, Deutsches Forschungszentrum für Gesundheit und Umwelt, Helmholtz Zentrum München, Neuherberg, Germany
Johann S. Hawe, Katharina T. Schmid, Eudes G. V. Barbosa & Matthias Heinig
- Department of Informatics, Technical University of Munich, Garching bei München, Germany
Johann S. Hawe, Katharina T. Schmid & Matthias Heinig
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Rory Wilson, Brigitte Kühnel, Clemens Baumbach, Liliane Pfeiffer, Pamela R. Matías-García, Harald Grallert, Annette Peters, Christian Gieger & Melanie Waldenberger
- Research Unit Molecular Epidemiology, Helmholtz Zentrum München, German Research Centre for Environmental Health, Neuherberg, Germany
Rory Wilson, Brigitte Kühnel, Clemens Baumbach, Liliane Pfeiffer, Pamela R. Matías-García, Harald Grallert, Annette Peters, Christian Gieger & Melanie Waldenberger
- Lee Kong Chian School of Medicine, Singapore, Singapore
Li Zhou, Lakshmi Narayanan Lakshmanan, Yik Weng Yew, Marie Loh & John C. Chambers
- Department of Epidemiology and Biostatistics, Imperial College London, London, UK
Benjamin C. Lehne, William R. Scott, Matthias Wielscher, Ville Karhunen, Weihua Zhang, Marjo-Riitta Jarvelin, Marie Loh & John C. Chambers
- Genome Institute of Singapore, Singapore, Singapore
Dominic P. Lee, Matias I. Autio, Wilson L. W. Tan & Roger S. Y. Foo
- Centre for Genomic Health, Queen Mary University of London, London, UK
Eirini Marouli & Panos Deloukas
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, UK
Eirini Marouli, Stephane Bourgeois, Josine L. Min & Panos Deloukas
- Departments of Physiology and Nutritional Sciences, University of Toronto, Toronto, Ontario, Canada
Manon Bernard, Jean Shin & Zdenka Pausova
- The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada
Manon Bernard, Jean Shin & Zdenka Pausova
- Cardiovascular Research Institute, National University Health Systems, National University of Singapore, Singapore, Singapore
Matias I. Autio & Roger S. Y. Foo
- German Center for Diabetes Research (DZD), partner site Düsseldorf, Düsseldorf, Germany
Christian Herder, Wolfgang Rathmann & Michael Roden
- Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany
Christian Herder & Michael Roden
- Division of Endocrinology and Diabetology, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
Christian Herder & Michael Roden
- Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu, Finland
Ville Karhunen, Sylvain Sebert & Marjo-Riitta Jarvelin
- Institute of Human Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Thomas Meitinger
- Institute of Human Genetics, Technical University Munich, Munich, Germany
Thomas Meitinger & Holger Prokisch
- Institute of Neurogenomics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Holger Prokisch
- Institute for Biometrics and Epidemiology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
Wolfgang Rathmann
- Biocenter Oulu, University of Oulu, Oulu, Finland
Sylvain Sebert & Marjo-Riitta Jarvelin
- Department for Genomics of Common Diseases, School of Public Health, Imperial College London, London, UK
Sylvain Sebert
- Chair of Genetic Epidemiology, IBE, Faculty of Medicine, LMU Munich, Munich, Germany
Konstantin Strauch
- Institute of Genetic Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Konstantin Strauch
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
Konstantin Strauch
- Department of Cardiology, Ealing Hospital, London North West Healthcare NHS Trust, Southall, UK
Weihua Zhang
- Research Unit Protein Science, Helmholtz Zentrum München, German Research Centre for Environmental Health, Munich, Germany
Stefanie M. Hauck & Juliane Merl-Pham
- German Center for Diabetes Research (DZD), Munich-Neuherberg, Germany
Harald Grallert
- Hannover Unified Biobank, Hannover Medical School, Hannover, Germany
Thomas Illig
- Institute for Human Genetics, Hannover Medical School, Hannover, Germany
Thomas Illig
- German Research Center for Cardiovascular Disease (DZHK), partner site Munich Heart Alliance, Hannover, Germany
Annette Peters & Melanie Waldenberger
- Centre Hospitalier Universitaire Sainte-Justine, University of Montreal, Montreal, Canada
Tomas Paus
- Unit of Primary Care, Oulu University Hospital, Oulu, Finland
Marjo-Riitta Jarvelin
- National Heart and Lung Institute, Imperial College London, London, UK
Jaspal S. Kooner
- Department of Twin Research and Genetic Epidemiology, King’s College London, London, UK
Kourosh R. Ahmadi, Veronique Bataille, Jordana T. Bell, Daniel Glass, Kerrin S. Small, Tim D. Spector & Gabriela Surdulescu
- Department of Informatics, School of Natural and Mathematical Sciences, King’s College London, Strand, London, UK
Chrysanthi Ainali & Sophia Tsoka
- Oxford Centre for Diabetes, Endocrinology & Metabolism, University of Oxford, Churchill Hospital, Headington, Oxford, UK
Amy Barrett, Neelam Hassanali, Mark I. McCarthy & Mary E. Travers
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
Alfonso Buil, Emmanouil T. Dermitzakis, Antigone S. Dimas, Stephen B. Montgomery & Alexandra C. Nica
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Richard Durbin, Catherine Ingle, Eshwar Meduri, James Nisbet, Leopold Parts, Simon Potter, Johanna Sandling, Magdalena Sekowska, So-Youn Shin, Nicole Soranzo, Loukia Tsaprouni, Alicja Wilk & Tsun-Po Yang
- Children’s Mercy Hospitals and Clinics, Kansas City, MO, USA
Elin Grundberg
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
Åsa K. Hedman, Cecilia M. Lindgren, Mark I. McCarthy & Krina T. Zondervan
- University of Cambridge, Cambridge, UK
David Knowles
- European Bioinformatics Institute, Hinxton, UK
Maria Krestyaninova
- University of Cambridge Metabolic Research Labs, Institute of Metabolic Science, Addenbrooke’s Hospital Cambridge, Cambridge, UK
Christopher E. Lowe & Stephen O’Rahilly
- Cambridge NIHR Biomedical Research Centre, Addenbrooke’s Hospital, Cambridge, UK
Christopher E. Lowe & Stephen O’Rahilly
- St. John’s Institute of Dermatology, King’s College London, London, UK
Paola di Meglio & Frank O. Nestle
Authors
- Johann S. Hawe
- Rory Wilson
- Katharina T. Schmid
- Li Zhou
- Lakshmi Narayanan Lakshmanan
- Benjamin C. Lehne
- Brigitte Kühnel
- William R. Scott
- Matthias Wielscher
- Yik Weng Yew
- Clemens Baumbach
- Dominic P. Lee
- Eirini Marouli
- Manon Bernard
- Liliane Pfeiffer
- Pamela R. Matías-García
- Matias I. Autio
- Stephane Bourgeois
- Christian Herder
- Ville Karhunen
- Thomas Meitinger
- Holger Prokisch
- Wolfgang Rathmann
- Michael Roden
- Sylvain Sebert
- Jean Shin
- Konstantin Strauch
- Weihua Zhang
- Wilson L. W. Tan
- Stefanie M. Hauck
- Juliane Merl-Pham
- Harald Grallert
- Eudes G. V. Barbosa
- Thomas Illig
- Annette Peters
- Tomas Paus
- Zdenka Pausova
- Panos Deloukas
- Roger S. Y. Foo
- Marjo-Riitta Jarvelin
- Jaspal S. Kooner
- Marie Loh
- Matthias Heinig
- Christian Gieger
- Melanie Waldenberger
- John C. Chambers
Consortia
MuTHER Consortium
- Kourosh R. Ahmadi
- Chrysanthi Ainali
- Amy Barrett
- Veronique Bataille
- Jordana T. Bell
- Alfonso Buil
- Emmanouil T. Dermitzakis
- Antigone S. Dimas
- Richard Durbin
- Daniel Glass
- Elin Grundberg
- Neelam Hassanali
- Åsa K. Hedman
- Catherine Ingle
- David Knowles
- Maria Krestyaninova
- Cecilia M. Lindgren
- Christopher E. Lowe
- Mark I. McCarthy
- Eshwar Meduri
- Paola di Meglio
- Josine L. Min
- Stephen B. Montgomery
- Frank O. Nestle
- Alexandra C. Nica
- James Nisbet
- Stephen O’Rahilly
- Leopold Parts
- Simon Potter
- Johanna Sandling
- Magdalena Sekowska
- So-Youn Shin
- Kerrin S. Small
- Nicole Soranzo
- Tim D. Spector
- Gabriela Surdulescu
- Mary E. Travers
- Loukia Tsaprouni
- Sophia Tsoka
- Alicja Wilk
- Tsun-Po Yang
- Krina T. Zondervan
Contributions
Data collection and analysis in the contributing population studies: KORA, A.P., B.K., C.G., C.H., C.B., H.P., K. Strauch, L. Pfeiffer, M. Waldenberger, M.R., R.W., T.I., T.M. and W.R.; LOLIPOP, B.C.L., J.S.K., J.C.C., W.Z. and W.R.S.; MuTHER, E. Marouli; MuTHER Consortium, P.D. and S.B.; NFBC, M.-R.J., M. Wielscher, S.S. and V.K.; SYS, J. Shin, M.B., T.P. and Z.P. Data collection and molecular follow-up analyses: ChIP–seq, D.P.L., M.I.A., R.S.Y.F. and W.L.W.T.; ChIP–MS, S.M.H., J.M.-P. and P.R.M.-G. Data analysis and writing group (alphabetical order): J.C.C., J.S.H., M.H., C.G., B.C.L., M.L., K. Schmid, M. Waldenberger and R.W.
Corresponding authors
Correspondence to Jaspal S. Kooner, Marie Loh, Matthias Heinig, Christian Gieger, Melanie Waldenberger or John C. Chambers.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Genetics thanks Charles Danko and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 2 Replication testing of meQTLs within and across ancestries.
a: Ancestry-specific replication of SNP-CpG pairs identified by genome-wide association. Effect size: change in methylation (0-1 scale) per allele copy of the SNP. Axes set to [-0.5,0.5]. b: Ancestry-specific replication, by pair proximity and MAF. Bars: no. of pairs identified in discovery in given category. Blue: replicated; yellow: not replicated. c: Cross-ancestry replication, by pair proximity and MAF. Top: discovery in EU, replication in SA; bottom: discovery in SA, replication in EU. Bars: no. of pairs identified in discovery in given category. Blue: replicated; yellow: not replicated. d: Cross-platform: replication in KORA F4 (N<=1731) of published MeDIP-seq meQTLs, by significance threshold. Blue lines: no. of replicated results (of 328); histograms: no. of replicated results over 100 randomly selected matched datasets. P-values: one-sided, no adjustment for multiple testing. See Methods for test description. EU: European; SA: South Asian; MAF: Minor allele frequency.
Extended Data Fig. 3 Variance in DNA methylation explained by meQTL SNPs.
Histograms showing the proportions of variance of DNA methylation explained by genetic variants in both populations when variants are located in cis (left), long-range cis (middle) or trans (right) of the associated CpG site. EU: European; SA: South Asian.
Extended Data Fig. 4 Analysis of proximity between meQTL SNPs and CpGs.
Panel a: Histogram showing, for each CpG site, the genomic distance between the CpG and the closest associated SNP from the cosmopolitan set of 10,346,172 SNP-CpG pairs identified in cis (association confirmed in both Europeans and South Asians). Panel b: Boxplots showing the proportion of SNP-CpG pairs that reach genome-wide significance for different distance categories (x-axis), compared to SNP-CpG pairs on different chromosomes (trans). 1,000 random samples of 10,000 SNPs were taken. P-values above each box are based on a comparison (one-sided t-test) between the proportion of SNP-CpG pairs in trans that reach significance and the proportion that reach significance in the respective same-chromosome distance window. Boxplots show medians (center lines), first and third quartiles (lower and upper box limits, respectively), 1.5-fold interquartile ranges (whisker extents) and outliers (black circles).
Extended Data Fig. 5 Functional genomic context of meQTL SNPs and CpGs.
Panel a: Genomic overlap between chromatin state annotations (15-state model; Roadmap Epigenomics Project) and SNPs/CpGs identified by genome-wide association and cross-ancestry replication testing. Results are presented as a heatmap showing the P-values for enrichment (blue) or depletion (yellow) in the respective chromatin state (two-sided t-test). P-values have been Bonferroni-adjusted for the total number of tests (see Methods for details). Panel b: Colocalisation of SNPs and CpG sites in promoter and enhancer chromatin states. The histograms show the frequency at which CpG sites that localise in promoter or enhancer chromatin states have at least one _cis_-meQTL SNP that localises to the same chromatin state. Observed (turquoise) _cis_-meQTL pairs colocalise to the same chromatin state more frequently than matched background SNP-CpG pairs (grey). Panel c: Distance distributions for cis SNP-CpG pairs 1) localising to the same state (left), 2) where one entity localises to a promoter/enhancer state and the other to neither a promoter nor an enhancer state (center) and 3) where one entity localises to a promoter and the other to an enhancer state (right). Panel d: Overlap of SNP-CpG associations with chromatin contacts in primary cells. The x-axis shows the fraction of SNP-CpG pairs that localise within the same topologically associated domain (TAD, left panel) or that overlap with Hi-C contacts (center and right panels). The left panel shows localisation of long-range _cis_-meQTLs within the same TAD. The center panel shows the overlap of long-range _cis_-meQTLs (same chromosome, distance SNP - CpG > 1Mb) with contacts from promoter capture Hi-C (PCHi-C). The right panel shows overlap of _trans_-meQTLs with Hi-C contacts. The blue vertical arrows indicate the overlap observed in the data. The grey histograms show the distribution of the fraction of randomly sampled SNP-CpG pairs overlapping contact regions for each category.
Extended Data Fig. 6 Enrichment of meQTL SNPs and CpGs for association with gene expression.
Sentinel meQTL SNPs and CpGs are enriched for association with gene expression in cis and trans (SNPs) and only in cis (CpGs). Panel a: Results are presented as the proportion of SNPs that are observed to be associated with gene expression in cis (top row) or in trans (bottom row), stratified by proximity between SNP and CpG for the respective SNP-CpG pair (cis, long-range cis and trans from left to right). Panel b: Similarly, results are presented as the proportion of CpGs that are observed to be associated with gene expression in cis (top row) or in trans (bottom row), stratified by proximity between SNP and CpG for the respective SNP-CpG pair (cis, long-range cis and trans from left to right). Both panels: In each plot, the observed proportion (yellow boxplots) is compared to the proportion expected under the null hypothesis based on permutation testing (blue boxplots, see Methods). Inset in each figure is the P-value for comparison between observed and expected proportions (t-test). Boxplots show medians (center lines), first and third quartiles (lower and upper box limits, respectively), 1.5-fold interquartile ranges (whisker extents) and outliers (black circles). Proportions were calculated based on 100 sets of permutations with 1,000 SNPs (Panel a) or 1,000 CpGs (Panel b) in each permutation.
Extended Data Fig. 7 Enrichment of meQTL SNPs and CpGs for associations with phenotypic traits.
(A) SNPs influencing DNA methylation (left panel) and SNPs identified to be population interacting meQTL based on our cosmopolitan discovery analysis (right panel) are both enriched for association with phenotypic traits. Analysis was carried out using QTLEnrich and 114 uniformly processed GWAS summary statistics. The volcano plot shows the log2 fold enrichment of significant GWAS hits among iQTL on the x-axis and the -log10 of the P-value of the enrichment test on the y-axis. Each point represents one of 114 GWAS studies. The transparency of the fill colour indicates the false discovery rate (FDR < 5%: no transparency). (B) Sentinel CpGs are enriched for clinical and metabolic traits. We tested the sentinel CpGs for association with 277 available clinical and metabolic traits (NMR metabolomics). We used permutation testing to generate expectations under the null hypothesis, and to determine both the magnitude and probability for enrichment. Results show strong evidence that our genetically regulated sentinel CpGs are enriched for association with traits (enriched at P<0.05/277 for 252 phenotypes) with median enrichment 1.10 (IQR: 1.06-1.15).
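The permutation-based enrichment described in (B) can be illustrated with a minimal sketch. It assumes the simplest form of such a test: the observed number of trait-associated CpGs is compared with the counts obtained from permuted (null) sets. The function name `permutation_enrichment`, the add-one correction and the example counts are illustrative assumptions, not the authors' procedure, which is described in Methods.

```python
def permutation_enrichment(observed_hits, null_hits):
    """Fold enrichment and empirical one-sided P-value.

    observed_hits: number of associated CpGs in the real set (hypothetical input).
    null_hits: list of the same count from each permuted/null set.
    """
    mean_null = sum(null_hits) / len(null_hits)
    fold = observed_hits / mean_null
    # Add-one correction (an assumption here) keeps the empirical P above zero.
    exceed = sum(1 for h in null_hits if h >= observed_hits)
    p_value = (1 + exceed) / (1 + len(null_hits))
    return fold, p_value


# Bonferroni threshold for 277 traits, as quoted in the caption (P < 0.05/277).
BONFERRONI_ALPHA = 0.05 / 277

# Made-up counts for illustration only.
fold, p_value = permutation_enrichment(110, [100] * 99)
```

With an add-one correction, the smallest attainable P-value is 1/(number of permutations + 1), so reaching a threshold as strict as 0.05/277 would need several thousand permutations or a parametric approximation of the null distribution.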
Extended Data Fig. 8 CpG sites associated with _trans_-acting sentinel SNPs are enriched for location in transcription factor binding sites.
Heatmap showing the enrichment (or depletion) of CpG sites for _trans_-acting sentinel SNPs (x-axis) with the DNA binding sites of known transcription factors (y-axis). Log2 odds ratios compare the frequency of overlap for the CpGs associated with the respective SNP, compared to the background frequency of overlap for all tested CpG sites. Results are shown for the 45 sentinel SNPs that show evidence for overlap with known transcription factor binding sites (out of the 115 tested _trans_-acting sentinel SNPs with at least five associated CpG sites).
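As a rough illustration of the log2 odds ratio used in this heatmap, the sketch below compares the odds of transcription factor binding site overlap among the CpGs associated with one sentinel SNP against the odds among background CpGs. The function name, the treatment of foreground and background as two separate groups, and the optional pseudocount are assumptions made for this example; the caption does not specify these details.

```python
import math


def log2_odds_ratio(fg_overlap, fg_total, bg_overlap, bg_total, pseudocount=0.5):
    """log2 odds ratio of binding-site overlap, foreground vs. background CpGs.

    A pseudocount (assumed here, not taken from the source) avoids division
    by zero when one cell of the 2x2 table is empty.
    """
    a = fg_overlap + pseudocount                # foreground, overlapping
    b = (fg_total - fg_overlap) + pseudocount   # foreground, not overlapping
    c = bg_overlap + pseudocount                # background, overlapping
    d = (bg_total - bg_overlap) + pseudocount   # background, not overlapping
    return math.log2((a / b) / (c / d))


# Made-up counts: 20 of 40 associated CpGs overlap vs. 100 of 1,100 background CpGs.
example = log2_odds_ratio(20, 40, 100, 1100, pseudocount=0)
```

Positive values indicate enrichment and negative values depletion, which matches the two-directional colour scale described for the heatmap.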
Extended Data Fig. 9 _Trans_-acting regulatory networks at the CTCF, NFKB1, REST, NFE2, MAD1L1 and ERICH1 loci.
(a) Circos plots summarising i. genomic distribution of CpGs associated in trans [inner connections], and ii. known DNA binding sites of transcription factor encoded in cis [outer ring], for sentinel SNPs at CTCF, NFKB1, REST and NFE2 loci. Inset are observed and expected proportions of CpG sites that overlap respective DNA binding sites as available for different cell lines (see Methods). FDR < 1.17 × 10−2 for all cell lines and transcription factors. (b) Regulatory network of ERICH1 locus illustrating the connection between SNP rs10103269 (yellow rectangle) and expression of identified candidate gene ERICH1 (yellow ellipse), which is connected through protein-protein and protein-DNA interactions to methylation at _trans_-associated CpG sites (beige rectangles). Ellipses represent genes encoded at the genetic locus identified by the sentinel or that are part of the protein-protein interaction network. Genes marked with an asterisk (*) show co-expression with the candidate gene. Bold gene names indicate a strong genetic effect of the sentinel on the expression of that gene (eQTL). Fill colour of ellipses represents the random walk score (colour bar legend). The colour of edges connecting genes and CpG sites represents: i. protein-protein interactions (purple), ii. protein-DNA interactions identified by TFBS overlap (green), and iii. proximity (distance < 1 Mb) between genes and SNPs or CpG sites (blue). The thickness of edges represents correlation with gene expression (thick) or no correlation of/with gene expression (thin). Boxplot shows the effect of sentinel SNP (rs10103269) in cis on expression of ERICH1 with the p-value from linear regression of expression ~ genotype (n=1,546 biologically independent samples combined from both cohorts). Center line indicates median, lower and upper box limits correspond to the first and third quartiles, respectively; whisker extent indicates 1.5-fold interquartile range; outliers not shown. (c) MAD1L1 locus pathway analysis. Annotations and symbols are as described in (b).
Extended Data Fig. 10 Experimental validation at the ZNF333 locus.
Panel a. Regulatory network of the ZNF333 locus. Annotations and symbols are as described in Extended Data Fig. 9. The boxplot shows the effect of sentinel SNP (rs6511961) in cis on expression of the candidate gene ZNF333 with the p-value from the linear regression of expression ~ genotype (n=1,546 biologically independent samples combined from both cohorts). Panels b-d. HCT116 cells were transfected with ZNF333-FLAG/Myc tagged or GFP-control plasmids in biological replicates. Panel b. Protein lysates were Western blotted for ZNF333 expression using FLAG or MYC antibodies as validations. GAPDH was used as loading control (n=2). Source data: Membranes were cut into three pieces for optimisation of exposure. Top left panel: Original uncropped and unprocessed scans. Top right panel: Scan exposure optimized for molecular ladder. Bottom left panel: Scan exposure optimized for GAPDH. Bottom right panel: Final overlay figure. Panel c. Heatmap showing the Pearson correlation between ChIP-seq performed for ZNF333 using either FLAG or MYC antibodies. Panel d. Motifs of known TFs enriched in ZNF333 binding sites showing perfect overlap between ChIP with FLAG and MYC antibodies.
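The "expression ~ genotype" regression mentioned for the boxplots can be sketched as an ordinary least-squares fit of expression on allele dosage (0, 1 or 2 copies). This is a simplified illustration that assumes no covariates and omits the P-value calculation; the function name and the example values are hypothetical, and the model actually fitted in the study may include further adjustments described in Methods.

```python
def ols_slope(dosage, expression):
    """Least-squares slope and intercept of expression on allele dosage.

    dosage: allele counts per sample (0, 1 or 2).
    expression: expression values per sample, same order as dosage.
    """
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, expression))
    slope = sxy / sxx                    # change in expression per allele copy
    intercept = mean_y - slope * mean_x  # expected expression at dosage 0
    return slope, intercept


# Made-up data: expression rises by 0.5 units per allele copy.
slope, intercept = ols_slope([0, 1, 2, 0, 1, 2], [1.0, 1.5, 2.0, 1.0, 1.5, 2.0])
```

The slope is the per-allele effect, the same kind of quantity that the meQTL effect sizes report for methylation.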
About this article
Cite this article
Hawe, J.S., Wilson, R., Schmid, K.T. et al. Genetic variation influencing DNA methylation provides insights into molecular mechanisms regulating genomic function. Nat Genet 54, 18–29 (2022). https://doi.org/10.1038/s41588-021-00969-x
- Received: 22 August 2019
- Accepted: 18 October 2021
- Published: 03 January 2022
- Issue Date: January 2022
- DOI: https://doi.org/10.1038/s41588-021-00969-x