Phenome-wide association study (PheWAS) for detection of pleiotropy within the Population Architecture using Genomics and Epidemiology (PAGE) Network
doi: 10.1371/journal.pgen.1003087. Epub 2013 Jan 31.
Kristin Brown-Gentry, Scott Dudek, Alex Frase, Eric S Torstenson, Robert Goodloe, Jose Luis Ambite, Christy L Avery, Steve Buyske, Petra Bůžková, Ewa Deelman, Megan D Fesinmeyer, Christopher A Haiman, Gerardo Heiss, Lucia A Hindorff, Chu-Nan Hsu, Rebecca D Jackson, Charles Kooperberg, Loic Le Marchand, Yi Lin, Tara C Matise, Kristine R Monroe, Larry Moreland, Sungshim L Park, Alex Reiner, Robert Wallace, Lynn R Wilkens, Dana C Crawford, Marylyn D Ritchie
- PMID: 23382687
- PMCID: PMC3561060
- DOI: 10.1371/journal.pgen.1003087
Sarah A Pendergrass et al. PLoS Genet. 2013.
Abstract
Using a phenome-wide association study (PheWAS) approach, we comprehensively tested genetic variants for association with phenotypes available for 70,061 study participants in the Population Architecture using Genomics and Epidemiology (PAGE) network. Our aim was to better characterize the genetic architecture of complex traits and identify novel pleiotropic relationships. This PheWAS drew on five population-based studies representing four major racial/ethnic groups (European Americans (EA), African Americans (AA), Hispanics/Mexican-Americans, and Asian/Pacific Islanders) in PAGE, each site with measurements for multiple traits, associated laboratory measures, and intermediate biomarkers. A total of 83 single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) were genotyped across two or more PAGE study sites. Comprehensive tests of association, stratified by race/ethnicity, were performed, encompassing 4,706 phenotypes mapped to 105 phenotype-classes, and association results were compared across study sites. A total of 111 PheWAS results had significant associations for two or more PAGE study sites, with a consistent direction of effect, at a significance threshold of p<0.01 for the same racial/ethnic group, SNP, and phenotype-class. Among results identified for SNPs previously associated with phenotypes such as lipid traits, type 2 diabetes, and body mass index, 52 replicated previously published genotype-phenotype associations, 26 represented phenotypes closely related to previously known genotype-phenotype associations, and 33 represented potentially novel genotype-phenotype associations with pleiotropic effects. The majority of the potentially novel results were for single PheWAS phenotype-classes; for example, for CDKN2A/B rs1333049 (previously associated with type 2 diabetes in EA), a PheWAS association was identified for hemoglobin levels in AA. Of note, however, GALNT2 rs2144300 (previously associated with high-density lipoprotein cholesterol levels in EA) had multiple potentially novel PheWAS associations, with hypertension-related phenotypes in AA and with serum calcium levels and coronary artery disease phenotypes in EA. PheWAS identifies associations for hypothesis generation and exploration of the genetic architecture of complex traits.
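The cross-site criterion described in the abstract (p<0.01 in two or more study sites, same racial/ethnic group, SNP, and phenotype-class, with a consistent direction of effect) can be illustrated with a minimal Python sketch. This is not the authors' code: the record layout, the field names (`snp`, `group`, `phenotype_class`, `site`, `p`, `beta`), the function name, and the example rows are all assumptions made up for illustration.

```python
from collections import defaultdict

P_THRESHOLD = 0.01  # threshold stated in the abstract


def replicated_associations(results, p_threshold=P_THRESHOLD):
    """Return (snp, group, phenotype_class) keys that are significant in
    two or more sites with the same direction of effect.

    `results` is a list of dicts with hypothetical field names.
    """
    # For each key, track which sites pass the threshold, per effect direction.
    sites = defaultdict(lambda: {1: set(), -1: set()})
    for r in results:
        if r["p"] < p_threshold and r["beta"] != 0:
            key = (r["snp"], r["group"], r["phenotype_class"])
            direction = 1 if r["beta"] > 0 else -1
            sites[key][direction].add(r["site"])
    # Keep keys where at least two distinct sites agree on the direction.
    return sorted(
        key for key, by_direction in sites.items()
        if any(len(s) >= 2 for s in by_direction.values())
    )


# Made-up rows for illustration only; these are not results from the study.
example = [
    {"snp": "snpA", "group": "EA", "phenotype_class": "classX", "site": "ARIC", "p": 0.001, "beta": 0.2},
    {"snp": "snpA", "group": "EA", "phenotype_class": "classX", "site": "CHS", "p": 0.005, "beta": 0.1},
    {"snp": "snpB", "group": "AA", "phenotype_class": "classY", "site": "ARIC", "p": 0.002, "beta": 0.3},
    {"snp": "snpB", "group": "AA", "phenotype_class": "classY", "site": "WHI", "p": 0.004, "beta": -0.2},
    {"snp": "snpC", "group": "EA", "phenotype_class": "classZ", "site": "ARIC", "p": 0.001, "beta": 0.2},
    {"snp": "snpC", "group": "EA", "phenotype_class": "classZ", "site": "CHS", "p": 0.03, "beta": 0.1},
]

replicated = replicated_associations(example)
print(replicated)
```

In this toy input only `snpA` qualifies: `snpB` is significant in two sites but with opposite directions of effect, and `snpC` passes the threshold in one site only.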
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
Figure 1. PheWAS associations for rs4420638 near APOC1.
SNP rs4420638 has previously been associated with LDL cholesterol levels, triglycerides, Alzheimer's disease, coronary artery disease, and sporadic late onset Alzheimer's. The length of each line corresponds to –log10(p-value), and the lines are plotted clockwise starting at top for the association with the smallest p-value. Lines are labeled with the study-specific phenotype, the PAGE study, racial/ethnic group, and direction of effect (+ or −). Red lines represent associations at p<0.01. “LN1” indicates the phenotype had 1 added to the variable, and then the variable was natural log transformed. The PheWAS phenotypes significantly associated with this SNP varied, with known associations for LDL cholesterol levels, as well as the related phenotypes “Total cholesterol (mmol/l)” and “Dietary cholesterol (mg)”, and novel phenotypes such as “Baseline glucose (mg/dl)”.
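The two quantities this caption relies on, the "LN1" transform and the –log10(p-value) line length, can be sketched in a few lines of Python. This is an illustrative sketch only, not the plotting code used for the figures; the function names and the input layout (`label`, `p`) are assumptions.

```python
import math


def ln1(value):
    """LN1 transform as described in the caption: add 1, then natural log."""
    return math.log1p(value)


def sun_plot_lines(associations, red_threshold=0.01):
    """Order associations by ascending p-value (the clockwise plotting order)
    and compute each line's length as -log10(p).

    `associations` is a list of dicts with hypothetical keys "label" and "p".
    """
    ordered = sorted(associations, key=lambda a: a["p"])
    return [
        {
            "label": a["label"],
            "length": -math.log10(a["p"]),
            "red": a["p"] < red_threshold,  # red lines mark p<0.01
        }
        for a in ordered
    ]


# Made-up p-values for illustration only.
lines = sun_plot_lines([
    {"label": "phenotype B", "p": 0.1},
    {"label": "phenotype A", "p": 0.001},
])
for line in lines:
    print(line)
```

Smaller p-values give longer lines, so the first line drawn at the top of the plot is also the longest.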
Figure 2. PheWAS associations for rs10757278 near CDKN2A/CDKN2B.
SNP rs10757278 was previously associated with myocardial infarction (MI). Associations are plotted clockwise starting at top for the association with the smallest p-value and the length of the line corresponds to –log10(p-value). Lines are labeled with the study-specific phenotype, the PAGE study, racial/ethnic group, and direction of effect (+ or −). Red lines represent associations at p<0.01, and results with p<0.05 are also plotted in grey to show trends for additional phenotypes. “LN1” indicates the phenotype had 1 added to the variable, and then the variable was natural log transformed. The PheWAS phenotypes significantly associated with this SNP varied, from MI (known), to coronary artery disease and MI related phenotypes such as presence or absence of “percutaneous transluminal coronary angioplasty”, “angina”, and “coronary bypass surgery”.
Figure 3. PheWAS associations for rs599839 near CELSR2/PSRC1.
This SNP has previously published associations with serum LDL cholesterol levels, total cholesterol, and coronary artery disease. Genotype-phenotype associations are plotted clockwise starting at top for the association with the smallest p-value. The length of the line corresponds to –log10(p-value); the longer the line, the more significant the result. The study, race/ethnicity, and phenotype for each test of association are listed. Red lines represent associations at p<0.01, and results with p<0.05 are also plotted in grey to show trends for additional phenotypes. “LN1” indicates the phenotype had 1 added to the variable, and then the variable was natural log transformed. The PheWAS phenotypes significantly associated with this SNP varied, from LDL cholesterol levels (previously published), to lipid level-related phenotypes such as “High cholesterol requiring pills ever”. In the case of coronary artery disease, phenotypes with significant results that were related to coronary artery disease included “Ever had pain/discomfort in your chest”, and “Hospitalized for chest pain”.
Figure 4. PheWAS results for blood cell counts and hemoglobin levels.
Eleven novel genotype-phenotype-class associations were identified for white blood cell counts and hemoglobin levels collectively. The top track indicates the chromosomal location of each SNP. Below that is a SNP/phenotype identification track containing the SNP ID, the phenotype, the phenotype transformation if present (LN1 = ln(1+variable)), and the race/ethnicity of the test population (AA or EA). The next track is a “presence/absence” track, in which a box indicates that the SNP was present for ARIC (blue), CHS (red), WHI (orange), or EAGLE (purple). The next track shows –log10(p-value): each p-value is plotted as a triangle whose direction indicates the direction of effect (pointing up is positive, pointing down is negative) and whose base marks the location of the p-value; the solid red line is positioned at p-value = 0.01. The next track is the magnitude of effect (beta), with a dotted grey line positioned at the null. Next are the coded allele frequencies (CAF) for each study. The final track is the sample size for each test of association.
Figure 5. PheWAS associations for rs2144300 within GALNT2.
The previously published associations for this SNP were with triglyceride and HDL cholesterol levels. Genotype-phenotype associations are plotted clockwise starting at top for the association with the smallest p-value. The length of the line corresponds to –log10(p-value); the longer the line, the more significant the result. The study, race/ethnicity, and phenotype for each test of association are listed. Red lines represent associations at p<0.01, and results with p<0.05 are also plotted in grey to show trends for additional phenotypes. The novel PheWAS phenotypes significantly associated with this SNP varied, including white blood cell counts, forced vital capacity at three seconds (FEV3), and serum calcium levels.
Figure 6. Workflow for phenotype matching, to develop the 105 phenotype classes.
A MySQL database was used to filter the data from five studies for any results with p<0.01 to generate lists of the unique phenotypes for each individual PAGE study. The number of phenotypes that passed this significance threshold for each of the five studies was 604 (ARIC), 331 (CHS), 63 (MEC), 324 (EAGLE), and 1,342 (WHI). Note that fewer phenotypes are listed in Figure 6 than the total number of phenotypes referred to in the manuscript for the actual associations, because the phenotype matching process took into account only distinct phenotypes, regardless of whether they were transformed or untransformed or were categorical phenotypes binned into case/control phenotypes. Next, the resulting phenotypes were manually matched between ARIC, CHS, MEC, EAGLE, and WHI using knowledge about the phenotypes and the known focus of specific PAGE study questions (such as arterial measurements including degree of arterial stenosis). In the last step, phenotypes from all studies, regardless of significance from genotype-phenotype tests of association, were matched to the already-defined phenotype classes using the criteria described above.
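The first step of this workflow (filter at p<0.01, then list the distinct phenotypes per study while ignoring transformed variants) can be sketched as follows. The caption states that a MySQL database was used; the plain-Python version below is only an illustration of the same filtering logic, and the row layout, the `base_phenotype` field, and the example rows are assumptions.

```python
from collections import defaultdict


def unique_significant_phenotypes(rows, p_threshold=0.01):
    """Group distinct phenotypes with at least one result below the threshold
    by study.

    Each row is a dict with hypothetical keys: "study", "base_phenotype"
    (the phenotype name with any transformation label removed), and "p".
    """
    by_study = defaultdict(set)
    for row in rows:
        if row["p"] < p_threshold:
            # A set keeps one entry per phenotype, so transformed and
            # untransformed versions of the same phenotype count once.
            by_study[row["study"]].add(row["base_phenotype"])
    return {study: sorted(names) for study, names in by_study.items()}


# Made-up rows for illustration only; not data from the study.
rows = [
    {"study": "ARIC", "base_phenotype": "glucose", "p": 0.004},
    {"study": "ARIC", "base_phenotype": "glucose", "p": 0.002},  # e.g. a transformed version
    {"study": "ARIC", "base_phenotype": "hemoglobin", "p": 0.2},
    {"study": "CHS", "base_phenotype": "angina", "p": 0.009},
]

phenotype_lists = unique_significant_phenotypes(rows)
counts = {study: len(names) for study, names in phenotype_lists.items()}
print(phenotype_lists)
print(counts)
```

The manual matching of phenotypes across studies into phenotype classes, described next in the caption, depends on curator knowledge and is not something this sketch attempts.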
Grants and funding
- R00 HL098458/HL/NHLBI NIH HHS/United States
- U01 HL041642/HL/NHLBI NIH HHS/United States
- U01 HL041654/HL/NHLBI NIH HHS/United States
- U01 HG004790/HG/NHGRI NIH HHS/United States
- N01 HC048049/HL/NHLBI NIH HHS/United States
- N01 HC045205/HL/NHLBI NIH HHS/United States
- N01 HC055020/HL/NHLBI NIH HHS/United States
- U01 HG004798/HG/NHGRI NIH HHS/United States
- U01 HL065520/HL/NHLBI NIH HHS/United States
- U01 HG004801/HG/NHGRI NIH HHS/United States
- N01 HC095095/HL/NHLBI NIH HHS/United States
- N01 HC055016/HL/NHLBI NIH HHS/United States
- N01 HC055019/HL/NHLBI NIH HHS/United States
- N01 HC048048/HL/NHLBI NIH HHS/United States
- N01-HV-48195/HV/NHLBI NIH HHS/United States
- R01 AG015928/AG/NIA NIH HHS/United States
- U01 HL080295/HL/NHLBI NIH HHS/United States
- U01 HG004802/HG/NHGRI NIH HHS/United States
- N01 HC055021/HL/NHLBI NIH HHS/United States
- N01 HC015103/HC/NHLBI NIH HHS/United States
- N01 HC085086/HL/NHLBI NIH HHS/United States
- R56 AG020098/AG/NIA NIH HHS/United States
- N01 WH022110/WH/WHI NIH HHS/United States
- N01 HC055015/HL/NHLBI NIH HHS/United States
- U01HG004803/HG/NHGRI NIH HHS/United States
- U01 HL041652/HL/NHLBI NIH HHS/United States
- U01HG004802/HG/NHGRI NIH HHS/United States
- HHSN268201200036C/HL/NHLBI NIH HHS/United States
- P30 ES007033/ES/NIEHS NIH HHS/United States
- N01 HC055222/HL/NHLBI NIH HHS/United States
- U01 HG004803/HG/NHGRI NIH HHS/United States
- P01 CA033619/CA/NCI NIH HHS/United States
- U01HG004798-01/HG/NHGRI NIH HHS/United States
- N01 HC045134/HC/NHLBI NIH HHS/United States
- N01 HC005187/HL/NHLBI NIH HHS/United States
- N01 HC085079/HL/NHLBI NIH HHS/United States
- U01HG004798/HG/NHGRI NIH HHS/United States
- R01 HL080295/HL/NHLBI NIH HHS/United States
- N01 HC048047/HL/NHLBI NIH HHS/United States
- N01 HV048195/HL/NHLBI NIH HHS/United States
- N01 HC055018/HL/NHLBI NIH HHS/United States
- R01 AG020098/AG/NIA NIH HHS/United States
- U01HG004801/HG/NHGRI NIH HHS/United States
- N01 HC048050/HL/NHLBI NIH HHS/United States
- U01 CA098758/CA/NCI NIH HHS/United States
- U01 HL065521/HL/NHLBI NIH HHS/United States
- N01 HC055022/HL/NHLBI NIH HHS/United States
- U01HG004790/HG/NHGRI NIH HHS/United States
- N01 HC075150/HL/NHLBI NIH HHS/United States
- R01 AG023629/AG/NIA NIH HHS/United States
- R01 AG027058/AG/NIA NIH HHS/United States
- N01 HC045133/HC/NHLBI NIH HHS/United States
- U01 CA136792/CA/NCI NIH HHS/United States
- P30 CA071789/CA/NCI NIH HHS/United States
- R37 CA054281/CA/NCI NIH HHS/United States
- N01 HC035129/HC/NHLBI NIH HHS/United States
- R56 AG023629/AG/NIA NIH HHS/United States