Habibul Ahsan - Academia.edu
Papers by Habibul Ahsan
Human Genetics
Leukocyte telomere length (LTL) is a heritable trait with two potential sources of heritability (h²): inherited variation in non-telomeric regions (e.g., SNPs that influence telomere maintenance) and variability in the lengths of telomeres in gametes that produce offspring zygotes (i.e., "direct" inheritance). Prior studies of LTL h² have not attempted to disentangle these two sources. Here, we use a novel approach for detecting the direct inheritance of telomeres by studying the association between identity-by-descent (IBD) sharing at chromosome ends and phenotypic similarity in LTL. We measured genome-wide SNPs and LTL for a sample of 5069 Bangladeshi adults with substantial relatedness. For each of the 6318 relative pairs identified, we used SNPs near the telomeres to estimate the number of chromosome ends shared IBD, a proxy for the number of telomeres shared IBD (T). We then estimated the association between T and the squared pairwise difference in LTL ((ΔLTL)²) within various classes of relatives (siblings, avuncular, cousins, and distant), adjusting for overall genetic relatedness (ϕ). The association between T and (ΔLTL)² was inverse among all relative pair types. In a meta-analysis including all relative pairs (ϕ > 0.05), the association between T and (ΔLTL)² (P = 0.01) was stronger than the association between ϕ and (ΔLTL)² (P = 0.43). Our results provide strong evidence that telomere length (TL) in parental germ cells impacts TL in offspring cells and contributes to LTL h² despite telomere "reprogramming" during embryonic development. Applying our method to larger studies will enable robust estimation of LTL h² attributable to direct transmission of telomeres.
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, Jan 27, 2018
A number of cohort studies have collected Scope mouthwash samples by mail which are being used for microbiota measurements. We evaluated the stability of Scope mouthwash samples at ambient temperature and determined the comparability of Scope mouthwash with saliva collection using the OMNIgene ORAL kit. Fifty-three healthy volunteers from Mayo Clinic and fifty cohort members from Bangladesh provided oral samples. One aliquot of the OMNIgene ORAL and Scope mouthwash were frozen immediately and one aliquot of the Scope mouthwash remained at ambient temperature for four days and then was frozen. DNA was extracted and the V4 region of the 16S rRNA gene was PCR amplified and sequenced using the HiSeq. Intraclass correlation coefficients (ICC) were calculated. The overall stability of the Scope mouthwash samples was relatively high for alpha and beta diversity. For example, the meta-analyzed ICC for the Shannon Index was 0.86 (95% CI: 0.76, 0.96). Similarly, the ICCs for the relative abun...
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, Jan 13, 2018
Although germline genetics influences breast cancer incidence, published research only explains approximately half of the expected association. Moreover, the accuracy of prediction models remains low. For women who develop breast cancer early, the genetic architecture is less established. To identify loci associated with early-onset breast cancer, gene-based tests were carried out using exome array data from 3,479 women with breast cancer diagnosed before age 50 and 973 age-matched controls. Replication was undertaken in a population that developed breast cancer at all ages of onset. Three gene regions were associated with breast cancer incidence: ( = 1.23 × 10; replication < 1.00 × 10), ( = 3.57 × 10; replication < 1.00 × 10), and ( = 5.49 × 10; replication < 1.00 × 10). Of the 151 gene regions reported in previous literature, 19 (12.5%) showed evidence of association ( < 0.05) with the risk of early-onset breast cancer in the early-onset population. To predict incidenc...
PloS one, 2018
Clustering of breast and colorectal cancer has been observed within some families and cannot be explained by chance or known high-risk mutations in major susceptibility genes. Potential shared genetic susceptibility between breast and colorectal cancer, not explained by high-penetrance genes, has been postulated. We hypothesized that yet undiscovered genetic variants predispose to a breast-colorectal cancer phenotype. To identify variants associated with a breast-colorectal cancer phenotype, we analyzed genome-wide association study (GWAS) data from cases and controls that met the following criteria: cases (n = 985) were women with breast cancer who had one or more first- or second-degree relatives with colorectal cancer, men/women with colorectal cancer who had one or more first- or second-degree relatives with breast cancer, and women diagnosed with both breast and colorectal cancer. Controls (n = 1769) were unrelated, breast and colorectal cancer-free, and age- and sex- frequenc...
Nature communications, Feb 23, 2018
Inherited genetic variation affects local gene expression and DNA methylation in humans. Most expression quantitative trait loci (cis-eQTLs) occur at the same genomic location as a methylation QTL (cis-meQTL), suggesting a common causal variant and shared mechanism. Using DNA and RNA from peripheral blood of Bangladeshi individuals, here we use co-localization methods to identify eQTL-meQTL pairs likely to share a causal variant. We use partial correlation and mediation analyses to identify >400 of these pairs showing evidence of a causal relationship between expression and methylation (i.e., shared mechanism) with many additional pairs we are underpowered to detect. These co-localized pairs are enriched for SNPs showing opposite associations with expression and methylation, although many SNPs affect multiple CpGs in opposite directions. This work demonstrates the pervasiveness of co-regulated expression and methylation in the human genome. Applying this approach to other types o...
Environmental health perspectives, Jan 12, 2018
Chronic exposure to inorganic arsenic from drinking water has been associated with a host of cancer and noncancer diseases. The application of metabolomics in epidemiologic studies may allow researchers to identify biomarkers associated with arsenic exposure and its health effects. Our goal was to evaluate the long-term reproducibility of urinary metabolites and associations between reproducible metabolites and arsenic exposure. We studied samples and data from 112 nonsmoking participants (58 men and 54 women) who were free of any major chronic diseases and who were enrolled in the Health Effects of Arsenic Longitudinal Study (HEALS), a large prospective cohort study in Bangladesh. Using a global gas chromatography-mass spectrometry platform, we measured metabolites in their urine samples, which were collected at baseline and again 2 y apart, and estimated intraclass correlation coefficients (ICCs). Linear regression was used to assess the association between arsenic exposure at bas...
Journal of medical genetics, Jan 18, 2017
Leucocyte telomere length (TL) is a potential biomarker of ageing and risk for age-related disease. Leucocyte TL is heritable and shows substantial differences by race/ethnicity. Recent genome-wide association studies (GWAS) report ~10 loci harbouring SNPs associated with leucocyte TL, but these studies focus primarily on populations of European ancestry. This study aims to enhance our understanding of genetic determinants of TL across populations. We performed a GWAS of TL using data on 5075 Bangladeshi adults. We measured TL using one of two technologies (qPCR or a Luminex-based method) and used standardised variables as TL phenotypes. Our results replicate previously reported associations in the TERC and TERT regions (P = 2.2 × 10⁻⁸ and P = 6.4 × 10⁻⁶, respectively). We observed a novel association signal in the RTEL1 gene (intronic SNP rs2297439; P = 2.82 × 10⁻⁷) that is independent of previously reported TL-associated SNPs in this region. The minor allele for rs2297439 is common in S...
Nature, Nov 2, 2017
Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci that are associated with overall breast cancer risk at P < 5 × 10⁻⁸. The majority of credible risk single-nucleotide polymorphisms in these loci fall in distal regulatory elements, and by integrating in silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2-5-fold enriched relative to the genome-wide ave...
Nature genetics, Jan 23, 2017
Most common breast cancer susceptibility variants have been identified through genome-wide association studies (GWAS) of predominantly estrogen receptor (ER)-positive disease. We conducted a GWAS using 21,468 ER-negative cases and 100,594 controls combined with 18,908 BRCA1 mutation carriers (9,414 with breast cancer), all of European origin. We identified independent associations at P < 5 × 10⁻⁸ with ten variants at nine new loci. At P…
Applied and environmental microbiology, May 3, 2017
To our knowledge, fecal microbiota collection methods have not been evaluated in low- and middle-income countries. Therefore, we evaluated five different fecal sample collection methods for technical reproducibility, stability, and accuracy within the Health Effects of Arsenic Longitudinal Study (HEALS) in Bangladesh. Fifty participants from HEALS provided fecal samples in the clinic which were aliquoted into no solution, 95% ethanol, RNAlater, post-development fecal occult blood test (FOBT) cards, and fecal immunochemical test (FIT) tubes. Half of the aliquots were frozen immediately at -80°C (Day 0) and the remaining samples were left at ambient temperature for 96 hours and then frozen (Day 4). Intraclass correlation coefficients (ICC) were calculated for the relative abundance of the top three phyla, two alpha diversity measures, and four beta diversity measures. The duplicate samples had relatively high ICCs for technical reproducibility at Day 0 and Day 4 (range: 0.79-0.99). Th...
PLOS ONE, 2016
Telomere length is a potential biomarker of aging and risk for age-related diseases. For measurement of relative telomere repeat mass (TRM), qPCR is typically used, primarily due to its low cost and low DNA input. But the position of the sample on a plate often impacts the qPCR-based TRM measurement. Recently we developed a novel, probe-based Luminex assay for TRM that requires ~50 ng DNA and involves no DNA amplification. Here we report, for the first time, a comparison among TRM measurements obtained from (a) two singleplex qPCR assays (using two different primer sets), (b) a multiplex qPCR assay, and (c) our novel Luminex assay. Our comparison is focused on characterizing the effects of sample positioning on TRM measurement. For qPCR, DNA samples from two individuals (K and F) were placed in 48 wells of a 96-well plate. For each singleplex qPCR assay, we used two plates (one for Telomere and one for Reference gene). For the multiplex qPCR and the Luminex assay, the telomere and the reference genes were assayed from the same well. The coefficient of variation (CV) of the TRM for Luminex (7.2 to 8.4%) was consistently lower than singleplex qPCR (11.4 to 14.9%) and multiplex qPCR (19.7 to 24.3%). In all three qPCR assays the DNA samples in the left- and right-most columns showed significantly lower TRM than the samples towards the center, which was not the case for the Luminex assay (p = 0.83). For singleplex qPCR, 30.5% of the variation in TL was explained by column-to-column variation and 0.82 to 27.9% was explained by sample-to-sample variation. In contrast, only 5.8% of the variation in TRM for the Luminex assay was explained by column-to-column variation and 50.4% was explained by sample-to-sample variation.
Our novel Luminex assay for TRM had good precision and did not show the well position effects of the sample that were seen in all three of the qPCR assays that were tested.
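The coefficient of variation (CV) used in this abstract to compare assay precision is the sample standard deviation divided by the mean. A minimal sketch of that calculation follows; the replicate values and the function name are hypothetical and are not data or code from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV (sample standard deviation / mean) as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical replicate TRM readings for one DNA sample across wells
# (illustrative numbers only, not measurements from the paper).
replicates = [1.00, 1.10, 0.90, 1.05, 0.95]
cv_percent = coefficient_of_variation(replicates)
print(f"CV = {cv_percent:.1f}%")
```

A lower CV across wells holding the same DNA sample indicates a more precise assay, which is the sense in which the abstract compares the Luminex and qPCR ranges.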
Cancer research, Sep 20, 2016
Identifying genetic variants with pleiotropic associations can uncover common pathways influencing multiple cancers. We took a two-staged approach to conduct genome-wide association studies for lung, ovary, breast, prostate and colorectal cancer from the GAME-ON/GECCO Network (61,851 cases, 61,820 controls) to identify pleiotropic loci. Findings were replicated in independent association studies (55,789 cases, 330,490 controls). We identified a novel pleiotropic association at 1q22 involving breast and lung squamous cell carcinoma, with eQTL analysis showing an association with ADAM15/THBS3 gene expression in lung. We also identified a known breast cancer locus CASP8/ALS2CR12 associated with prostate cancer, a known cancer locus at CDKN2B-AS1 with different variants associated with lung adenocarcinoma and prostate cancer and confirmed the associations of a breast BRCA2 locus with lung and serous ovarian cancer. This is the largest study to date examining pleiotropy across multiple c...
Clinical Epigenetics, 2016
Background: We examined whether differences in tumor DNA methylation were associated with more aggressive hormone receptor-negative breast cancer in an ethnically diverse group of patients in the Breast Cancer Care in Chicago (BCCC) study and using data from The Cancer Genome Atlas (TCGA). Results: DNA was extracted from formalin-fixed, paraffin-embedded samples on 75 patients (21 White, 31 African-American, and 23 Hispanic) (training dataset) enrolled in the BCCC. Hormone receptor status was defined as negative if tumors were negative for both estrogen and progesterone (ER/PR) receptors (N = 22/75). DNA methylation was analyzed at 1505 CpG sites within 807 gene promoters using the Illumina GoldenGate assay. Differential DNA methylation as a predictor of hormone receptor status was tested while controlling for false discovery rate and assigned to the gene closest to the respective CpG site. Next, those genes that predicted ER/PR status were validated using TCGA data with respect to DNA methylation (validation dataset), and correlations between CpG methylation and gene expression were examined. In the training dataset, 5.7% of promoter mean methylation values (46/807) were associated with receptor status at P < 0.05; for 88% of these (38/46), hypermethylation was associated with receptor-positive disease. Hypermethylation for FZD9, MME, BCAP31, HDAC9, PAX6, SCGB3A1, PDGFRA, IGFBP3, and PTGS2 genes most strongly predicted receptor-positive disease. Twenty-one of 24 predictor genes from the training dataset were confirmed in the validation dataset. The level of DNA methylation at 19 out of 22 genes, for which gene expression data were available, was associated with gene activity. Conclusions: Higher levels of promoter methylation strongly correlate with hormone receptor-positive status of breast tumors.
For most of the genes identified in our training dataset as ER/PR receptor status predictors, DNA methylation correlated with stable gene expression level. The predictors performed well when evaluated on an independent set of samples, with different racioethnic distribution, thus providing evidence that this set of DNA methylation biomarkers will likely generalize to prospective patient samples.
Nature genetics, Nov 21, 2015
We carried out a trans-ancestry genome-wide association and replication study of blood pressure phenotypes among up to 320,251 individuals of East Asian, European and South Asian ancestry. We find genetic variants at 12 new loci to be associated with blood pressure (P = 3.9 × 10⁻¹¹ to 5.0 × 10⁻²¹). The sentinel blood pressure SNPs are enriched for association with DNA methylation at multiple nearby CpG sites, suggesting that, at some of the loci identified, DNA methylation may lie on the regulatory pathway linking sequence variation to blood pressure. The sentinel SNPs at the 12 new loci point to genes involved in vascular smooth muscle (IGFBP3, KCNK3, PDE3A and PRDM6) and renal (ARHGAP24, OSR1, SLC22A7 and TBX2) function. The new and known genetic variants predict increased left ventricular mass, circulating levels of NT-proBNP, and cardiovascular and all-cause mortality (P = 0.04 to 8.6 × 10⁻⁶). Our results provide new evidence for the role of DNA methylation in blood pressu...
Human molecular genetics, Jan 2, 2015
Epidemiological studies have reported inconsistent associations between telomere length (TL) and risk for various cancers. These inconsistencies are likely attributable, in part, to biases that arise due to post-diagnostic and post-treatment TL measurement. To avoid such biases, we used a Mendelian randomization approach and estimated associations between nine TL-associated SNPs and risk for five common cancer types (breast, lung, colorectal, ovarian and prostate cancer, including subtypes) using data on 51,725 cases and 62,035 controls. We then used an inverse-variance weighted average of the SNP-specific associations to estimate the association between a genetic score representing long TL and cancer risk. The long TL genetic score was significantly associated with increased risk of lung adenocarcinoma (P = 6.3 × 10⁻¹⁵), even after exclusion of a SNP residing in a known lung cancer susceptibility region (TERT-CLPTM1L) (P = 6.6 × 10⁻⁶). Under Mendelian randomization assumptions, the asso...
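The inverse-variance weighted average mentioned in this abstract combines per-SNP estimates by weighting each one by the reciprocal of its variance. A minimal sketch of the generic fixed-effect formula follows, assuming each SNP contributes one effect estimate and its standard error; the function name and the input numbers are hypothetical, and the paper's exact implementation is not given in the abstract.

```python
import math


def ivw_estimate(betas, standard_errors):
    """Fixed-effect inverse-variance weighted average of per-SNP estimates.

    Returns the pooled estimate and its standard error.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical SNP-specific estimates (illustrative only, not study data).
pooled_beta, pooled_se = ivw_estimate([0.10, 0.20], [0.1, 0.2])
print(f"pooled estimate = {pooled_beta:.3f}, SE = {pooled_se:.4f}")
```

Estimates with smaller standard errors dominate the pooled value, so here the first SNP carries four times the weight of the second.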
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, 2014
Telomeres are tandem repeats of sequences present at the end of the chromosomes that maintain chromosomal integrity. After repeated cell division, telomeres shorten to a critical level, triggering replicative senescence or apoptosis, which is a key determinant of cellular aging. Short telomeres also contribute to genome instability and are a hallmark of many cancers. There are several methods for estimating telomere length (TL) from extracted DNA samples. Southern blot is accurate but requires a large quantity of DNA and is expensive. qPCR is cost-effective and requires a small quantity of DNA and is therefore widely used for large-scale epidemiologic studies; however, it typically requires triplicates. We describe a novel multiplexed probe-based non-PCR method for TL measurement. A small amount of DNA (∼50 ng) is hybridized to telomere repeat sequence-specific probes (T) and reference single-gene probes (R). T and R signals are detected from a single reaction well containing the ...
Cancer Prevention Research, 2015
Exposure to toxicants leads to cumulative molecular changes that over time increase a subject's risk of developing urothelial carcinoma. To assess the impact of arsenic exposure in a time-progressive manner, we developed and characterized a cell culture model and tested a panel of miRNAs in urine samples from arsenic-exposed subjects, urothelial carcinoma patients, and controls. To prepare an in vitro model, we chronically exposed an immortalized normal human bladder cell line (HUC1) to arsenic. Growth of the HUC1 cells was increased in a time-dependent manner after arsenic treatment and cellular morphology was changed. In a soft agar assay, colonies were observed only in arsenic-treated cells, and the number of colonies gradually increased with longer periods of treatment. Similarly, invaded cells in an invasion assay were observed only in arsenic-treated cells. Withdrawal of arsenic treatment for 2.5 months did not reverse the tumorigenic properties of arsenic-treated cells. Western blot analysis demonstrated decreased PTEN and increased AKT and mTOR in arsenic-treated HUC1 cells. Levels of miR-200a, miR-200b, and miR-200c were downregulated in arsenic-exposed HUC1 cells by quantitative RT-PCR. Furthermore, in human urine, miR-200c and miR-205 were inversely associated with arsenic exposure (P = 0.005 and 0.009, respectively). Expression of miR-205 discriminated cancer cases from controls with high sensitivity and specificity (AUC = 0.845). Our study suggests that exposure to arsenic rapidly induces a multifaceted dedifferentiation program and miR-205 has potential to be used as a marker of arsenic exposure as well as a marker of early urothelial carcinoma detection. Cancer Prev Res; 8(3); 208-21. ©2015 AACR.
PLoS Genetics, 2014
A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P < 0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P < 10⁻⁵. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator.
In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation. Future studies should focus on understanding the mechanisms underlying widespread cis-mediation and their relevance to disease biology, as well as using mediation analysis to improve eQTL discovery.
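The Sobel test named in this abstract assesses an indirect effect as the product of two regression coefficients: a (SNP on cis-transcript) and b (cis-transcript on trans-transcript, adjusting for the SNP). A minimal sketch of the standard first-order Sobel statistic follows; the function name and the coefficient values are hypothetical, and the abstract does not state which variant of the test the authors used.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel z statistic and two-sided normal p-value.

    a, se_a: effect of the exposure on the mediator and its standard error.
    b, se_b: effect of the mediator on the outcome (adjusted for the
             exposure) and its standard error.
    """
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    if se_ab == 0:
        raise ValueError("standard error of the indirect effect is zero")
    z = (a * b) / se_ab
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical coefficients (illustrative only, not estimates from the paper).
z_stat, p_value = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"Sobel z = {z_stat:.3f}, p = {p_value:.4f}")
```

A small p-value suggests that the indirect path through the mediator is non-zero, which is the kind of evidence the abstract summarizes with its Sobel P threshold.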
Toxicology and Applied Pharmacology, 2014
Epidemiologic studies that evaluated genetic susceptibility for the effects of arsenic exposure from drinking water on subclinical atherosclerosis are limited. We conducted a cross-sectional study of 1078 participants randomly selected from the Health Effects of Arsenic Longitudinal Study in Bangladesh to evaluate whether the association between arsenic exposure and carotid artery intima-media thickness (cIMT) differs by 207 single-nucleotide polymorphisms (SNPs) in 18 genes related to arsenic metabolism, oxidative stress, inflammation, and endothelial dysfunction. Although not statistically significant after correcting for multiple testing, nine SNPs in APOE, AS3MT, PNP, and TNF genes had a nominally statistically significant interaction with well-water arsenic in cIMT. For instance, the joint presence of a higher level of well-water arsenic (≥40.4 μg/L) and the GG genotype of AS3MT rs3740392 was associated with a difference of 40.9 μm (95% CI = 14.4, 67.5) in cIMT, much greater than the difference of cIMT associated with the genotype alone (β = −5.1 μm, 95% CI = −31.6, 21.3) or arsenic exposure alone (β = 7.2 μm, 95% CI = −3.1, 17.5). The pattern and magnitude of the interactions were similar when urinary arsenic was used as the exposure variable. Additionally, the at-risk genotypes of the AS3MT SNPs were positively related to the proportion of monomethylarsonic acid (MMA) in urine, which is indicative of arsenic methylation capacity. The findings provide novel evidence that genetic variants related to arsenic metabolism may play an important role in arsenic-induced subclinical atherosclerosis. Future replication studies in diverse populations are needed to confirm the findings.
Environmental Health Perspectives, 2014
We thank D. Koestler for assistance and helpful discussion regarding the cell type deconvolution analyses.
Human Genetics
Leukocyte telomere length (LTL) is a heritable trait with two potential sources of heritability (... more Leukocyte telomere length (LTL) is a heritable trait with two potential sources of heritability (h): inherited variation in non-telomeric regions (e.g., SNPs that influence telomere maintenance) and variability in the lengths of telomeres in gametes that produce offspring zygotes (i.e., "direct" inheritance). Prior studies of LTL h have not attempted to disentangle these two sources. Here, we use a novel approach for detecting the direct inheritance of telomeres by studying the association between identity-by-descent (IBD) sharing at chromosome ends and phenotypic similarity in LTL. We measured genome-wide SNPs and LTL for a sample of 5069 Bangladeshi adults with substantial relatedness. For each of the 6318 relative pairs identified, we used SNPs near the telomeres to estimate the number of chromosome ends shared IBD, a proxy for the number of telomeres shared IBD (T). We then estimated the association between T and the squared pairwise difference in LTL ((ΔLTL)) within various classes of relatives (siblings, avuncular, cousins, and distant), adjusting for overall genetic relatedness (ϕ). The association between T and (ΔLTL) was inverse among all relative pair types. In a meta-analysis including all relative pairs (ϕ > 0.05), the association between T and (ΔLTL) (P = 0.01) was stronger than the association between ϕ and (ΔLTL) (P = 0.43). Our results provide strong evidence that telomere length (TL) in parental germ cells impacts TL in offspring cells and contributes to LTL h despite telomere "reprogramming" during embryonic development. Applying our method to larger studies will enable robust estimation of LTL h attributable to direct transmission of telomeres.
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, Jan 27, 2018
A number of cohort studies have collected Scope mouthwash samples by mail which are being used fo... more A number of cohort studies have collected Scope mouthwash samples by mail which are being used for microbiota measurements. We evaluated the stability of Scope mouthwash samples at ambient temperature and determined the comparability of Scope mouthwash with saliva collection using the OMNIgene ORAL kit. Fifty-three healthy volunteers from Mayo Clinic and fifty cohort members from Bangladesh provided oral samples. One aliquot of the OMNIgene ORAL and Scope mouthwash were frozen immediately and one aliquot of the Scope mouthwash remained at ambient temperature for four days and then was frozen. DNA was extracted and the V4 region of the 16S rRNA gene was PCR amplified and sequenced using the HiSeq. Intraclass correlation coefficients (ICC) were calculated. The overall stability of the Scope mouthwash samples was relatively high for alpha and beta diversity. For example, the meta-analyzed ICC for the Shannon Index was 0.86 (95% CI: 0.76, 0.96). Similarly, the ICCs for the relative abun...
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, Jan 13, 2018
Although germline genetics influences breast cancer incidence, published research only explains a... more Although germline genetics influences breast cancer incidence, published research only explains approximately half of the expected association. Moreover, the accuracy of prediction models remains low. For women who develop breast cancer early, the genetic architecture is less established. To identify loci associated with early-onset breast cancer, gene-based tests were carried out using exome array data from 3,479 women with breast cancer diagnosed before age 50 and 973 age-matched controls. Replication was undertaken in a population that developed breast cancer at all ages of onset. Three gene regions were associated with breast cancer incidence: ( = 1.23 × 10; replication < 1.00 × 10), ( = 3.57 × 10; replication < 1.00 × 10), and ( = 5.49 × 10; replication < 1.00 × 10). Of the 151 gene regions reported in previous literature, 19 (12.5%) showed evidence of association ( < 0.05) with the risk of early-onset breast cancer in the early-onset population. To predict incidenc...
PloS one, 2018
Clustering of breast and colorectal cancer has been observed within some families and cannot be e... more Clustering of breast and colorectal cancer has been observed within some families and cannot be explained by chance or known high-risk mutations in major susceptibility genes. Potential shared genetic susceptibility between breast and colorectal cancer, not explained by high-penetrance genes, has been postulated. We hypothesized that yet undiscovered genetic variants predispose to a breast-colorectal cancer phenotype. To identify variants associated with a breast-colorectal cancer phenotype, we analyzed genome-wide association study (GWAS) data from cases and controls that met the following criteria: cases (n = 985) were women with breast cancer who had one or more first- or second-degree relatives with colorectal cancer, men/women with colorectal cancer who had one or more first- or second-degree relatives with breast cancer, and women diagnosed with both breast and colorectal cancer. Controls (n = 1769), were unrelated, breast and colorectal cancer-free, and age- and sex- frequenc...
Nature communications, Feb 23, 2018
Inherited genetic variation affects local gene expression and DNA methylation in humans. Most expression quantitative trait loci (cis-eQTLs) occur at the same genomic location as a methylation QTL (cis-meQTL), suggesting a common causal variant and shared mechanism. Using DNA and RNA from peripheral blood of Bangladeshi individuals, here we use co-localization methods to identify eQTL-meQTL pairs likely to share a causal variant. We use partial correlation and mediation analyses to identify >400 of these pairs showing evidence of a causal relationship between expression and methylation (i.e., shared mechanism) with many additional pairs we are underpowered to detect. These co-localized pairs are enriched for SNPs showing opposite associations with expression and methylation, although many SNPs affect multiple CpGs in opposite directions. This work demonstrates the pervasiveness of co-regulated expression and methylation in the human genome. Applying this approach to other types o...
Environmental health perspectives, Jan 12, 2018
Chronic exposure to inorganic arsenic from drinking water has been associated with a host of cancer and noncancer diseases. The application of metabolomics in epidemiologic studies may allow researchers to identify biomarkers associated with arsenic exposure and its health effects. Our goal was to evaluate the long-term reproducibility of urinary metabolites and associations between reproducible metabolites and arsenic exposure. We studied samples and data from 112 nonsmoking participants (58 men and 54 women) who were free of any major chronic diseases and who were enrolled in the Health Effects of Arsenic Longitudinal Study (HEALS), a large prospective cohort study in Bangladesh. Using a global gas chromatography-mass spectrometry platform, we measured metabolites in their urine samples, which were collected at baseline and again 2 y apart, and estimated intraclass correlation coefficients (ICCs). Linear regression was used to assess the association between arsenic exposure at bas...
Journal of medical genetics, Jan 18, 2017
Leucocyte telomere length (TL) is a potential biomarker of ageing and risk for age-related disease. Leucocyte TL is heritable and shows substantial differences by race/ethnicity. Recent genome-wide association studies (GWAS) report ~10 loci harbouring SNPs associated with leucocyte TL, but these studies focus primarily on populations of European ancestry. This study aims to enhance our understanding of genetic determinants of TL across populations. We performed a GWAS of TL using data on 5075 Bangladeshi adults. We measured TL using one of two technologies (qPCR or a Luminex-based method) and used standardised variables as TL phenotypes. Our results replicate previously reported associations in the TERC and TERT regions (P = 2.2 × 10⁻⁸ and P = 6.4 × 10⁻⁶, respectively). We observed a novel association signal in the RTEL1 gene (intronic SNP rs2297439; P = 2.82 × 10⁻⁷) that is independent of previously reported TL-associated SNPs in this region. The minor allele for rs2297439 is common in S...
Nature, Nov 2, 2017
Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci that are associated with overall breast cancer risk at P < 5 × 10⁻⁸. The majority of credible risk single-nucleotide polymorphisms in these loci fall in distal regulatory elements, and by integrating in silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2-5-fold enriched relative to the genome-wide ave...
Nature genetics, Jan 23, 2017
Most common breast cancer susceptibility variants have been identified through genome-wide association studies (GWAS) of predominantly estrogen receptor (ER)-positive disease. We conducted a GWAS using 21,468 ER-negative cases and 100,594 controls combined with 18,908 BRCA1 mutation carriers (9,414 with breast cancer), all of European origin. We identified independent associations at P < 5 × 10⁻⁸ with ten variants at nine new loci. At P…
Applied and environmental microbiology, May 3, 2017
To our knowledge, fecal microbiota collection methods have not been evaluated in low- and middle-income countries. Therefore, we evaluated five different fecal sample collection methods for technical reproducibility, stability, and accuracy within the Health Effects of Arsenic Longitudinal Study (HEALS) in Bangladesh. Fifty participants from HEALS provided fecal samples in the clinic, which were aliquoted into no solution, 95% ethanol, RNAlater, post-development fecal occult blood test (FOBT) cards, and fecal immunochemical test (FIT) tubes. Half of the aliquots were frozen immediately at -80°C (Day 0) and the remaining samples were left at ambient temperature for 96 hours and then frozen (Day 4). Intraclass correlation coefficients (ICC) were calculated for the relative abundance of the top three phyla, two alpha diversity measures, and four beta diversity measures. The duplicate samples had relatively high ICCs for technical reproducibility at Day 0 and Day 4 (range: 0.79-0.99). Th...
PLOS ONE, 2016
Telomere length is a potential biomarker of aging and risk for age-related diseases. For measurement of relative telomere repeat mass (TRM), qPCR is typically used, primarily due to its low cost and low DNA input. But the position of the sample on a plate often impacts the qPCR-based TRM measurement. Recently we developed a novel, probe-based Luminex assay for TRM that requires ~50 ng DNA and involves no DNA amplification. Here we report, for the first time, a comparison among TRM measurements obtained from (a) two singleplex qPCR assays (using two different primer sets), (b) a multiplex qPCR assay, and (c) our novel Luminex assay. Our comparison is focused on characterizing the effects of sample positioning on TRM measurement. For qPCR, DNA samples from two individuals (K and F) were placed in 48 wells of a 96-well plate. For each singleplex qPCR assay, we used two plates (one for Telomere and one for Reference gene). For the multiplex qPCR and the Luminex assay, the telomere and the reference genes were assayed from the same well. The coefficient of variation (CV) of the TRM for Luminex (7.2 to 8.4%) was consistently lower than singleplex qPCR (11.4 to 14.9%) and multiplex qPCR (19.7 to 24.3%). In all three qPCR assays the DNA samples in the left- and right-most columns showed significantly lower TRM than the samples towards the center, which was not the case for the Luminex assay (p = 0.83). For singleplex qPCR, 30.5% of the variation in TL was explained by column-to-column variation and 0.82 to 27.9% was explained by sample-to-sample variation. In contrast, only 5.8% of the variation in TRM for the Luminex assay was explained by column-to-column variation and 50.4% was explained by sample-to-sample variation.
Our novel Luminex assay for TRM had good precision and did not show the well position effects of the sample that were seen in all three of the qPCR assays that were tested.
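The precision comparison above rests on the coefficient of variation (CV), the standard deviation of repeated measurements divided by their mean. As a minimal sketch of that calculation, assuming replicate TRM readings of one DNA sample across plate wells (the values and the name `trm_by_well` are hypothetical and not taken from the study):

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV (standard deviation / mean) as a percentage."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical TRM readings for one DNA sample replicated across wells.
trm_by_well = [1.00, 1.10, 0.90, 1.05, 0.95]
cv_percent = coefficient_of_variation(trm_by_well)
print(f"CV = {cv_percent:.1f}%")
```

A lower CV across wells holding the same DNA sample indicates that well position and assay noise contribute less to the measurement.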
Cancer research, Sep 20, 2016
Identifying genetic variants with pleiotropic associations can uncover common pathways influencing multiple cancers. We took a two-staged approach to conduct genome-wide association studies for lung, ovary, breast, prostate and colorectal cancer from the GAME-ON/GECCO Network (61,851 cases, 61,820 controls) to identify pleiotropic loci. Findings were replicated in independent association studies (55,789 cases, 330,490 controls). We identified a novel pleiotropic association at 1q22 involving breast and lung squamous cell carcinoma, with eQTL analysis showing an association with ADAM15/THBS3 gene expression in lung. We also identified a known breast cancer locus CASP8/ALS2CR12 associated with prostate cancer, a known cancer locus at CDKN2B-AS1 with different variants associated with lung adenocarcinoma and prostate cancer and confirmed the associations of a breast BRCA2 locus with lung and serous ovarian cancer. This is the largest study to date examining pleiotropy across multiple c...
Clinical Epigenetics, 2016
Background: We examined whether differences in tumor DNA methylation were associated with more aggressive hormone receptor-negative breast cancer in an ethnically diverse group of patients in the Breast Cancer Care in Chicago (BCCC) study and using data from The Cancer Genome Atlas (TCGA). Results: DNA was extracted from formalin-fixed, paraffin-embedded samples on 75 patients (21 White, 31 African-American, and 23 Hispanic) (training dataset) enrolled in the BCCC. Hormone receptor status was defined as negative if tumors were negative for both estrogen and progesterone (ER/PR) receptors (N = 22/75). DNA methylation was analyzed at 1505 CpG sites within 807 gene promoters using the Illumina GoldenGate assay. Differential DNA methylation as a predictor of hormone receptor status was tested while controlling for false discovery rate and assigned to the gene closest to the respective CpG site. Next, those genes that predicted ER/PR status were validated using TCGA data with respect to DNA methylation (validation dataset), and correlations between CpG methylation and gene expression were examined. In the training dataset, 5.7% of promoter mean methylation values (46/807) were associated with receptor status at P < 0.05; for 88% of these (38/46), hypermethylation was associated with receptor-positive disease. Hypermethylation for FZD9, MME, BCAP31, HDAC9, PAX6, SCGB3A1, PDGFRA, IGFBP3, and PTGS2 genes most strongly predicted receptor-positive disease. Twenty-one of 24 predictor genes from the training dataset were confirmed in the validation dataset. The level of DNA methylation at 19 out of 22 genes, for which gene expression data were available, was associated with gene activity. Conclusions: Higher levels of promoter methylation strongly correlate with hormone receptor-positive status of breast tumors.
For most of the genes identified in our training dataset as ER/PR receptor status predictors, DNA methylation correlated with stable gene expression level. The predictors performed well when evaluated on an independent set of samples with a different racioethnic distribution, thus providing evidence that this set of DNA methylation biomarkers will likely generalize to prospective patient samples.
Nature genetics, Nov 21, 2015
We carried out a trans-ancestry genome-wide association and replication study of blood pressure phenotypes among up to 320,251 individuals of East Asian, European and South Asian ancestry. We find genetic variants at 12 new loci to be associated with blood pressure (P = 3.9 × 10⁻¹¹ to 5.0 × 10⁻²¹). The sentinel blood pressure SNPs are enriched for association with DNA methylation at multiple nearby CpG sites, suggesting that, at some of the loci identified, DNA methylation may lie on the regulatory pathway linking sequence variation to blood pressure. The sentinel SNPs at the 12 new loci point to genes involved in vascular smooth muscle (IGFBP3, KCNK3, PDE3A and PRDM6) and renal (ARHGAP24, OSR1, SLC22A7 and TBX2) function. The new and known genetic variants predict increased left ventricular mass, circulating levels of NT-proBNP, and cardiovascular and all-cause mortality (P = 0.04 to 8.6 × 10⁻⁶). Our results provide new evidence for the role of DNA methylation in blood pressu...
Human molecular genetics, Jan 2, 2015
Epidemiological studies have reported inconsistent associations between telomere length (TL) and risk for various cancers. These inconsistencies are likely attributable, in part, to biases that arise due to post-diagnostic and post-treatment TL measurement. To avoid such biases, we used a Mendelian randomization approach and estimated associations between nine TL-associated SNPs and risk for five common cancer types (breast, lung, colorectal, ovarian and prostate cancer, including subtypes) using data on 51,725 cases and 62,035 controls. We then used an inverse-variance weighted average of the SNP-specific associations to estimate the association between a genetic score representing long TL and cancer risk. The long TL genetic score was significantly associated with increased risk of lung adenocarcinoma (P = 6.3 × 10⁻¹⁵), even after exclusion of a SNP residing in a known lung cancer susceptibility region (TERT-CLPTM1L; P = 6.6 × 10⁻⁶). Under Mendelian randomization assumptions, the asso...
Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology, 2014
Telomeres are tandem repeats of sequences present at the end of the chromosomes that maintain chromosomal integrity. After repeated cell division, telomeres shorten to a critical level, triggering replicative senescence or apoptosis, which is a key determinant of cellular aging. Short telomeres also contribute to genome instability and are a hallmark of many cancers. There are several methods for estimating telomere length (TL) from extracted DNA samples. Southern blot is accurate but requires a large quantity of DNA and is expensive. qPCR is cost-effective and requires a small quantity of DNA and is therefore widely used for large-scale epidemiologic studies; however, it typically requires triplicates. We describe a novel multiplexed probe-based non-PCR method for TL measurement. A small amount of DNA (∼50 ng) is hybridized to telomere repeat sequence-specific probes (T) and reference single-gene probes (R). T and R signals are detected from a single reaction well containing the ...
Cancer Prevention Research, 2015
Exposure to toxicants leads to cumulative molecular changes that over time increase a subject's risk of developing urothelial carcinoma. To assess the impact of arsenic exposure in a time-progressive manner, we developed and characterized a cell culture model and tested a panel of miRNAs in urine samples from arsenic-exposed subjects, urothelial carcinoma patients, and controls. To prepare an in vitro model, we chronically exposed an immortalized normal human bladder cell line (HUC1) to arsenic. Growth of the HUC1 cells was increased in a time-dependent manner after arsenic treatment and cellular morphology was changed. In a soft agar assay, colonies were observed only in arsenic-treated cells, and the number of colonies gradually increased with longer periods of treatment. Similarly, invaded cells in an invasion assay were observed only in arsenic-treated cells. Withdrawal of arsenic treatment for 2.5 months did not reverse the tumorigenic properties of arsenic-treated cells. Western blot analysis demonstrated decreased PTEN and increased AKT and mTOR in arsenic-treated HUC1 cells. Levels of miR-200a, miR-200b, and miR-200c were downregulated in arsenic-exposed HUC1 cells by quantitative RT-PCR. Furthermore, in human urine, miR-200c and miR-205 were inversely associated with arsenic exposure (P = 0.005 and 0.009, respectively). Expression of miR-205 discriminated cancer cases from controls with high sensitivity and specificity (AUC = 0.845). Our study suggests that exposure to arsenic rapidly induces a multifaceted dedifferentiation program and miR-205 has potential to be used as a marker of arsenic exposure as well as a marker of early urothelial carcinoma detection. Cancer Prev Res; 8(3); 208-21. ©2015 AACR.
PLoS Genetics, 2014
A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P < 0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P < 10⁻⁵. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator.
In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation. Future studies should focus on understanding the mechanisms underlying widespread cis-mediation and their relevance to disease biology, as well as using mediation analysis to improve eQTL discovery.
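The abstract above names the Sobel test but does not show its form. A minimal sketch of the standard first-order Sobel statistic follows, under the assumption that `a` is the estimated effect of the SNP on the cis-transcript and `b` is the estimated effect of the cis-transcript on the trans-gene after adjusting for the SNP, each with its standard error. The function names and example numbers are hypothetical, and the paper's actual pipeline may differ.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z-statistic for the indirect (mediated) effect a*b."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)


def two_sided_p(z):
    """Two-sided p-value for z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical regression estimates, for illustration only.
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
p = two_sided_p(z)
print(f"Sobel z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests that the indirect path through the cis-transcript is non-zero, which is the sense in which a trans-association is described as cis-mediated.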
Toxicology and Applied Pharmacology, 2014
Epidemiologic studies that evaluated genetic susceptibility for the effects of arsenic exposure from drinking water on subclinical atherosclerosis are limited. We conducted a cross-sectional study of 1078 participants randomly selected from the Health Effects of Arsenic Longitudinal Study in Bangladesh to evaluate whether the association between arsenic exposure and carotid artery intima-media thickness (cIMT) differs by 207 single-nucleotide polymorphisms (SNPs) in 18 genes related to arsenic metabolism, oxidative stress, inflammation, and endothelial dysfunction. Although not statistically significant after correcting for multiple testing, nine SNPs in APOE, AS3MT, PNP, and TNF genes had a nominally statistically significant interaction with well-water arsenic in cIMT. For instance, the joint presence of a higher level of well-water arsenic (≥40.4 μg/L) and the GG genotype of AS3MT rs3740392 was associated with a difference of 40.9 μm (95% CI = 14.4, 67.5) in cIMT, much greater than the difference of cIMT associated with the genotype alone (β = −5.1 μm, 95% CI = −31.6, 21.3) or arsenic exposure alone (β = 7.2 μm, 95% CI = −3.1, 17.5). The pattern and magnitude of the interactions were similar when urinary arsenic was used as the exposure variable. Additionally, the at-risk genotypes of the AS3MT SNPs were positively related to the proportion of monomethylarsonic acid (MMA) in urine, which is indicative of arsenic methylation capacity. The findings provide novel evidence that genetic variants related to arsenic metabolism may play an important role in arsenic-induced subclinical atherosclerosis. Future replication studies in diverse populations are needed to confirm the findings.
Environmental Health Perspectives, 2014
We thank D. Koestler for assistance and helpful discussion regarding the cell type deconvolution analyses.