Genetic Variation in H2AFX Contributes to Risk of Non–Hodgkin Lymphoma

Research Articles| June 04 2007

Karen L. Novik;

1Genome Sciences Centre and

John J. Spinelli;

2Cancer Control Research, British Columbia Cancer Research Centre; Departments of

5Healthcare and Epidemiology,

Amy C. MacArthur;

2Cancer Control Research, British Columbia Cancer Research Centre; Departments of

Karey Shumansky;

2Cancer Control Research, British Columbia Cancer Research Centre; Departments of

Stephen Leach;

1Genome Sciences Centre and

Agnes Lai;

2Cancer Control Research, British Columbia Cancer Research Centre; Departments of

Joseph M. Connors;

3Medical Oncology and

6Oncology,

Randy D. Gascoyne;

4Pathology and Laboratory Medicine, British Columbia Cancer Agency; Departments of

7Laboratory Medicine, and

Richard P. Gallagher;

2Cancer Control Research, British Columbia Cancer Research Centre; Departments of

5Healthcare and Epidemiology,

Angela R. Brooks-Wilson

1Genome Sciences Centre and

8Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada

Requests for reprints: Angela R. Brooks-Wilson, Genome Sciences Centre, British Columbia Cancer Research Centre, 675 West 10th Avenue, Room 7-111, Vancouver, British Columbia, Canada V5Z 1L3. Phone: 604-675-8156; Fax: 604-675-8178. E-mail: abrooks-wilson@bcgsc.ca

Received: July 31 2006

Revision Received: December 14 2006

Accepted: March 29 2007

Online ISSN: 1538-7755

Print ISSN: 1055-9965

American Association for Cancer Research

2007

Cancer Epidemiol Biomarkers Prev (2007) 16 (6): 1098–1106.

Citation

Karen L. Novik, John J. Spinelli, Amy C. MacArthur, Karey Shumansky, Payal Sipahimalani, Stephen Leach, Agnes Lai, Joseph M. Connors, Randy D. Gascoyne, Richard P. Gallagher, Angela R. Brooks-Wilson; Genetic Variation in H2AFX Contributes to Risk of Non–Hodgkin Lymphoma. Cancer Epidemiol Biomarkers Prev 1 June 2007; 16 (6): 1098–1106. https://doi.org/10.1158/1055-9965.EPI-06-0639

Abstract

Non–Hodgkin lymphoma (NHL) comprises a group of lymphoid tumors that have in common somatic translocations. H2AFX encodes a key histone involved in the detection of the DNA double-stranded breaks that can lead to translocations. H2afx is a dosage-dependent gene that protects against B-cell lymphomas in mice, making its human orthologue an ideal candidate gene for susceptibility to lymphoma. We did a population-based genetic association study of H2AFX variants in 487 NHL cases and 531 controls. Complete resequencing of the human H2AFX gene in 95 NHL cases was done to establish the spectrum of variation in affected individuals; this was followed by both direct and indirect tests for association at the level of individual single nucleotide polymorphisms (SNP) and as haplotypes. Homozygosity for the AA genotype of a SNP 417 bp upstream of the translational start of H2AFX is strongly associated [odds ratio (OR), 0.54; P = 0.001] with protection from NHL. We find a strong association of this SNP with the follicular lymphoma subtype of NHL (AA genotype: OR, 0.40; P = 0.004) and with mantle cell lymphoma (AA genotype: OR, 0.20; P = 0.01) that remains significant after adjustment for the false discovery rate, but not with diffuse large B-cell lymphoma. These data support the hypothesis that genetic variation in the H2AFX gene influences genetic susceptibility or resistance to some subtypes of NHL by contributing to the maintenance of genome stability. (Cancer Epidemiol Biomarkers Prev 2007;16(6):1098–106)

Introduction

Non–Hodgkin lymphoma (NHL) comprises a group of solid tumors of lymphoid origin. It is the fourth most common new cancer diagnosis in women and the fifth most common in men in the United States (1) and the seventh most commonly diagnosed cancer worldwide (2). The incidence of NHL has been rising for the past 30 years (3-5). The environmental factors responsible for this increasing incidence have yet to be identified. Determination of the genetic factors that affect susceptibility to NHL may provide clues to allow new environmental risk factors to be identified. Genetic factors are therefore of increasing importance in efforts to prevent or control this cancer.

NHL is divided into subtypes that have histologically and clinically diverse presentation, disease progression, and outcome. Most cases are sporadic. Many types of NHL tumors are known to have characteristic translocations that generally juxtapose proliferation-associated or apoptosis-inhibiting genes with immunoglobulin-related loci that are active in B or T cells. The tendency for NHL tumors to have specific translocations may imply an underlying defect in the cellular systems that protect against the occurrence of translocations. It has been suggested that defects in immunoglobulin switching may be involved in susceptibility to lymphoid cancers (6). Given that not all NHL tumor translocations involve immunoglobulin genes, however, genes more generally involved in protection against DNA double-stranded breaks (DSB) may also play a role in NHL susceptibility. Consistent with this idea, loss-of-function mutations in the ATM gene, a central player in the detection and response to DNA DSBs, cause the autosomal recessive syndrome ataxia telangiectasia, a feature of which is susceptibility to cancer, particularly lymphoma. Genetic variants in genes involved in DNA DSB repair are therefore candidates for susceptibility factors for NHL.

H2AFX, or H2AX, is a member of the histone H2A gene family and is fundamental to the detection of and response to DNA DSBs (7). H2AFX has a serine residue near its COOH terminus that is rapidly phosphorylated when cells are exposed to DNA damage (8). ATM initiates H2AFX phosphorylation in response to ionizing radiation (9, 10). Many factors associated with the DNA damage response (e.g., ATM, BRCA1, RAD51, and the MRE11/RAD50/NBS1 complex) are found to colocalize with phosphorylated H2AFX (γ-H2AFX) at the sites of DSBs. Although γ-H2AFX is not essential for DSB repair, it may modulate the process by reorganizing chromatin and preventing the separation of broken DNA ends, thus facilitating the concentration of repair factors and complexes at the lesion (7).

H2afx knockout mice are radiation sensitive, growth retarded, and immunodeficient and display chromosomal instability and repair defects (11). These mice have a high incidence of B-cell lymphomas that develop at an accelerated rate on a p53-deficient background (12, 13). Tumors derived from such animals show translocations. H2AFX is a genomic caretaker that requires the function of both alleles for optimal protection against tumorigenesis (12, 13). A dosage-sensitive gene, such as H2AFX, for which activity could vary due to genetic variants in the gene or its regulatory sequences, is an ideal candidate for a lymphoma susceptibility gene.

NHL, like most cancers, is a complex genetic disease and, as such, is suited to the application of genetic association studies. We have used 487 NHL cases and 531 controls from a population-based case-control study to test for association of genetic variants in the H2AFX gene with NHL and with individual subtypes of NHL. Genetic association tests were preceded by single nucleotide polymorphism (SNP) discovery resequencing of the H2AFX gene in a subset of NHL cases to validate and estimate the frequency of genetic variants in the gene.

Materials and Methods

Study Population

All NHL cases ages 20 to 79 diagnosed in British Columbia during the period March 2000 to February 2004 and residing in the greater Vancouver (Greater Vancouver Regional District) and greater Victoria (Capital Regional District) metropolitan areas were identified from the British Columbia Cancer Registry and invited to participate. HIV-positive cases and those who were unable to give informed consent were excluded. Controls were obtained from the Client Registry of the British Columbia Ministry of Health, which includes virtually all British Columbia residents, and were frequency matched to cases by age (within 5-year age group), sex, and residence within the Greater Vancouver Regional District or Capital Regional District. The participation rates of eligible cases and controls were 82% and 47%, respectively. A computer-aided telephone interview questionnaire administered to participating cases and controls included each participant's report of the ethnicity of his or her four grandparents, demographic information, medical history, and other variables. Subjects provided a blood sample or, in 10% of subjects, a mouthwash sample. Full clinical, pathologic, treatment, and outcome data on all cases are maintained within medical records at the British Columbia Cancer Agency. A histologic review of all lymphomas included in this study was conducted by the provincial reference lymphoma pathologist (R.D.G.). This study was approved by the joint University of British Columbia/British Columbia Cancer Agency Research Ethics Board. Table 1 summarizes the characteristics of the 487 cases and 531 control subjects included in this analysis.

Table 1.

Characteristics of the case-control samples

| Characteristic | Case, n (%) | Control, n (%) | Total | SNP2 MAF (allele A; n = 947) | SNP6 MAF (allele T; n = 932) | SNP7 MAF (allele G; n = 937) |
| --- | --- | --- | --- | --- | --- | --- |
| Gender | | | | | | |
| Female | 208 (42.7) | 244 (46.0) | 452 | 0.44 | 0.39 | 0.04 |
| Male | 279 (57.3) | 287 (54.0) | 566 | 0.48 | 0.37 | 0.05 |
| Age group (y) | | | | | | |
| 20-49 | 90 (18.5) | 101 (19.0) | 191 | 0.44 | 0.35 | 0.05 |
| 50-59 | 117 (24.0) | 117 (22.0) | 234 | 0.45 | 0.38 | 0.04 |
| 60-69 | 123 (25.3) | 142 (26.7) | 265 | 0.48 | 0.39 | 0.04 |
| 70+ | 157 (32.2) | 171 (32.2) | 328 | 0.47 | 0.39 | 0.04 |
| Ethnicity | | | | | | |
| Caucasian | 387 (79.5) | 420 (79.1) | 807 | 0.44 | 0.37 | 0.03 |
| Asian | 48 (9.9) | 59 (11.1) | 107 | 0.66 | 0.48 | 0.13 |
| South Asian | 14 (2.9) | 24 (4.5) | 38 | 0.51 | 0.35 | 0.06 |
| Mixed/other | 23 (4.7) | 17 (3.2) | 40 | 0.38 | 0.29 | 0.05 |
| Unknown/refused | 15 (3.1) | 11 (2.1) | 26 | 0.42 | 0.37 | 0.09 |
| Region | | | | | | |
| Vancouver | 402 (82.5) | 408 (76.8) | 810 | 0.46 | 0.37 | 0.05 |
| Victoria | 85 (17.5) | 123 (23.2) | 208 | 0.48 | 0.41 | 0.03 |
| NHL subtypes | | | | | | |
| B-cell lymphomas | 442 (90.8) | | | 0.43 | 0.37 | 0.04 |
| DLBC | 124 (25.5) | | | 0.47 | 0.41 | 0.04 |
| FSC* | 92 (18.9) | | | 0.38 | 0.33 | 0.04 |
| FM/FL* | 48 (9.9) | | | 0.42 | 0.37 | 0.04 |
| MZ/MALT | 48 (9.9) | | | 0.41 | 0.39 | 0.03 |
| MCLD | 31 (6.4) | | | 0.29 | 0.21 | 0.09 |
| SLL | 21 (4.3) | | | 0.53 | 0.50 | 0.00 |
| LPL | 23 (4.7) | | | 0.55 | 0.38 | 0.00 |
| Miscellaneous BCL | 55 (11.3) | | | 0.40 | 0.38 | 0.03 |
| T-cell lymphomas | 44 (9.0) | | | 0.41 | 0.32 | 0.05 |
| MF | 23 (4.7) | | | 0.55 | 0.43 | 0.05 |
| PTCL | 15 (3.1) | | | 0.29 | 0.19 | 0.04 |
| Miscellaneous TCL | 6 (1.2) | | | 0.17 | 0.17 | 0.08 |
| Miscellaneous/NOS | 1 (0.2) | | | 0.00 | 0.00 | 0.00 |
| Total | 487 | 531 | 1,018 | | | |

NOTE: P values for differences in socio-demographic characteristics between cases and controls were as follows: gender, 0.30; age group, 0.88; ethnicity, 0.32; and region, 0.024. All analyses have been adjusted for these four characteristics, however, so the slight regional difference between cases and controls will not affect our results.

Abbreviations: DLBC, diffuse large B-cell lymphoma; FSC, follicular small cell lymphoma; FM, follicular mixed cell; MZ/MALT, marginal zone lymphoma/mucosa-associated lymphoid tissue lymphomas; MCLD, mantle cell lymphoma diffuse; SLL, small lymphocytic lymphoma; LPL, lymphoplasmacytic lymphoma; MF, mycosis fungoides; PTCL, peripheral T-cell lymphoma; BCL, B-cell lymphoma; TCL, T-cell lymphoma; NOS, not otherwise specified; MAF, minor allele frequency.

*Follicular small cell lymphoma and follicular mixed cell/FL were combined for analysis as follicular lymphoma.
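The regional difference reported in the table note can be checked from the Table 1 counts alone. The sketch below assumes an uncorrected Pearson chi-square test on the 2 × 2 region table; the text does not state which test the authors used, so this is an illustration rather than their procedure. The function name `chi_square_2x2` is introduced here for the example.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Region counts from Table 1: rows are Vancouver and Victoria,
# columns are cases and controls.
chi2, p = chi_square_2x2(402, 408, 85, 123)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

Under that assumption the P value comes out close to the 0.024 quoted in the note, which suggests the note and the table counts are mutually consistent.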

DNA Extraction from Case/Control Samples

Genomic DNA was extracted from whole blood and mouthwash samples using the PureGene DNA isolation kit according to the manufacturer's protocols (Gentra Systems). DNA samples were quantified by fluorometry using PicoGreen (Molecular Probes) and a Victor2 fluorescence plate reader (Perkin-Elmer).

Resequencing of the H2AFX Gene

Eight overlapping PCR primer pairs were designed to span the promoter and single exon of the H2AFX gene, including 1,000 bp upstream and 50 bp downstream of the transcribed region. Primer design was done using the H2AFX genomic sequence (accession number BC013416) and Primer3 (primer3_www.cgi v 0.2). Forward and reverse primers incorporated M13 forward or M13 reverse extensions, respectively, at their 5′ ends. The sequences and annealing temperatures of all primers used in bidirectional SNP discovery sequencing are available in Supplementary Table S1. PCR sequencing and sequence analysis procedures were carried out as described previously (14). Haploview (15) was used to calculate and display inter-SNP linkage disequilibrium information derived from sequence data.

SNP Genotyping

Taqman allelic discrimination assays were designed using the Assays-by-Design service (Applied Biosystems). The sequences of primers and probes are shown in Supplementary Table S2. Assays were done in 384-well format with up to 9.6 ng of genomic DNA, 2.5 μL of 2× Taqman Universal PCR Master Mix, 0.125 μL of 40× Taqman probes/primers mix, and 2.375 μL H2O in a total reaction volume of 5 μL. Cycling conditions in the ABI PRISM 7900HT instrument were 95°C for 10 min followed by 40 cycles of 92°C for 15 s and 60°C for 1 min. Fluorescence data were collected post-PCR and analyzed using the SDS2.2 program (Applied Biosystems).
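As a quick arithmetic check on the reaction setup, the sketch below sums the stated liquid volumes and derives the final reagent concentrations. It assumes the genomic DNA contributes no liquid volume (for example, because it was dried in the well); that is an inference from the volumes adding up to 5 μL, not something the text states. The helper `scale_for_plate` and its 10% overage default are hypothetical conveniences.

```python
# Per-reaction liquid volumes in microlitres, as stated in the text.
volumes_ul = {
    "2x Taqman Universal PCR Master Mix": 2.5,
    "40x probe/primer mix": 0.125,
    "water": 2.375,
}
total_ul = sum(volumes_ul.values())

# Final concentration = stock concentration * (volume added / total volume).
master_mix_final_x = 2 * volumes_ul["2x Taqman Universal PCR Master Mix"] / total_ul
probe_mix_final_x = 40 * volumes_ul["40x probe/primer mix"] / total_ul

def scale_for_plate(n_wells=384, overage=0.10):
    """Bulk volumes for one plate; the 10% pipetting overage is a hypothetical choice."""
    factor = n_wells * (1 + overage)
    return {name: round(vol * factor, 1) for name, vol in volumes_ul.items()}

print(total_ul, master_mix_final_x, probe_mix_final_x)
print(scale_for_plate())
```

Both reagents end up at 1× in the 5 μL reaction, which matches the stated total volume.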

Statistical Analysis of Genotype and Haplotype Data

Statistical analysis of genotype data was done using standard methods for case-control studies (16). Tests for trend were done when five or more rare homozygous alleles were present. The primary analyses used logistic regression models to estimate the odds ratio (OR) for the development of NHL with 95% confidence intervals (95% CI) for each of the identified SNPs. ORs were adjusted for age (20-49, 50-59, 60-69, and 70+ years), sex, place of residence, and ethnicity (Caucasian, Asian, South Asian, and mixed/other/unknown). Tests for Hardy-Weinberg equilibrium (HWE) were also carried out. Statistical analysis of the haplotypes was done using The Haplo.stats Package (17), which uses an expectation-maximization–based maximum likelihood approach to simultaneously do haplotype reconstruction by estimation of haplotype weights and estimate ORs and to account for haplotype uncertainty in the risk estimates. Case/control status, non-SNP variables, and diallelic SNP data are used to best estimate the haplotype weights. To evaluate the robustness of risk estimates, we computed the false discovery rate (18). The false discovery rate reflects the expected proportion of false-positive findings to the total number of significant findings and was computed using the P trends for each genotype (n = 3 comparisons) for P < 0.05.
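To illustrate what a false discovery rate adjustment does to a small family of P values, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. That reference (18) corresponds to this particular procedure is an assumption, and the input values are hypothetical apart from 0.003, which is the SNP2 trend P value reported in the Results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical trend P values for three SNPs (n = 3 comparisons);
# only 0.003 is taken from the text, the other two are illustrative.
example_p = [0.003, 0.40, 0.75]
print(benjamini_hochberg(example_p))
```

With three comparisons, the smallest P value is multiplied by 3, so a trend P value of 0.003 would remain well below 0.05 after adjustment.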

Methylation Analysis

We assessed the methylation state of one H2AFX SNP in blood genomic DNA using a methylation-sensitive HpaII restriction assay. SNP2 is within a recognition site (CCGG) for HpaII. If the C residue of the CpG dinucleotide within the HpaII site is methylated, digestion cannot occur; if it is unmethylated, digestion can take place. HpaII digestion of genomic DNA followed by PCR using primers flanking SNP2 was used to test the methylation status of this HpaII site, which was the only one within the PCR amplicon. Production of PCR product indicates that the CpG site of SNP2 was methylated.

Results

Discovery and Validation of H2AFX Variants in Constitutional DNA of NHL Cases

Bidirectional sequencing was used to detect and validate genetic variants in the H2AFX gene, including 1,000 bp upstream and 50 bp downstream of the transcribed region. Blood DNA samples from 95 young NHL cases (ages 20-59) were used, with the rationale that younger patients are more likely than older ones to have a greater genetic component to their disease. Given that NHL is a complex disease involving multiple genetic and environmental factors, we expect that these 95 cases represent an etiologically heterogeneous group and that, at most, only a subset of them may have developed NHL because of defects in H2AFX. For this reason, we expect that normal copies of the H2AFX gene are represented within this set of individuals. SNP discovery was done in affected individuals to maximize the chances of detecting NHL-relevant variants. Comparison with control samples was not done at the sequencing stage of this study.

Figure 1 illustrates the structure of H2AFX, a single-exon gene at 11q23.3, and summarizes its genetic variants and their frequencies. Seven SNPs were detected: three 5′ to the transcribed region, three in the 3′-untranslated region, and one in the 3′ flanking region. No insertions, deletions, or microsatellite sequences were noted in the 2,700 bp region sequenced. All but two of the seven SNPs had a minor allele frequency (MAF) >30%. SNP7 had a MAF of 6% and SNP3 was rare, with MAF 0.5% in our sequence data. The six most frequent SNPs were represented in dbSNP. Corresponding dbSNP accession numbers are shown.

Figure 1.

Figure 1. Structure of the single-exon H2AFX gene indicating the positions of seven verified SNPs. The transcribed region is boxed, with the coding region in black and the 5′-untranslated region and 3′-untranslated region hatched. SNP locations are indicated relative to the start codon, with the A of the initiator codon as position +1. Major alleles are indicated to the left and minor alleles to the right based on the allele frequency observed in the SNP discovery set of 95 individuals. dbSNP accession numbers are shown where applicable. The diagram is not to exact scale.

Resequencing data are summarized in Fig. 2A, which indicates the genotype of each of the 95 samples sequenced at each of the seven SNPs. Figure 2A was generated using the Visual Genotype (VG2) program (19, 20). Linkage disequilibrium is apparent, but incomplete, between the five most common SNPs, as illustrated by the co-occurrence of heterozygosity (or homozygosity for the minor allele) at these SNPs in many but not all samples. SNP7 alleles are not consistently seen in association with specific alleles of the other SNPs. SNP3 seems to have arisen on the background of a haplotype containing the major alleles of the other six SNPs. Figure 2B shows the inter-SNP linkage disequilibrium calculated using Haploview (15) for this set of 95 samples. SNP3 was excluded from this analysis. A block of linkage disequilibrium (r² ≥ 0.67) encompasses SNP1, SNP2, SNP4, SNP5, and SNP6 but excludes SNP7. Linkage disequilibrium is highest between SNP1, SNP4, SNP5, and SNP6; linkage disequilibrium between SNP2 and these other four SNPs is lower (r² = 0.67-0.75), perhaps indicating that SNP2 is a different age (likely younger) than the other four SNPs in this region of linkage disequilibrium. Additional genetic variants in H2AFX not seen in our resequenced samples are reported in dbSNP. Many of these were observed in individuals of African ancestry, however, an ethnic group that is very uncommon in our study population.
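The r² statistic quoted above can be reproduced from allele and haplotype frequencies with the standard definition r² = D² / [p_A(1 − p_A) p_B(1 − p_B)], where D is the haplotype frequency minus the product of the two allele frequencies. The sketch below is illustrative only. It uses approximate pooled minor allele frequencies for SNP2 and SNP6 from the case-control set in Table 1 (not the 95 resequenced cases), and it assumes that every SNP6 T allele sits on a haplotype that also carries the SNP2 A allele.

```python
def r_squared(p_a, p_b, p_ab):
    """Pairwise LD r^2 from two allele frequencies and the frequency of the
    haplotype that carries both alleles."""
    d = p_ab - p_a * p_b  # coefficient of disequilibrium D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Approximate pooled minor allele frequencies (SNP2 allele A, SNP6 allele T).
p_snp2_a, p_snp6_t = 0.46, 0.38

# ASSUMPTION: T never occurs without A, so the A-T haplotype frequency
# equals the T allele frequency.
r2 = r_squared(p_snp2_a, p_snp6_t, p_ab=p_snp6_t)
print(round(r2, 2))
```

Under these assumptions the value falls inside the 0.67-0.75 range reported for SNP2 against the other block SNPs, which is what one would expect if SNP2 arose on a subset of the haplotypes carrying the older variants.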

Figure 2.

Figure 2. A. Genotype data derived from resequencing of H2AFX in 95 individual cases sorted based on ethnicity were summarized using the Visual Genotype (VG2) program. B. H2AFX inter-SNP linkage disequilibrium. r² between pairs of SNPs was calculated based on the 95 resequenced individual cases using Haploview v3.11 (http://www.broad.mit.edu/mpg/haploview). SNP3 was excluded from the analysis because of its low frequency. SNP7 is not in linkage disequilibrium with any of the other SNPs. A block of high linkage disequilibrium, including SNP1, SNP2, SNP4, SNP5, and SNP6, is outlined in black.

Genotyping of H2AFX SNPs

SNP3 was excluded from genotyping due to its rarity. Taqman assays were successfully designed for SNP2, SNP6, and SNP7; assay design failed for SNP1, SNP4, and SNP5 (see Supplementary Table S2 for a summary of Taqman assays and reaction conditions). We speculate that the high GC content of the 5′ region may have interfered with assay design for SNP1 and that the sequence similarity between different histone genes may have complicated design of gene-specific assays for SNP4 and SNP5. Based on our sequence-based linkage disequilibrium data, however, SNP6 is expected to be a good tagSNP (21) to act as a proxy for SNP1, SNP4, and SNP5 because linkage disequilibrium between these four variants is high. SNP2, SNP6, and SNP7 are expected to act as good tagSNPs to represent all of the variants detected in H2AFX by resequencing in individuals from our population of interest. SNP2, SNP6, and SNP7 were genotyped in the set of 487 cases and 531 controls. All three SNPs had a genotyping failure rate of 7% to 8% due to low template DNA quantity for a subset of the samples. The DNA used for the study was high-quality, high-molecular-weight DNA; however, some samples were limited in quantity. DNAs were quantitated by fluorometry using PicoGreen dye; this procedure cannot reliably quantitate DNA concentrations below 0.25 ng/μL, although these DNAs can be used successfully for PCR-based techniques. Genotype call rates correlated with amount of template DNA used. Samples for which 1.25 ng or more of DNA were used in each genotyping reaction had a call rate of 97%; samples genotyped using less template had a call rate of 64%. The genotype call rate did not differ between cases and controls. Assay validation and monitoring of data quality is particularly important in case/control studies where family relationships cannot be used to deduce inconsistencies in the genotype data. For this reason, the 95 DNA samples used in SNP discovery phase were also genotyped. The data derived from sequencing and the genotyping of these 95 samples were compared and one inconsistency was resolved before the data were used in statistical analyses. DNA samples from members of five large three-generation families from the Centre d'Etude du Polymorphisme Humain collection, purchased from Coriell Cell Repositories, were also genotyped. Verification of Mendelian inheritance in these families provided an additional quality control measure.
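The per-SNP genotyping failure rate can be recomputed from the sample counts given in the Table 1 column headers. The sketch below assumes that each header value (n = 947, 932, and 937) is the number of samples successfully genotyped out of all 1,018 cases and controls; that reading of the headers is an interpretation, not something the text spells out.

```python
TOTAL_SAMPLES = 1018  # 487 cases + 531 controls

# n values from the Table 1 column headers, read here as successfully
# genotyped samples (an assumption about what the headers report).
genotyped = {"SNP2": 947, "SNP6": 932, "SNP7": 937}

failure_rates = {
    snp: 1 - n_called / TOTAL_SAMPLES for snp, n_called in genotyped.items()
}

for snp, rate in failure_rates.items():
    print(f"{snp}: call rate {1 - rate:.1%}, failure rate {rate:.1%}")
```

On that reading, all three assays fail in roughly 7% to 8% of samples.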

HWE was tested in controls only, using exact methods (22) in each major ethnic group in the study (Caucasians, Asians, and South Asians) separately, for each of the three SNPs genotyped. The ethnic distribution of study participants corresponds to that of the geographic areas in which sample collection took place (Vancouver and Victoria, British Columbia). SNP2, for which we saw the major positive results of this study, was in HWE in each of the three ethnic groups. SNP6 showed a barely significant deviation from HWE in Caucasians (P = 0.044), and SNP7 deviated from HWE in Asians (P = 0.025) and South Asians (P = 0.025). These deviations do not remain significant, however, when correction is made for multiple comparisons. The deviations from HWE of SNP7, which is rare, in the two smallest groups may not be surprising. Many of the Asians and South Asians who come to British Columbia are from different countries and regions. This recently immigrated group may not yet be in population equilibrium (“random mating”), so HWE may not be a valid expectation in this case, particularly for low-frequency SNPs.
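For readers unfamiliar with exact tests of HWE, the following sketch shows one common formulation: condition on the observed allele counts, compute the probability of every possible heterozygote count, and sum the probabilities of all outcomes no more likely than the one observed. Whether reference (22) uses exactly this formulation is an assumption, and the genotype counts in the usage lines are hypothetical.

```python
from math import exp, lgamma, log

def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact HWE P value: total probability of heterozygote counts that are
    no more likely than the observed count, given the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2) + lgamma(n + 1)
                - lgamma(hom_a + 1) - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele count.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    observed = log_prob(n_ab)
    total = sum(exp(log_prob(h)) for h in candidates
                if log_prob(h) <= observed + 1e-9)
    return min(1.0, total)

# Hypothetical genotype counts: one sample in perfect HWE proportions,
# one with a complete absence of heterozygotes.
print(hwe_exact_p(25, 50, 25))
print(hwe_exact_p(50, 0, 50))
```

Exact methods of this kind are preferred over the chi-square approximation when a genotype class is small, as happens for a low-frequency SNP such as SNP7 in the smaller ethnic groups.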

Association Tests of H2AFX SNPs

Table 2 summarizes the results of association tests of SNP2, SNP6, and SNP7 with NHL. These multivariate analyses were adjusted for age, sex, ethnicity, and region of residence. The AA genotype of SNP2 is associated with protection from NHL, with an OR of 0.54 (95% CI, 0.37-0.79; P = 0.001), indicating that this genotype is associated with a near halving of risk for NHL. Heterozygosity at SNP2 gave an OR of 0.92 that was not statistically significant, although the trend was significant (P = 0.003). SNP6 and SNP7 did not show association with NHL.
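A reported OR, its 95% CI, and its P value can be cross-checked against one another. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which is typical of logistic regression output but is not stated explicitly in the text. The function name `p_from_or_ci` is introduced here for the example.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided Wald P value recovered from an OR and its 95% CI."""
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area

# SNP2 AA genotype versus NHL overall, values as reported in the text.
p_snp2_aa = p_from_or_ci(0.54, 0.37, 0.79)
print(f"approximate P = {p_snp2_aa:.4f}")
print(f"relative reduction in odds = {1 - 0.54:.0%}")
```

The recovered value is of the same order as the reported P = 0.001; small differences are expected because the published OR and CI bounds are rounded to two decimals.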

Table 2.

ORs and 95% CIs for association tests of SNPs with NHL

Table 2 also summarizes the results of association tests of SNP2, SNP6, and SNP7 with some subtypes of NHL as well as groups of subtypes. The AA genotype of SNP2 was also inversely associated with a group of cases encompassing all B-cell NHL cases, including follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBC), with OR of 0.55 (95% CI, 0.37-0.81; P = 0.002). This association with B-cell NHL may be driven by inclusion of FL, which, on its own, showed a significant inverse association of the AA genotype with OR of 0.40 (95% CI, 0.21-0.74; P = 0.004). Another B-cell subtype of NHL, mantle cell lymphoma (MCL), also gave a significant association with OR of 0.20 (95% CI, 0.05-0.72; P = 0.01), although the numbers of samples available for analysis of this less common subtype were limited. The last analysis that produced a significant association, again with the SNP2 AA genotype, was “all other NHL,” a group including all NHL cases except the most common two types FL and DLBC. This association may be driven by inclusion of MCL in this heterogeneous group. In spite of having similar numbers of cases as FL, and therefore comparable power to detect association, DLBC did not show association with any of the SNPs tested. SNP6 was only associated with MCL; SNP7 was not associated with NHL or any subtype thereof.

These results show that the association of H2AFX SNP2 genotypes is subtype specific and is evident for FL and possibly MCL but not DLBC. The P values for trend remained significant even after adjustment for the false discovery rate for SNP2 with NHL, all B-cell NHL, all other NHL (excluding FL and DLBC), FL, and MCL.

T-cell types of NHL were not associated with H2AFX SNPs, either as a group (all T-cell NHL together) or as separate subtypes. Other NHL subtypes (marginal zone lymphoma/mucosa-associated lymphoid tissue, small lymphocytic lymphoma, lymphoplasmacytic lymphoma, miscellaneous B-cell lymphomas, mycosis fungoides, peripheral T-cell lymphoma, and miscellaneous T-cell lymphomas) were also tested for association with H2AFX SNPs (see Supplementary Table S3), but the borderline P values noted in two comparisons did not remain significant after adjustment for the false discovery rate.

We also tested the association of SNP2 with NHL separately in Caucasians, Asians, and South Asians and found the protective effect of homozygosity for the A allele to be statistically significant only in Caucasians (OR, 0.56; 95% CI, 0.37-0.85; P = 0.007; data not shown). This is not surprising as Caucasians account for 79% of our study population, and the smaller number of samples in the other ethnic groups significantly reduces the statistical power to detect association. The association of SNP2 with NHL is unlikely to be due to population stratification in this study because detailed ethnicity data (of all four grandparents of each study subject) were used to divide groups for separate analysis or for adjustment in combined analyses.

Haplotype Analysis of H2AFX

Additional methodologic controls in the form of 72 normal reference DNA samples (derived from members of five three-generation Centre d'Etude du Polymorphisme Humain reference families) were also genotyped. Inclusion of these samples provides a consistency check for probabilistic haplotype analysis in the case/control samples. Genotypes from the families were used to deduce the coinheritance of SNP markers and haplotypes. Haplotypes determined in the Centre d'Etude du Polymorphisme Humain families corresponded in every case to common haplotypes deduced by probabilistic methods.

Table 3 summarizes the results of association tests of haplotypes of H2AFX SNP2, SNP6, and SNP7 with NHL, predicted using the Haplo.stats package (17). Additional analyses were done using the hapassoc and PHASE 2.0 programs (Shumansky and Spinelli, in preparation), with comparable results. Three common haplotypes (GCA, ACA, and ATA) and four rare haplotypes were inferred. Rare haplotypes (with individual frequencies of <5%) were combined in the association test. A significant protective effect was observed for haplotypes ACA (OR, 0.51; 95% CI, 0.35-0.75; P = 0.001) and ATA (OR, 0.80; 95% CI, 0.66-0.97; P = 0.03) with NHL. All three common haplotypes (including the referent haplotype) contained the A allele of SNP7, making SNP7 unlikely to contribute to the protective effects of these haplotypes. We therefore also tested the association of the haplotypes formed by SNP2 and SNP6 only with NHL (Table 3). The haplotypes AC and AT were both significantly protective against NHL, a result that may be driven by the presence of the SNP2 A allele in all haplotypes inversely associated with NHL. The haplotype-based association test results in different NHL subtypes and groups paralleled those seen with SNP2, with positive associations seen in NHL, all B-cell NHL, all other B-cell NHL without FL and DLBC, FL, and MCL, but not DLBC (Table 3).

Table 3. Association tests of haplotypes of H2AX SNPs with NHL and NHL subtypes

Haplotype association analysis of H2AFX showed that the A allele of SNP2, along with both the C and T alleles of SNP6, is significantly associated with protection from NHL and specifically from the FL subtype. We have already described how individuals who are homozygous AA for SNP2 alone are protected against NHL and specifically FL and possibly MCL (Table 2). No significant association was found when SNP6 was analyzed alone, however (Table 2). This suggests that the major protective effect in this haplotype analysis is derived from the A allele of SNP2. Linkage disequilibrium between SNP2 and SNP6 is incomplete (_r_² = 0.70; Fig. 2B), suggesting either that the ages of these variants are different or that some ancestral recombination has occurred between them.
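For readers unfamiliar with the r² measure of linkage disequilibrium quoted above, the following is a minimal sketch of how it is computed from two-SNP haplotype frequencies. The function name and the haplotype frequencies are hypothetical and purely illustrative; they are not the frequencies estimated in this study, and the sketch assumes both SNPs are biallelic and polymorphic.

```python
def ld_r_squared(hap_freqs):
    """r^2 between two biallelic SNPs from haplotype frequencies.

    hap_freqs maps two-letter haplotypes (first SNP allele followed by
    second SNP allele) to frequencies that sum to 1.
    """
    # Pick one reference allele at each SNP (alphabetically first).
    a = sorted({h[0] for h in hap_freqs})[0]
    b = sorted({h[1] for h in hap_freqs})[0]
    p_a = sum(f for h, f in hap_freqs.items() if h[0] == a)
    p_b = sum(f for h, f in hap_freqs.items() if h[1] == b)
    p_ab = hap_freqs.get(a + b, 0.0)
    # D is the deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical SNP2-SNP6 haplotype frequencies (illustration only).
example_freqs = {"GC": 0.60, "AC": 0.08, "AT": 0.30, "GT": 0.02}
example_r2 = ld_r_squared(example_freqs)
```

An r² of 1 would mean that each allele of one SNP always travels with one allele of the other; values below 1 arise when, as here, one allele (SNP2 A) is found on haplotypes carrying either allele of the second SNP.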

Evolutionary Conservation and the Origin of H2AFX SNP2

The SNP2 G/A variant corresponds to the G residue of a CpG dinucleotide. The C residue of a CpG dinucleotide can be methylated through an epigenetic regulatory system that affects gene expression in mammals. SNP2 is near the 5′ end of an extensive CpG island, detected using CpG Island Searcher (23), that spans the 5′-untranslated region, promoter, and entire transcribed region of H2AFX (data not shown). Methylation of cytosine creates 5-methylcytosine (24, 25), which can spontaneously deaminate to thymine. Consequently, C to T transitions accumulate over evolutionary time scales (26, 27). If a methylcytosine residue deaminates to thymine, it will be paired with adenine on the opposite strand and the original CpG dinucleotide will become CpA. This process can create a G/A SNP immediately following a C nucleotide.
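The strand logic of this mechanism can be traced with a short sketch. The sequence and function names below are hypothetical and are not the H2AFX sequence; the sketch simply assumes that the methylated C on the opposite strand deaminates to T and that replication then fixes the change on both strands.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def deaminate_opposite_strand_cpg(top, cpg_index):
    """Top strand after the opposite-strand C of a CpG deaminates to T.

    cpg_index is the 0-based position of the C of the CpG on the top strand.
    """
    if top[cpg_index:cpg_index + 2] != "CG":
        raise ValueError("no CpG at the given position")
    bottom = reverse_complement(top)
    # The opposite-strand C pairs with the top-strand G of the CpG.
    bottom_c = len(top) - 1 - (cpg_index + 1)
    mutated_bottom = bottom[:bottom_c] + "T" + bottom[bottom_c + 1:]
    # After replication the top strand is the complement of the mutated strand.
    return reverse_complement(mutated_bottom)


# Hypothetical sequence containing a single CpG at index 2.
before = "GACGTT"
after = deaminate_opposite_strand_cpg(before, 2)
```

In this toy example the top-strand CpG becomes CpA, that is, a G to A change immediately following a C, which is the configuration described for SNP2.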

To gain insight into the origin of SNP2, we did a multispecies alignment (Fig. 3) of the sequence surrounding this polymorphic site. Sequences from one gorilla, one orangutan, one baboon, one gibbon, and 20 chimpanzees were generated from primate DNA samples using the same primers and reaction conditions as for human H2AFX amplicon 3. Although not transcribed, this region is highly conserved. All primate sequences have an A at the position of human SNP2; dog shows a C and mouse and rat both have a G nucleotide. All 20 chimp sequences were identical, so it is unlikely that this position corresponds to a common polymorphism in the chimpanzee. Given that only one DNA sample was compared for each of the other species, we cannot assume that this site is invariant in these species. In no genome except human, however, was this position annotated as being polymorphic.

Figure 3. Multispecies alignment of the region surrounding H2AFX SNP2 created using ClustalW (http://www.ebi.ac.uk/clustalw). Accession numbers are shown where applicable; gorilla, orangutan, baboon, and gibbon were sequenced in this study. Asterisk, position identical in all species shown; R, an A or G residue.

If SNP2 arose by a methylation-based mechanism, we would expect the G form to be the ancestral allele. Given that mouse and rat have a G residue at the position of SNP2, it is possible that the nearest common ancestor of these mammals bore a G at this position and that most primates and dog mutated at this position to A and C, respectively. The opposite hypothesis is also possible: that the original form in a common primate ancestor was “A” and that a G allele has arisen in the human population. Mutation from G to A at this position is likely to occur much more frequently, however, than mutation from A to G (28), given the effect of cytosine methylation. The position of SNP2 upstream of the H2AFX transcription start site, within a CpG island and conserved sequence region, supports its possible relevance to the regulation of this gene.

Methylation Analysis of SNP2 in Blood DNA of Cases and Controls

If the C residue adjacent to SNP2 were highly methylated, this could lead to relative transcriptional silencing of H2AFX and compromise of DNA DSB repair. We assessed the methylation state of this residue in blood genomic DNA using a methylation-sensitive _Hpa_II restriction assay. AA homozygotes were excluded because the A allele is not a _Hpa_II restriction site. AG heterozygotes were excluded for simplicity of interpretation of the assay. The methylation status of all GG homozygotes in the case/control group was assayed using blood genomic DNA. Only 5 (4 cases and 1 control) of 247 samples tested gave a PCR product after digestion (data not shown), indicating that this site seems unmethylated in blood genomic DNA in a large proportion of both case and control samples. Given that blood genomic DNA is derived from a mix of many different cell types, this experiment likely indicates that the H2AFX SNP2 CpG site is unmethylated in the most abundant DNA-containing cell types in blood. However, this result may not reflect the methylation status of this site in the cell types of origin of NHL, FL, or MCL.
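The logic of the assay, and the reason AA homozygotes are uninformative, can be summarized in a small sketch. It relies on the known properties of _Hpa_II (recognition sequence CCGG, cleavage blocked by methylation of the internal C). The flanking sequences and function name are hypothetical, and the sketch assumes a single _Hpa_II site within the amplicon.

```python
HPAII_SITE = "CCGG"  # HpaII recognition sequence


def template_survives_hpaii(amplicon, cpg_methylated):
    """True if the template stays intact after digestion, so PCR gives a product."""
    has_site = HPAII_SITE in amplicon
    # HpaII cuts only when the site is present and its internal C is unmethylated.
    return (not has_site) or cpg_methylated


# Hypothetical flanking sequences for the two SNP2 alleles (not the real locus).
g_allele = "TTACCGGTA"  # CpG intact, HpaII site present
a_allele = "TTACCAGTA"  # G to A change destroys the HpaII site

# Proportion of GG samples that gave a PCR product after digestion (from the text).
fraction_with_product = 5 / 247
```

Under these assumptions an A-allele template always yields a product regardless of methylation, whereas a G-allele template yields a product only when methylated; the observed proportion of GG samples with a product is about 2%.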

Discussion

This study seeks to address the effect of genetic variation throughout the H2AFX gene by cataloguing the variants present in NHL cases and conducting association tests of SNPs and haplotypes in NHL cases and controls. This strategy is in contrast to studies of candidate polymorphisms, which examine one or a few potentially functionally relevant variants in one or several genes. Although candidate polymorphism studies can be fruitful and economical, they can result in prematurely dismissing genes as candidates for disease if a causal SNP is not genotyped in the study and is not in high linkage disequilibrium with a genotyped SNP. Our strategy aims to address association at the level of the gene.

We have shown that a G/A SNP 417 bp upstream of the start codon of H2AFX is associated with NHL. The AA genotype is associated with protection from lymphoma; conversely, the GG genotype can be considered to increase risk. The association is statistically significant in the overall NHL case/control comparison as well as in Caucasian cases and controls. Additional studies in other populations will be required to determine if this variant site has an effect in other ethnicities. When the two most common types of NHL are assessed separately, the association is present in FL but absent in DLBC. An additional significant association with MCL is based on a small number of cases. The numbers of cases with other NHL types are too small for this study to provide a definitive assessment of them. Analysis of less common NHL types will likely require the pooling of data from multiple lymphoma population studies. Larger studies will also be required to determine if SNP2 heterozygotes have an intermediate level of protection. The power of data pooling in association studies was exemplified recently for the TNF and IL-10 genes (29).

HapMap (30) data show that linkage disequilibrium is very high for at least 40 kb upstream of H2AFX, so, alternatively, the SNP2 association could be due to a gene in linkage disequilibrium with H2AFX. Genotyping of additional SNPs in this 40 kb region will help to clarify the SNP2 association.

H2AFX is critical for protection against DNA DSBs that can lead to lymphoma. Our observation that a DNA variant in the human H2AFX gene is associated with protection/risk of lymphoma makes sense in light of the observation that haploinsufficiency for this gene leads to B-cell lymphomas in mice (12, 13). Genes that show haploinsufficiency (dosage dependence) are particularly intriguing as candidates for human disease variants; genetic variants causing slight variation in the activity or expression level of such genes could contribute to disease susceptibility in human populations.

We speculate that the AA genotype of SNP2 is associated with greater transcription of this DNA repair gene in a cell type or developmental compartment relevant for lymphomagenesis and that the protective “A” allele is less easily silenced by normal or aberrant methylation. Determination of whether SNP2 genotype correlates with H2AFX expression level or function will be important for understanding the effect of this variant. Given that the SNP2 association is observed in FL but not in DLBC, the relevant cell type may be follicle center B cells that are undergoing immunoglobulin class switching or somatic Ig mutation. The observation that the B-cell lymphomas of _H2afx_-deficient mice have predominantly IgH/c-myc translocations is consistent with such a hypothesis (12, 13). If the A allele were advantageous for DNA repair, however, why would the G allele be more common? The G allele, which may allow methylation of the adjacent C residue, could offer a different advantage by allowing more finely tuned control of the times and precise compartment of expression of H2AFX. The immune system, which needs to be exquisitely responsive at certain times (to maximize the immunoglobulin diversity necessary to resist infection) but turned off under appropriate circumstances (to avoid autoimmunity), could benefit from such fine control. Better DSB repair, however, may not necessarily be optimal when applied to the immune system. It is possible that down-regulation of H2AFX (possibly by methylation) during immunoglobulin switching helps enhance the diversity of the antibody repertoire and that this balances the potential negative consequences of “accidentally” down-regulating this gene, by methylation, in other tissues. Assessment of the overall methylation status of H2AFX with different SNP2 genotypes in normal and abnormal germinal center B cells would help determine if this is the case.

Other studies have recently found associations of nonhomologous end-joining genes, which are important in protection against DSBs, with both NHL (31) and multiple myeloma (32), supporting a role for DSB repair genes in NHL risk and protection.

The AA genotype of H2AFX SNP2, or G(−417)A, is associated with decreased risk of NHL (OR, 0.54), FL (OR, 0.4), and possibly MCL relative to the GG genotype. The risk for the GG genotype relative to the AA genotype can be expressed by the reciprocal ORs for NHL (OR, 1.85) and FL (OR, 2.5). The attributable risk of NHL for individuals who have the GG genotype is ∼46%. The GG genotype of this SNP is very common and is seen in 33% of NHL cases and in 35% of FL cases in this study. Thus, the population attributable risk for this SNP is ∼15% for NHL and 21% for FL. Although this SNP has a modest effect in terms of approximately doubling risk, its high frequency means that it is potentially a very important susceptibility factor for NHL, FL, and potentially MCL. It will be important to test this association in other populations not only to try to replicate this finding but also to verify whether the GG genotype is common in other populations.
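The figures in this paragraph can be reproduced with standard case-based formulas, treating the OR as an approximation of relative risk. The section does not state which formulas the authors used, so the sketch below is an assumption that happens to match the reported values; the function names are illustrative.

```python
def reciprocal_or(or_value):
    """OR for the reference genotype relative to the protective genotype."""
    return 1.0 / or_value


def attributable_fraction_exposed(or_risk):
    """Fraction of disease among carriers of the risk genotype attributable to it."""
    return (or_risk - 1.0) / or_risk


def population_attributable_fraction(case_fraction_exposed, or_risk):
    """Case-based population attributable fraction."""
    return case_fraction_exposed * attributable_fraction_exposed(or_risk)


# Values quoted in the text: OR for AA vs GG, and the GG frequency among cases.
or_gg_nhl = reciprocal_or(0.54)  # about 1.85
or_gg_fl = reciprocal_or(0.40)   # 2.5

ar_gg_nhl = attributable_fraction_exposed(or_gg_nhl)          # about 0.46
par_nhl = population_attributable_fraction(0.33, or_gg_nhl)   # about 0.15
par_fl = population_attributable_fraction(0.35, or_gg_fl)     # about 0.21
```

Because the attributable fraction among GG carriers equals 1 minus the AA-versus-GG OR, the 46% figure follows directly from the OR of 0.54.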

Grant support: National Cancer Institute of Canada, Canadian Institutes of Health Research, and Chan Sisters Foundation.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Acknowledgments

We thank Diana Palmquist for sequence assembly, Dixie Mager and Peter Parham for primate genomic DNAs, Jennifer Roger and Johanna Schuetz for assistance with DNA extractions, and Matthew Bainbridge and Steve Jones for discussions about multispecies alignments.

References

2. Parkin DM, Pisani P, Ferlay J. Estimates of the worldwide incidence of eighteen major cancers in 1985. Int J Cancer 1993;54:594–606.

4. Hjalgrim H, Frisch M, Begtrup K, Melbye M. Recent increase in the incidence of non-Hodgkin's lymphoma among young men and women in Denmark. Br J Cancer 1996;73:951–4.

5. Cartwright RA. Changes in the descriptive epidemiology of non-Hodgkin's lymphoma in Great Britain? Cancer Res 1992;52:5441–2s.

6. Vanasse GJ, Concannon P, Willerford DM. Regulated genomic instability and neoplasia in the lymphoid lineage. Blood 1999;94:3997–4010.

7. Fernandez-Capetillo O, Lee A, Nussenzweig M, Nussenzweig A. H2AX: the histone guardian of the genome. DNA Repair (Amst) 2004;3:959–67.

8. Rogakou EP, Pilch DR, Orr AH, Ivanova VS, Bonner WM. DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139. J Biol Chem 1998;273:5858–68.

9. Burma S, Chen BP, Murphy M, Kurimasa A, Chen DJ. ATM phosphorylates histone H2AX in response to DNA double-strand breaks. J Biol Chem 2001;276:42462–7.

10. Fernandez-Capetillo O, Chen HT, Celeste A, et al. DNA damage-induced G2-M checkpoint activation by histone H2AX and 53BP1. Nat Cell Biol 2002;4:993–7.

11. Celeste A, Petersen S, Romanienko PJ, et al. Genomic instability in mice lacking histone H2AX. Science 2002;296:922–7.

12. Bassing CH, Suh H, Ferguson DO, et al. Histone H2AX: a dosage-dependent suppressor of oncogenic translocations and tumors. Cell 2003;114:359–70.

13. Celeste A, Difilippantonio S, Difilippantonio MJ, et al. H2AX haploinsufficiency modifies genomic stability and tumor susceptibility. Cell 2003;114:371–83.

14. Brooks-Wilson AR, Kaurah P, Suriano G, et al. Germline E-cadherin mutations in hereditary diffuse gastric cancer: assessment of 42 new families and review of genetic screening criteria. J Med Genet 2004;41:508–17.

15. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005;21:263–5.

16. Breslow NE, Day NE. Statistical methods in cancer research. Volume I—The analysis of case-control studies. IARC Sci Publ 1980;32:5–338.

17. Burkett K, McNeney B, Graham J. A note on inference of trait associations with SNP haplotypes and other attributes in generalized linear models. Hum Hered 2004;57:200–6.

18. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B 1995;57:289–300.

19. Nickerson DA, Taylor SL, Weiss KM, et al. DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat Genet 1998;19:233–40.

20. Rieder MJ, Taylor SL, Clark AG, Nickerson DA. Sequence variation in the human angiotensin converting enzyme. Nat Genet 1999;22:59–62.

21. Johnson GC, Esposito L, Barratt BJ, et al. Haplotype tagging for the identification of common disease genes. Nat Genet 2001;29:233–7.

22. Emigh TH. A comparison of tests for Hardy-Weinberg equilibrium. Biometrics 1980;36:627–42.

23. Takai D, Jones PA. The CpG island searcher: a new WWW resource. In Silico Biol 2003;3:235–40.

24. Doerfler W. DNA methylation and gene activity. Annu Rev Biochem 1983;52:93–124.

25. Bird A. The essentials of DNA methylation. Cell 1992;70:5–8.

26. Duncan BK, Miller JH. Mutagenic deamination of cytosine residues in DNA. Nature 1980;287:560–1.

27. Bird AP, Taggart MH, Nicholls RD, Higgs DR. Non-methylated CpG-rich islands at the human α-globin locus: implications for evolution of the α-globin pseudogene. EMBO J 1987;6:999–1004.

28. Zhang Z, Gerstein M. Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes. Nucleic Acids Res 2003;31:5338–48.

29. Rothman N, Skibola CF, Wang SS, et al. Genetic variation in TNF and IL10 and risk of non-Hodgkin lymphoma: a report from the InterLymph Consortium. Lancet Oncol 2006;7:27–38.

30. Altshuler D, Brooks LD, Chakravarti A, Collins FS, Daly MJ, Donnelly P; The International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437:1299–320.

31. Hill DA, Wang SS, Cerhan JR, et al. Risk of non-Hodgkin lymphoma (NHL) in relation to germline variation in DNA repair and related genes. Blood 2006;108:3161–7.

32. Roddam PL, Rollinson S, O'Driscoll M, Jeggo PA, Jack A, Morgan GJ. Genetic variants of NHEJ DNA ligase IV can affect the risk of developing multiple myeloma, a tumour characterised by aberrant class switch recombination. J Med Genet 2002;39:900–5.

