A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC

Nat Genet. 2006 Oct;38(10):1166-72.

doi: 10.1038/ng1885. Epub 2006 Sep 24.

Paul I W de Bakker, Gil McVean, Pardis C Sabeti, Marcos M Miretti, Todd Green, Jonathan Marchini, Xiayi Ke, Alienke J Monsuur, Pamela Whittaker, Marcos Delgado, Jonathan Morrison, Angela Richardson, Emily C Walsh, Xiaojiang Gao, Luana Galver, John Hart, David A Hafler, Margaret Pericak-Vance, John A Todd, Mark J Daly, John Trowsdale, Cisca Wijmenga, Tim J Vyse, Stephan Beck, Sarah Shaw Murray, Mary Carrington, Simon Gregory, Panos Deloukas, John D Rioux

Paul I W de Bakker et al. Nat Genet. 2006 Oct.

Abstract

The proteins encoded by the classical HLA class I and class II genes in the major histocompatibility complex (MHC) are highly polymorphic and are essential in self versus non-self immune recognition. HLA variation is a crucial determinant of transplant rejection and susceptibility to a large number of infectious and autoimmune diseases. Yet identification of causal variants is problematic owing to linkage disequilibrium that extends across multiple HLA and non-HLA genes in the MHC. We therefore set out to characterize the linkage disequilibrium patterns between the highly polymorphic HLA genes and background variation by typing the classical HLA genes and >7,500 common SNPs and deletion-insertion polymorphisms across four population samples. The analysis provides informative tag SNPs that capture much of the common variation in the MHC region and that could be used in disease association studies, and it provides new insight into the evolutionary dynamics and ancestral origins of the HLA loci and their haplotypes.

Figures

Figure 1

The relationship between recombination rates and haplotype structure spanning the 7.5 Mb extended MHC region (defined by the SLC17A2 gene at the telomeric end to the DAXX gene at the centromeric end of chromosome 6). Recombination rates (blue lines, in cM/Mb) were estimated separately from each population and combined to provide a single estimate for the region. Recombination hotspots with strong statistical support in all three analysis panels (in red) or in two (in pink) are indicated by the red triangles and vertical grey lines. The average recombination rate across the region is 0.44 cM/Mb, compared to a genome-wide average of 1.2 cM/Mb, and is particularly low in the 3 Mb region that includes the olfactory receptor gene cluster, which has only two hotspots with strong statistical evidence. The horizontal lines indicate the extent of non-redundant haplotypes (see text for details) identified in each analysis panel (YRI: green, CEU: orange and CHB+JPT: purple). Haplotypes are typically longer in regions of low recombination and are often, though not always, interrupted by recombination hotspots. Haplotypes are typically longer in the CEU and CHB+JPT analysis panels than in YRI. The physical locations of the six classical HLA loci analyzed in this study are also shown.

Figure 2

Allelic association between SNPs across the 7.5 Mb extended MHC region and HLA types at each gene for the combined population data, using the 5,754 SNPs that were typed in all populations and are polymorphic across the combined population samples (see Methods for details). (a) The extent of association between SNPs across the region and HLA types at HLA-A (red), HLA-C (light green), HLA-B (dark green), HLA-DRB (blue), HLA-DQA (violet) and HLA-DQB (purple), as measured by relative information in the combined population data. The significant information contained within SNPs located outside the HLA genes is not surprising given the extensive LD between SNPs and HLA loci. LD extended up to 1 Mb from the centre of a given HLA gene and, as a consequence, a single SNP could be informative for more than one HLA gene. (b) For HLA-C (the position of which is indicated by the vertical blue line), the position of SNPs across the 7.5 Mb region showing weak (0.2 < r² < 0.5; grey), moderate (0.5 < r² < 0.8; blue) and strong (r² > 0.8; red) association to each type that is present in each of the four populations. The size of the adjacent green bar indicates the relative frequency of each type in each population (types not present in a population are not shown).
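The r² bins used in panel (b) can be illustrated with a short sketch. It assumes phased haplotypes in which both the SNP allele and the HLA type are coded as 0/1 presence indicators; the function names, the coding, and the handling of values that fall exactly on a bin boundary are assumptions made for illustration, not the procedure described in the paper's Methods.

```python
def r_squared(snp, hla):
    """r^2 between two 0/1 indicator vectors observed on the same haplotypes.

    snp[i] is 1 if haplotype i carries the SNP allele of interest,
    hla[i] is 1 if haplotype i carries the HLA type of interest.
    """
    n = len(snp)
    if n == 0 or n != len(hla):
        raise ValueError("inputs must be non-empty and of equal length")
    p_a = sum(snp) / n                      # frequency of the SNP allele
    p_b = sum(hla) / n                      # frequency of the HLA type
    p_ab = sum(1 for s, h in zip(snp, hla) if s == 1 and h == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                          # one site is monomorphic
    d = p_ab - p_a * p_b                    # linkage disequilibrium coefficient D
    return d * d / denom


def association_class(r2):
    """Map r^2 to the caption's bins (boundary handling is an assumption)."""
    if r2 > 0.8:
        return "strong"
    if r2 > 0.5:
        return "moderate"
    if r2 > 0.2:
        return "weak"
    return "none"


# Toy data: 8 haplotypes, SNP allele on 3 of them, HLA type on 2 of those 3.
snp = [1, 1, 1, 0, 0, 0, 0, 0]
hla = [1, 1, 0, 0, 0, 0, 0, 0]
print(association_class(r_squared(snp, hla)))
```

In this toy example the SNP tags the HLA type imperfectly, so it falls in the moderate bin; a SNP found on exactly the same haplotypes as the HLA type would give r² = 1.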

Figure 3

The evolutionary history of HLA-C. (a) Estimated evolutionary tree showing relationships among haplotypes at the HLA-C locus (defined as position 31,341,277 in build 34 or between SNPs rs2853950 and rs2001181) with mutations (blue circles) that unambiguously determine clades in the tree (see Methods for details). Below is a plot of the 478 haplotypes observed in the 100 kb region surrounding HLA-C (each column is a single haplotype) with the less common allele shown in the darker color. Colors indicate the HLA-C allele carried on each haplotype for the six most common alleles (each seen at least 30 times in the combined populations), with the position of HLA-C indicated by the arrow. The colored bar below indicates the population origin of each haplotype (YRI: green, CEU: orange and CHB+JPT: purple). Some alleles, such as HLA-C*0702 (green), cluster within the tree, whereas others, such as HLA-C*0701 (yellow), occur in two or more parts of the tree. Furthermore, the two clades representing HLA-C*0701 are at different frequencies in the four populations. (b) Long-range haplotype structure around alleles C*0702 and C*0701. For C*0702, the common long-range haplotype is shared between the two populations, CEU and YRI, and is accordingly associated with a single clade. In contrast, for C*0701, the long-range haplotypes that are common in CEU and YRI are divergent. A shared haplotype structure near the HLA allele suggests that the allele had a common origin in the two populations. A recombination event, however, likely occurred in at least one of the populations, placing the haplotypes in two different clades.

Figure 4

The genetic distance over which the long-range haplotype associated with each allele for each SNP on chromosome 6 extends (before decaying to an EHH of 0.8) in each of the four populations (see Methods for details). The blue dot represents the average extent of long-range haplotypes for SNP alleles in 20 different frequency bins (0%-5%, 5%-10%, etc.), with the 95% confidence interval represented by a black line. HLA alleles above the 95% confidence interval are represented by red diamonds.
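As a rough illustration of the quantity plotted in Figure 4, the sketch below computes extended haplotype homozygosity (EHH) using its standard definition: the probability that two haplotypes drawn at random from those carrying a core allele are identical over the interval from the core to a given marker. It then reports how far the haplotype extends before EHH drops below 0.8. The function names, the string coding of haplotypes, the hypothetical marker positions in cM, and the choice to scan in one direction only are all assumptions for illustration, not the paper's implementation.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """EHH over the markers between index `core` and index `end` (inclusive).

    `haplotypes` holds only chromosomes that carry the core allele,
    each given as a string of alleles, one character per marker.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    lo, hi = sorted((core, end))
    # Group chromosomes that are identical across the whole interval.
    groups = Counter(h[lo:hi + 1] for h in haplotypes)
    # Fraction of chromosome pairs that fall into the same group.
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


def haplotype_extent(haplotypes, core, positions_cm, threshold=0.8):
    """Genetic distance covered to the right of the core before EHH < threshold."""
    reach = 0.0
    for j in range(core + 1, len(positions_cm)):
        if ehh(haplotypes, core, j) < threshold:
            break
        reach = positions_cm[j] - positions_cm[core]
    return reach


# Toy data: four chromosomes carrying the core allele at marker 0.
haps = ["0000", "0000", "0000", "0001"]
positions_cm = [0.0, 0.1, 0.2, 0.3]   # hypothetical genetic map positions
print(haplotype_extent(haps, 0, positions_cm))
```

In the toy data all four chromosomes are identical up to the third marker, so EHH stays at 1 over that span and falls to 0.5 once the fourth marker is included, which ends the haplotype at the third marker.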
