A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC
Nat Genet. 2006 Oct;38(10):1166-72.
doi: 10.1038/ng1885. Epub 2006 Sep 24.
Paul I W de Bakker, Gil McVean, Pardis C Sabeti, Marcos M Miretti, Todd Green, Jonathan Marchini, Xiayi Ke, Alienke J Monsuur, Pamela Whittaker, Marcos Delgado, Jonathan Morrison, Angela Richardson, Emily C Walsh, Xiaojiang Gao, Luana Galver, John Hart, David A Hafler, Margaret Pericak-Vance, John A Todd, Mark J Daly, John Trowsdale, Cisca Wijmenga, Tim J Vyse, Stephan Beck, Sarah Shaw Murray, Mary Carrington, Simon Gregory, Panos Deloukas, John D Rioux
- PMID: 16998491
- PMCID: PMC2670196
- DOI: 10.1038/ng1885
Paul I W de Bakker et al. Nat Genet. 2006 Oct.
Abstract
The proteins encoded by the classical HLA class I and class II genes in the major histocompatibility complex (MHC) are highly polymorphic and are essential in self versus non-self immune recognition. HLA variation is a crucial determinant of transplant rejection and susceptibility to a large number of infectious and autoimmune diseases. Yet identification of causal variants is problematic owing to linkage disequilibrium that extends across multiple HLA and non-HLA genes in the MHC. We therefore set out to characterize the linkage disequilibrium patterns between the highly polymorphic HLA genes and background variation by typing the classical HLA genes and >7,500 common SNPs and deletion-insertion polymorphisms across four population samples. The analysis provides informative tag SNPs that capture much of the common variation in the MHC region and that could be used in disease association studies, and it provides new insight into the evolutionary dynamics and ancestral origins of the HLA loci and their haplotypes.
Figures
Figure 1
The relationship between recombination rates and haplotype structure spanning the 7.5 Mb extended MHC region (defined by the SLC17A2 gene at the telomeric end to the DAXX gene at the centromeric end of chromosome 6). Recombination rates (blue lines in cM/Mb) were estimated separately from each population and combined to provide a single estimate for the region. Recombination hotspots with strong statistical support in all three (in red) or two analysis panels (in pink) are indicated by the red triangles and vertical grey lines. The average recombination rate across the region is 0.44 cM/Mb, compared to a genome-wide average of 1.2 cM/Mb, and is particularly low in the 3 Mb region that includes the olfactory receptor gene cluster which has only two hotspots with strong statistical evidence. The horizontal lines indicate the extent of non-redundant haplotypes (see text for details) identified in each analysis panel (YRI: green, CEU: orange and CHB+JPT: purple). Haplotypes are typically longer in regions of low recombination, and are often, though not always, interrupted by recombination hotspots. Haplotypes are typically longer in the CEU and CHB+JPT analysis panels than in YRI. The physical locations of the six classical HLA loci analyzed in this study are also shown.
Figure 2
Allelic association between SNPs across the 7.5 Mb extended MHC region and HLA types at each gene for the combined population data using the 5,754 SNPs that were typed in all populations and are polymorphic across the combined population samples (see Methods for details). (a) The extent of association between SNPs across the region and HLA types at HLA-A (red), HLA-C (light green), HLA-B (dark green), HLA-DRB (blue), HLA-DQA (violet), HLA-DQB (purple) as measured by relative information in the combined population data. The significant information contained within these SNPs located outside the HLA genes is not surprising given the extensive LD between SNPs and HLA loci. LD extended up to 1 Mb from the centre of a given HLA gene and, as a consequence, a single SNP could be informative for more than a single HLA gene. (b) For HLA-C (the position of which is indicated by the vertical blue line), the position of SNPs across the 7.5 Mb region showing weak (0.2 < r² < 0.5; grey), moderate (0.5 < r² < 0.8; blue) and strong (r² > 0.8; red) association to each type that is present in each of the four populations. The size of the adjacent green bar indicates the relative frequency of each type in each population (types not present in a population are not shown).
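The r² classes used in panel (b) can be illustrated with a minimal sketch. It assumes phased haplotypes coded as 0/1 vectors (1 = carries the SNP allele, or carries the HLA type in question); the function names `r_squared` and `classify` are hypothetical and are not taken from the paper's Methods. The caption does not say how values exactly equal to 0.5 or 0.8 are binned, so the sketch assigns them to the lower class as an assumption.

```python
def r_squared(snp, hla):
    """Squared correlation (r^2) between two equal-length 0/1 haplotype vectors."""
    n = len(snp)
    if n == 0 or n != len(hla):
        raise ValueError("vectors must be non-empty and of equal length")
    p_a = sum(snp) / n                      # frequency of the SNP allele
    p_b = sum(hla) / n                      # frequency of the HLA type
    p_ab = sum(1 for s, h in zip(snp, hla) if s == 1 and h == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                          # a monomorphic site carries no information
    d = p_ab - p_a * p_b                    # linkage disequilibrium coefficient D
    return d * d / denom


def classify(r2):
    """Bin r^2 using the caption's thresholds (boundary handling is an assumption)."""
    if r2 > 0.8:
        return "strong"
    if r2 > 0.5:
        return "moderate"
    if r2 > 0.2:
        return "weak"
    return "none"
```

For example, a SNP allele that sits on exactly the haplotypes carrying a given HLA type has r² = 1 and would be drawn in red, whereas a SNP allele distributed independently of the type has r² = 0 and would not be drawn.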
Figure 3
The evolutionary history of HLA-C. (a) Estimated evolutionary tree showing relationships among haplotypes at the HLA-C locus (defined as position 31,341,277 in build 34 or between SNPs rs2853950 and rs2001181) with mutations (blue circles) that unambiguously determine clades in the tree (see Methods for details). Below is a plot of the 478 haplotypes observed in the 100 kb region surrounding HLA-C (each column is a single haplotype) with the less common allele shown in the darker color. Colors indicate the HLA-C allele carried on each haplotype for the six most common alleles (each seen at least 30 times in the combined populations), with the position of HLA-C indicated by the arrow. The colored bar below indicates the population origin of each haplotype (YRI: green, CEU: orange and CHB+JPT: purple). Some alleles such as HLA-C*0702 (green) cluster within the tree whereas others such as HLA-C*0701 (yellow) occur in two or more parts of the tree. Furthermore, the two clades representing HLA-C*0701 are at different frequencies in the four populations. (b) Long-range haplotype structure around alleles C*0702 and C*0701. For C*0702, the common long-range haplotype is shared between the two populations, CEU and YRI, and is accordingly associated with a single clade. In contrast, for C*0701, the long-range haplotypes that are common in CEU and YRI are divergent. A shared haplotype structure near the HLA allele suggests that the allele had a common origin in the two populations. A recombination event, however, likely occurred in at least one of the populations, placing the haplotypes in two different clades.
Figure 4
The genetic distance over which the long-range haplotype associated with each allele for each SNP on chromosome 6 extends (before decaying to an EHH of 0.8) in each of the four populations (see Methods for details). The blue dot represents the average extent of long-range haplotypes for SNP alleles in 20 different frequency bins (0%-5%, 5%-10%, etc.), with the 95% confidence interval represented by a black line. HLA alleles above the 95% confidence interval are represented by red diamonds.
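The extent measure in this caption can be sketched as follows. EHH (extended haplotype homozygosity) is the probability that two chromosomes drawn at random from those carrying a core allele are identical at every marker out to a given distance. The code below is a simplified, one-directional sketch under stated assumptions, not the procedure from the paper's Methods: `haplotypes` holds only chromosomes carrying the core allele, markers are ordered outward from the core, `distances_cm` gives each marker's genetic distance from the core, and the extent is taken as the distance of the last marker at which EHH is still at or above the threshold (no interpolation). The function names are hypothetical.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, k):
    """EHH over the first k markers outward from the core allele."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two carrier haplotypes")
    # Group carrier chromosomes by their allele string over the first k markers.
    counts = Counter(tuple(h[:k]) for h in haplotypes)
    # Fraction of chromosome pairs that are identical over those k markers.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


def extent(haplotypes, distances_cm, threshold=0.8):
    """Genetic distance reached before EHH first drops below the threshold."""
    reach = 0.0
    for k, dist in enumerate(distances_cm, start=1):
        if ehh(haplotypes, k) < threshold:
            break
        reach = dist
    return reach
```

With five carrier haplotypes of which one diverges at the third marker, EHH falls from 1 to 6/10 at that marker, so the extent under this definition is the distance of the second marker.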
Similar articles
- HLA and SNP haplotype mapping in the Japanese population. Kitajima H, Sonoda M, Yamamoto K. Genes Immun. 2012 Oct;13(7):543-8. doi: 10.1038/gene.2012.35. Epub 2012 Aug 23. PMID: 22914434
- Genetic fixity in the human major histocompatibility complex and block size diversity in the class I region including HLA-E. Romero V, Larsen CE, Duke-Cohan JS, Fox EA, Romero T, Clavijo OP, Fici DA, Husain Z, Almeciga I, Alford DR, Awdeh ZL, Zuñiga J, El-Dahdah L, Alper CA, Yunis EJ. BMC Genet. 2007 Apr 12;8:14. doi: 10.1186/1471-2156-8-14. PMID: 17430593. Free PMC article.
- Imputing amino acid polymorphisms in human leukocyte antigens. Jia X, Han B, Onengut-Gumuscu S, Chen WM, Concannon PJ, Rich SS, Raychaudhuri S, de Bakker PI. PLoS One. 2013 Jun 6;8(6):e64683. doi: 10.1371/journal.pone.0064683. Print 2013. PMID: 23762245. Free PMC article.
- HLA Genetics for the Human Diseases. Shiina T, Kulski JK. Adv Exp Med Biol. 2024;1444:237-258. doi: 10.1007/978-981-99-9781-7_16. PMID: 38467984. Review.
- Genetics of diabetes. Trans-racial gene mapping studies. Mijovic CH, Barnett AH, Todd JA. Baillieres Clin Endocrinol Metab. 1991 Jun;5(2):321-40. doi: 10.1016/s0950-351x(05)80130-2. PMID: 1892469. Review.
Cited by
- Inference of Host-Pathogen Interaction Matrices from Genome-Wide Polymorphism Data. Märkle H, John S, Metzger L; STOP-HCV Consortium; Ansari MA, Pedergnana V, Tellier A. Mol Biol Evol. 2024 Sep 4;41(9):msae176. doi: 10.1093/molbev/msae176. PMID: 39172738. Free PMC article.
- Cardiovascular Autonomic Deficits in Different Types of Achalasia. Anil A, Netam RK, Roy A, Chandran DS, Jaryal AK, Makharia GK, Parshad R, Deepak KK. Cureus. 2024 May 1;16(5):e59444. doi: 10.7759/cureus.59444. eCollection 2024 May. PMID: 38826939. Free PMC article.
- Disentangling the heterogeneity of multiple sclerosis through identification of independent neuropathological dimensions. de Boer A, van den Bosch AMR, Mekkes NJ, Fransen NL, Dagkesamanskaia E, Hoekstra E, Hamann J, Smolders J, Huitinga I, Holtman IR. Acta Neuropathol. 2024 May 21;147(1):90. doi: 10.1007/s00401-024-02742-w. PMID: 38771530. Free PMC article.
- Fine mapping identifies independent HLA associations in autoimmune hepatitis type 1. Li Y, Zhou L, Huang Z, Yang Y, Zhang J, Yang L, Xu Y, Shi J, Tang S, Yuan X, Xu J, Li Y, Han X, Li J, Liu Y, Sun Y, Jin X, Xiao X, Wang B, Lin Q, Zhou Y, Song X, Cui Y, Hu L, Song Y, Bao J, Gong L, Gershwin ME, Zuo X, Yan H, Zou Z, Tang R, Ma X; Chinese AIH Consortium. JHEP Rep. 2023 Oct 5;6(1):100926. doi: 10.1016/j.jhepr.2023.100926. eCollection 2024 Jan. PMID: 38089552. Free PMC article.
- Phenome-wide association study on miRNA-related sequence variants: the UK Biobank. Mustafa R, Ghanbari M, Karhunen V, Evangelou M, Dehghan A. Hum Genomics. 2023 Nov 24;17(1):104. doi: 10.1186/s40246-023-00553-w. PMID: 37996941. Free PMC article.
Grants and funding
- G9800943/MRC_/Medical Research Council/United Kingdom
- 077011/WT_/Wellcome Trust/United Kingdom
- U19 AI050864/AI/NIAID NIH HHS/United States
- ImNIH/Intramural NIH HHS/United States
- WT_/Wellcome Trust/United Kingdom
- N01CO12400/CA/NCI NIH HHS/United States
- N01-CO-12400/CO/NCI NIH HHS/United States