Complete MHC Haplotype Sequencing for Common Disease Gene Mapping

  1. C. Andrew Stewart (2,7)
  2. Roger Horton (1,7)
  3. Richard J.N. Allcock (2,8)
  4. Jennifer L. Ashurst (1)
  5. Alexey M. Atrazhev (3)
  6. Penny Coggill (1)
  7. Ian Dunham (1)
  8. Simon Forbes (1,2)
  9. Karen Halls (1)
  10. Joanna M.M. Howson (5)
  11. Sean J. Humphray (1)
  12. Sarah Hunt (1)
  13. Andrew J. Mungall (1)
  14. Kazutoyo Osoegawa (4)
  15. Sophie Palmer (1)
  16. Anne N. Roberts (5)
  17. Jane Rogers (1)
  18. Sarah Sims (1)
  19. Yu Wang (4)
  20. Laurens G. Wilming (1)
  21. John F. Elliott (3)
  22. Pieter J. de Jong (4)
  23. Stephen Sawcer (6)
  24. John A. Todd (5)
  25. John Trowsdale (2)
  26. Stephan Beck (1,9)

1 Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
2 Department of Pathology, Immunology Division, University of Cambridge, Cambridge CB2 1QP, United Kingdom
3 Department of Medical Microbiology and Immunology, University of Alberta, Edmonton, AB T6G 2S2, Canada
4 Children's Hospital Oakland Research Institute, Oakland, California 94609-1673, USA
5 JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Cambridge CB2 2XY, United Kingdom
6 University of Cambridge, Neurology Unit, Addenbrooke's Hospital, Cambridge CB2 2QQ, United Kingdom

Abstract

The future systematic mapping of variants that confer susceptibility to common diseases requires the construction of a fully informative polymorphism map. Ideally, every base pair of the genome would be sequenced in many individuals. Here, we report 4.75 Mb of contiguous sequence for each of two common haplotypes of the major histocompatibility complex (MHC), to which susceptibility to >100 diseases has been mapped. The autoimmune disease-associated haplotypes HLA-A3-B7-Cw7-DR15 and HLA-A1-B8-Cw7-DR3 were sequenced in their entirety through a bacterial artificial chromosome (BAC) cloning strategy using the consanguineous cell lines PGF and COX, respectively. The two sequences were annotated to encompass all described splice variants of expressed genes. We defined the complete variation content of the two haplotypes, revealing >18,000 variations between them. Average SNP densities ranged from less than one SNP per kilobase to >60 SNPs per kilobase. Acquisition of complete and accurate sequence data over polymorphic regions such as the MHC from large-insert cloned DNA provides a definitive resource for the construction of informative genetic maps, and avoids the limitations posed by chromosomal regions that are refractory to PCR amplification.
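
As an illustration of the SNP density measure quoted in the abstract, the following minimal sketch counts variant positions in consecutive windows and reports SNPs per kilobase. The function name `snp_density_per_kb`, the window size, and the toy coordinates are assumptions made for this example; they are not the authors' pipeline or data. For scale, dividing the reported >18,000 variations by 4.75 Mb gives a rough overall average of at least 3.8 variations per kilobase, although that figure covers all variation types and not only SNPs.

```python
from collections import Counter


def snp_density_per_kb(positions, region_length, window=1000):
    """Return SNPs per kilobase for each consecutive window of a region.

    positions: 0-based coordinates of variant sites (hypothetical input).
    region_length: total length of the region in base pairs.
    window: window size in base pairs.
    """
    if window <= 0 or region_length <= 0:
        raise ValueError("window and region_length must be positive")
    n_windows = (region_length + window - 1) // window  # ceiling division
    # Count variants per window index, ignoring positions outside the region.
    counts = Counter(p // window for p in positions if 0 <= p < region_length)
    densities = []
    for i in range(n_windows):
        # The last window may be shorter than `window`; normalise by its true size.
        size = min(window, region_length - i * window)
        densities.append(counts.get(i, 0) * 1000 / size)
    return densities


# Toy coordinates for illustration only; not taken from the paper.
toy_positions = [10, 250, 999, 1500, 2100, 2200, 2300, 2400]
print(snp_density_per_kb(toy_positions, region_length=3000))
```

Fixed, non-overlapping windows keep the sketch simple; a sliding window would give a smoother profile at the cost of correlated neighbouring values.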

Footnotes