The portability of tagSNPs across populations: A worldwide survey

  1. Anna González-Neira1,5,
  2. Xiayi Ke2,
  3. Oscar Lao1,
  4. Francesc Calafell1,
  5. Arcadi Navarro1,
  6. David Comas1,
  7. Howard Cann3,
  8. Suzannah Bumpstead4,
  9. Jilur Ghori4,
  10. Sarah Hunt4,
  11. Panos Deloukas4,
  12. Ian Dunham4,
  13. Lon R. Cardon2, and
  14. Jaume Bertranpetit1,6

  1 Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
  2 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, United Kingdom
  3 Fondation Jean-Dausset, Centre d'Étude du Polymorphisme Humain (CEPH), 75010 Paris, France
  4 The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1HH, United Kingdom

Abstract

In the search for common genetic variants that contribute to prevalent human diseases, patterns of linkage disequilibrium (LD) among linked markers should be considered when selecting SNPs. Genotyping efficiency can be increased by choosing tagging SNPs (tagSNPs) in LD with other SNPs. However, it remains to be seen whether tagSNPs defined in one population efficiently capture LD in other populations; that is, how portable tagSNPs are. Indeed, tagSNP portability is a challenge for the applicability of HapMap results. We analyzed 144 SNPs in a 1-Mb region of chromosome 22 in 1055 individuals from 38 worldwide populations, classified into seven continental groups. We measured tagSNP portability by choosing three reference populations (to approximate the three HapMap populations), defining tagSNPs in each, and applying them to the other populations, independently of whether information on those tagSNPs was available in the compared population. We found that tagSNPs are highly informative in other populations within each continental group. Moreover, tagSNPs defined in Europeans are often efficient for Middle Eastern and Central/South Asian populations. TagSNPs defined in the three reference populations are also efficient for more distant and differentiated populations (Oceania, Americas), in which the impact of their special demographic history on the genetic structure does not interfere with successfully detecting the most common haplotype variation. This high degree of portability lends promise to the search for disease association in different populations, once tagSNPs are defined in a few reference populations like those analyzed in the HapMap initiative.
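The idea of defining tagSNPs in a reference population and testing them elsewhere can be illustrated with a small sketch. The abstract does not specify the tagging algorithm or the efficiency measure used in the study, so everything below is an assumption made for illustration: a greedy pairwise tagger, r² computed from genotype dosages (0/1/2) rather than from phased haplotypes, a threshold of r² ≥ 0.8, the function names `r2`, `greedy_tags`, and `portability`, and the toy genotype data.

```python
from itertools import combinations


def r2(x, y):
    """Squared Pearson correlation between two genotype-dosage vectors (0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic SNP: correlation is undefined
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def greedy_tags(genotypes, threshold=0.8):
    """Pick tagSNPs greedily (assumed method, not the study's).

    genotypes: list of SNPs, each a list of dosages, one per individual.
    Returns indices of tagSNPs such that every SNP has r2 >= threshold
    with at least one tag in this population.
    """
    m = len(genotypes)
    mat = [[1.0] * m for _ in range(m)]  # a SNP always tags itself
    for i, j in combinations(range(m), 2):
        mat[i][j] = mat[j][i] = r2(genotypes[i], genotypes[j])
    untagged = set(range(m))
    tags = []
    while untagged:
        # Choose the SNP covering the most still-untagged SNPs;
        # ties go to the lowest index.
        best = max(sorted(untagged),
                   key=lambda i: sum(mat[i][j] >= threshold for j in untagged))
        tags.append(best)
        untagged -= {j for j in untagged if mat[best][j] >= threshold}
    return tags


def portability(tags, target_genotypes, threshold=0.8):
    """Fraction of SNPs in the target population captured by the given tags."""
    captured = 0
    for j, snp in enumerate(target_genotypes):
        if j in tags or any(r2(target_genotypes[t], snp) >= threshold for t in tags):
            captured += 1
    return captured / len(target_genotypes)


# Toy data (hypothetical): 4 SNPs x 6 individuals per population.
reference = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # perfect LD with SNP 0
    [2, 2, 0, 0, 1, 1],
    [0, 0, 2, 2, 1, 1],  # perfect LD with SNP 2
]
target = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 2, 0, 0, 1, 1],
    [0, 2, 1, 1, 0, 2],  # LD with SNP 2 is broken in this population
]

tags = greedy_tags(reference)
print(tags)                          # [0, 2]
print(portability(tags, reference))  # 1.0
print(portability(tags, target))     # 0.75
```

In this toy case the two tags chosen in the reference population capture all four SNPs there but only three of four in the target, because one LD relationship does not hold in the target. A real analysis would typically work from phased haplotypes and restrict attention to SNPs above a minor-allele-frequency cutoff.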

Footnotes