Mutation history of the Roma/Gypsies

Am J Hum Genet. 2004 Oct;75(4):596-609.

doi: 10.1086/424759. Epub 2004 Aug 20.

Bharti Morar, David Gresham, Dora Angelicheva, Ivailo Tournev, Rebecca Gooding, Velina Guergueltcheva, Carolin Schmidt, Angela Abicht, Hanns Lochmuller, Attila Tordai, Lajos Kalmar, Melinda Nagy, Veronika Karcagi, Marc Jeanpierre, Agnes Herczegfalvi, David Beeson, Viswanathan Venkataraman, Kim Warwick Carter, Jeff Reeve, Rosario de Pablo, Vaidutis Kucinskas, Luba Kalaydjieva

Bharti Morar et al. Am J Hum Genet. 2004 Oct.

Abstract

The 8-10 million European Roma/Gypsies are a founder population of common origins that has subsequently split into multiple socially divergent and geographically dispersed Gypsy groups. Unlike other founder populations, whose genealogy has been extensively documented, the demographic history of the Gypsies is not fully understood and, given the lack of written records, has to be inferred from current genetic data. In this study, we have used five disease loci harboring private Gypsy mutations to examine some missing historical parameters and current structure. We analyzed the frequency distribution of the five mutations in 832-1,363 unrelated controls, representing 14 Gypsy populations, and the diversification of chromosomal haplotypes in 501 members of affected families. Sharing of mutations and high carrier rates supported a strong founder effect, and the identity of the congenital myasthenia 1267delG mutation in Gypsy and Indian/Pakistani chromosomes provided the best evidence yet of the Indian origins of the Gypsies. However, dramatic differences in mutation frequencies and haplotype divergence and very limited haplotype sharing pointed to strong internal differentiation and characterized the Gypsies as a founder population comprising multiple subisolates. Using disease haplotype coalescence times at the different loci, we estimated that the entire Gypsy population was founded approximately 32-40 generations ago, with secondary and tertiary founder events occurring approximately 16-25 generations ago. The existence of multiple subisolates, with endogamy maintained to the present day, suggests a general approach to complex disorders in which initial gene mapping could be performed in large families from a single Gypsy group, whereas fine mapping would rely on the informed sampling of the divergent subisolates and searching for the shared genomic region that displays the strongest linkage disequilibrium with the disease.
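To put the generation counts in the abstract on a calendar scale, the following minimal sketch converts generations to years. The generation length of 25 years is an assumption made only for illustration; the abstract does not state one, and the helper name `generations_to_years` is hypothetical.

```python
# Hypothetical helper: convert a coalescence time in generations to years.
# ASSUMPTION: 25 years per generation (not a value given in the abstract).

def generations_to_years(generations, years_per_generation=25):
    """Return the elapsed time in years for a given number of generations."""
    return generations * years_per_generation


# Generation ranges quoted in the abstract.
founding = (32, 40)    # founding of the entire population
secondary = (16, 25)   # secondary and tertiary founder events

for label, (low, high) in (("founding", founding),
                           ("secondary/tertiary", secondary)):
    print(label, generations_to_years(low), "to",
          generations_to_years(high), "years ago")
```

A different assumed generation length rescales both ranges proportionally, so the ordering of the events is unaffected by the choice.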

Figures

Figure 1

Assignment of participating subjects to migrational/linguistic categories and individual Gypsy groups within the categories. The Xoroxane and Dassikane (indicated with an asterisk [*]) do not specify individual Gypsy groups but represent broader cultural anthropological divisions within the Balkan category, in which group identity has been lost. The western European migrational category comprises subjects from Hungary, Slovenia, the Czech Republic, Lithuania, Germany, France, Italy, Spain, and Portugal.

Figure 2

Genetic maps used in the haplotype analysis of four of the founder mutations. Markers are shown as equidistant from one another. The position of the mutation is indicated with a black square. Markers designated with an “S” number are known microsatellites found in public databases. Microsatellites identified during positional cloning studies include SLAP (NDRG1 region), 23090ta1, 1908ca1, 21594at1, 68530gt1, 68530ca1, and 68530gt2 (CTDP1 region). LOC125267, LOC125261, DIM1, and PAR6 in this region represent insertion/deletion polymorphisms. Markers in boldface type were included in the comparative analysis of regions spanning ∼3 cM around each disease mutation. A, CHRNE on 17p13.2. B, NDRG1 on 8q24.22. C, SGCG on 13q12.12. D, CTDP1 on 18qter.

Figure 3

Neighbor-joining tree of Gypsy groups, based on the gene frequency data obtained from the population screening of the five disease mutations and Nei’s standard genetic distance (Nei 1987). The tree was generated using the DISPAN package (Ota 1993), on the basis of the neighbor-joining method (Saitou and Nei 1987). The numbers indicate the robustness of the branches in the tree, assessed with a bootstrap approach.
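As a rough illustration of the distance measure named in this caption, the sketch below computes Nei's standard genetic distance, D = -ln(J_xy / sqrt(J_x J_y)), from allele frequencies at several loci. It omits the sample-size bias correction and is not the DISPAN implementation. The function names and the mutation frequencies in the example are hypothetical, not values from the study.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists of loci; each locus is a list of allele
    frequencies, with alleles in the same order in both populations.
    """
    n_loci = len(pop_x)
    # Average homozygosity-like terms over loci.
    j_x = sum(sum(p * p for p in locus) for locus in pop_x) / n_loci
    j_y = sum(sum(q * q for q in locus) for locus in pop_y) / n_loci
    j_xy = sum(
        sum(p * q for p, q in zip(locus_x, locus_y))
        for locus_x, locus_y in zip(pop_x, pop_y)
    ) / n_loci
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


def biallelic(mutant_freq):
    """Represent a disease locus as [mutant, normal] allele frequencies."""
    return [mutant_freq, 1.0 - mutant_freq]


# HYPOTHETICAL mutant-allele frequencies at five loci in two groups.
group_a = [biallelic(q) for q in (0.02, 0.05, 0.00, 0.01, 0.03)]
group_b = [biallelic(q) for q in (0.00, 0.01, 0.04, 0.02, 0.00)]

print(nei_standard_distance(group_a, group_b))
```

A matrix of such pairwise distances is the input that a neighbor-joining procedure would then cluster into a tree.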

Figure 4

Networks showing the genealogical relationships of disease haplotypes. The size of the symbols is proportional to the frequency of each haplotype. A, CMS haplotypes in the CHRNE region. B, HMSNL haplotypes in the NDRG1 region. C, HMSNL haplotypes in Gypsy groups from the Vlax migrational category. D, CCFDN haplotypes in the CTDP1 region. E, LGMD2C haplotypes in the SGCG region. For CMS and HMSNL, only haplotype groups comprising more than five unique haplotypes are shown.

Figure 5

LD estimated by p_excess (Hastbacka et al. 1992, 1994) for markers surrounding the founder disease mutations. Markers are shown as equidistant from one another. Intermarker distances are shown in figures 2 and 6. The arrows indicate the approximate location of the disease mutation. A, LD in the CMS gene region in the three migrational categories. B, LD in the HMSNL region in the three migrational categories. C, LD in the HMSNL region in Gypsy groups from the Vlax migrational category. D, LD in the CCFDN region for geographically separated Rudari groups.
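The excess-frequency measure cited in this caption is commonly written as (p_disease - p_normal) / (1 - p_normal), where the two terms are the frequencies of the associated marker allele on disease and normal chromosomes. The sketch below assumes that standard form; the function name and the example frequencies are illustrative, not data from the figure.

```python
def p_excess(p_disease, p_normal):
    """Excess frequency of a marker allele on disease chromosomes.

    p_disease: frequency of the associated allele on disease chromosomes.
    p_normal:  frequency of the same allele on normal chromosomes.
    """
    if not 0.0 <= p_disease <= 1.0:
        raise ValueError("p_disease must lie in [0, 1]")
    if not 0.0 <= p_normal < 1.0:
        raise ValueError("p_normal must lie in [0, 1)")
    return (p_disease - p_normal) / (1.0 - p_normal)


# HYPOTHETICAL frequencies: allele on 60% of disease chromosomes
# and on 20% of normal chromosomes.
print(p_excess(0.6, 0.2))
```

The value is 1 when every disease chromosome carries the allele and 0 when the allele is no more common on disease chromosomes than on normal ones.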

Figure 6

Pairwise LD between markers in the HMSNL and CCFDN disease gene regions in four Gypsy groups. A, Rudari. B, Kalderash. C, Lom. D, Turgovzi. Marker loci are listed across the bottom and down the left side. A scale indicating physical distances (in kb) between markers is shown at the bottom. D′ values are shown in the upper left-hand triangle, and P values are shown in the lower right-hand triangle. The shading patterns used to illustrate the scale of D′ and P values are shown above and at the right-hand side of the top panel. P values <.05 have been corrected for multiple comparisons through use of the Holm-Sidak approach (Lautenberger et al. 2000).
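For readers unfamiliar with the multiple-comparison correction named in this caption, here is a minimal sketch of the usual step-down Holm-Sidak adjustment. It assumes the textbook form of the procedure and may differ in detail from what Lautenberger et al. applied. The function name and the example P values are hypothetical.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak step-down adjusted P values, in input order."""
    m = len(p_values)
    # Indices of the P values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak adjustment over the (m - rank) hypotheses still in play.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# HYPOTHETICAL raw P values for three marker pairs.
print(holm_sidak([0.01, 0.04, 0.03]))
```

A marker pair would then be treated as significant when its adjusted value stays below the chosen threshold, such as .05.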

Electronic-Database Information

    1. National Center for Biotechnology Information Genome View, http://www.ncbi.nlm.nih.gov/mapview/map_search.cgi?taxid=9606 (for the deCODE genetic map)
    2. Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/ (for HMSNL, CCFDN, CMS, LGMD2C, and galactokinase deficiency)

References

    1. Abecasis GR, Cookson WO (2000) GOLD—graphical overview of linkage disequilibrium. Bioinformatics 16:182–183. doi:10.1093/bioinformatics/16.2.182
    2. Abicht A, Stucka R, Karcagi V, Herczegfalvi A, Horvath R, Mortier W, Schara U, Ramaekers V, Jost W, Brunner J, Janssen G, Seidel U, Schlotter B, Muller-Felber W, Pongratz D, Rudel R, Lochmuller H (1999) A common mutation (epsilon1267delG) in congenital myasthenic patients of Gypsy ethnic origin. Neurology 53:1564–1569
    3. Agarwala R, Biesecker LG, Tomlin JF, Schaffer AA (1999) Towards a complete North American Anabaptist genealogy: a systematic approach to merging partially overlapping genealogy resources. Am J Med Genet 86:156–161. doi:10.1002/(SICI)1096-8628(19990910)86:2<156::AID-AJMG13>3.0.CO;2-5
    4. Angelicheva D, Turnev I, Dye D, Chandler D, Thomas PK, Kalaydjieva L (1999) Congenital cataracts facial dysmorphism neuropathy (CCFDN) syndrome: a novel developmental disorder in Gypsies maps to 18qter. Eur J Hum Genet 7:560–566
    5. Angius A, Bebbere D, Petretto E, Falchi M, Forabosco P, Maestrale B, Casu G, Persico I, Melis PM, Pirastu M (2002) Not all isolates are equal: linkage disequilibrium analysis on Xq13.3 reveals different patterns in Sardinian sub-populations. Hum Genet 111:9–15. doi:10.1007/s00439-002-0753-z
