Genes reveal traces of common recent demographic history for most of the Uralic-speaking populations

doi: 10.1186/s13059-018-1522-1.

Kristiina Tambets, Bayazit Yunusbayev, Georgi Hudjashov, Anne-Mai Ilumäe, Siiri Rootsi, Terhi Honkola, Outi Vesakoski, Quentin Atkinson, Pontus Skoglund, Alena Kushniarevich, Sergey Litvinov, Maere Reidla, Ene Metspalu, Lehti Saag, Timo Rantanen, Monika Karmin, Jüri Parik, Sergey I Zhadanov, Marina Gubina, Larisa D Damba, Marina Bermisheva, Tuuli Reisberg, Khadizhat Dibirova, Irina Evseeva, Mari Nelis, Janis Klovins, Andres Metspalu, Tõnu Esko, Oleg Balanovsky, Elena Balanovska, Elza K Khusnutdinova, Ludmila P Osipova, Mikhail Voevoda, Richard Villems, Toomas Kivisild, Mait Metspalu

Kristiina Tambets et al. Genome Biol. 2018.

Abstract

Background: The genetic origins of Uralic speakers from across a vast territory in the temperate zone of North Eurasia have remained elusive. Previous studies have shown contrasting proportions of Eastern and Western Eurasian ancestry in their mitochondrial and Y chromosomal gene pools. While the maternal lineages reflect by and large the geographic background of a given Uralic-speaking population, the frequency of Y chromosomes of Eastern Eurasian origin is distinctively high among European Uralic speakers. The autosomal variation of Uralic speakers, however, has not yet been studied comprehensively.

Results: Here, we present a genome-wide analysis of 15 Uralic-speaking populations which cover all main groups of the linguistic family. We show that contemporary Uralic speakers are genetically very similar to their local geographical neighbours. However, when studying relationships among geographically distant populations, we find that most of the Uralic speakers and some of their neighbours share a genetic component of possibly Siberian origin. Additionally, we show that most Uralic speakers share significantly more genomic segments identity-by-descent with each other than with geographically equidistant speakers of other languages. We find that correlated genome-wide genetic and lexical distances among Uralic speakers suggest co-dispersion of genes and languages. Yet, we do not find long-range genetic ties between Estonians and Hungarians with their linguistic sisters that would distinguish them from their non-Uralic-speaking neighbours.

Conclusions: We show that most Uralic speakers share a distinct ancestry component of likely Siberian origin, which suggests that the spread of Uralic languages involved at least some demic component.

Keywords: Genome-wide analysis; Haplotype analysis; IBD-segments; Population genetics; Uralic languages.

Conflict of interest statement

Ethics approval and consent to participate

DNA samples were obtained from unrelated volunteers. All donors provided informed consent, and all experiments were performed in accordance with the relevant guidelines and regulations of the involved countries. The research has been approved by the Research Ethics Committees of the University of Tartu and the Russian Academy of Sciences (approval nos. 228/M-40, 252/M-17, 17146-9217). Experimental methods of the study comply with the Helsinki Declaration.

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1

Geographic distribution of the Uralic-speaking populations and the schematic tree of the Uralic languages. a The geographic spread of the Uralic-speaking populations. Colour coding corresponds to the respective language in panel b. b Schematic representation of the phylogeny of the Uralic languages. Pie diagrams indicate the relative share of West and East Eurasian mitochondrial (mtDNA) and Y chromosomal (Y) lineages. Data from Additional file 5: Table S4 and Additional file 6: Table S5

Fig. 2

Principal component analysis (PCA) and genetic distances of Uralic-speaking populations. a PCA (PC1 vs PC2) of the Uralic-speaking populations (highlighted; population abbreviations are as in Additional file 1: Table S1). Values in brackets along the axes indicate the proportion of genetic variation explained by the components. b UPGMA tree of F_ST distances calculated based on autosomal genetic variation
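
As an illustration of the clustering behind panel b, the following is a minimal sketch of UPGMA (average-linkage clustering) on a pairwise distance matrix. The population labels and F_ST values are invented for the example and are not taken from the study; the function name `upgma` and the tuple-based tree format are assumptions made here, not the authors' software.

```python
def upgma(names, dist):
    """Build a UPGMA tree from a symmetric distance matrix.

    names: list of labels; dist: list of lists with dist[i][j] == dist[j][i].
    Returns a nested tuple (left, right, height), with leaves as plain labels.
    """
    # Each active cluster id maps to (subtree, number of leaves it contains).
    clusters = {i: (names[i], 1) for i in range(len(names))}
    # Pairwise distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j]
         for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        (a, b), d_ab = min(d.items(), key=lambda kv: kv[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        del d[(a, b)]
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average keeps the distance equal to the mean
            # over all leaf pairs between the two clusters.
            d[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), n_a + n_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy, made-up F_ST values for three hypothetical populations.
toy_names = ["PopA", "PopB", "PopC"]
toy_fst = [
    [0.00, 0.02, 0.06],
    [0.02, 0.00, 0.06],
    [0.06, 0.06, 0.00],
]
tree = upgma(toy_names, toy_fst)
print(tree)
```

In this toy input PopA and PopB are merged first because their distance is the smallest, and PopC joins them at a greater height.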

Fig. 3

Population structure of Uralic-speaking populations inferred from ADMIXTURE analysis on autosomal SNPs in the Eurasian context. a Individual ancestry estimates for populations of interest for a selected number of assumed ancestral populations (K3, K6, K9, K11). Ancestry components discussed in the main text (k2, k3, k5, k6, k9, k11) are indicated and have the same colours throughout. The names of the Uralic-speaking populations are indicated in blue (Finno-Ugric) or orange (Samoyedic). The full bar plot is presented in Additional file 3: Figure S3. b Frequency map of component k9

Fig. 4

Share of ~1–2 cM identity-by-descent (IBD) segments within and between regional groups of Uralic speakers. For each Uralic-speaking population (the rows of the matrix), we performed a permutation test to estimate whether it shows higher IBD segment sharing with another population (listed in the columns) than its geographic control group does. Empty rectangles indicate no excess IBD sharing. Rectangles filled in blue indicate comparisons in which statistically significant excess IBD sharing was detected between one Uralic-speaking population and another Uralic-speaking population (listed in the columns). Rectangles filled in green mark comparisons in which a Uralic-speaking population shows excess IBD sharing with a non-Uralic-speaking population. For each tested Uralic-speaking population (matrix rows), the populations in the control group that were used to generate permuted samples are indicated with small circles. For example, the rectangle filled in blue for Vepsians and Komis (A) implies that the Uralic-speaking Vepsians share more IBD segments with the Uralic-speaking Komis than does the geographic control group for Vepsians, i.e. the populations indicated with small circles (Central and North Russians, Swedes, Latvians and Lithuanians). The rectangle filled in green for Vepsians and Dolgans shows that the Uralic-speaking Vepsians share more IBD segments with the non-Uralic-speaking Dolgans than does the geographic control group
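
The kind of test described in this caption can be sketched as a generic one-sided label-permutation test. This is only an illustration under stated assumptions: the caption does not give the exact procedure, so the pooling scheme, the test statistic (difference in mean segment counts), the function name `permutation_p_value` and all input numbers below are hypothetical rather than the authors' method or data.

```python
import random
from statistics import mean


def permutation_p_value(target_counts, control_counts,
                        n_permutations=10_000, seed=0):
    """One-sided p-value for mean(target_counts) > mean(control_counts).

    target_counts: IBD segment counts shared between individuals of the tested
                   population and the comparison population (assumed input).
    control_counts: the same counts for individuals of the geographic
                    control group (assumed input).
    """
    rng = random.Random(seed)
    observed = mean(target_counts) - mean(control_counts)
    pooled = list(target_counts) + list(control_counts)
    n_target = len(target_counts)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling the pooled values reassigns group labels at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_target]) - mean(pooled[n_target:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Invented counts: the tested population shares clearly more segments.
p = permutation_p_value([10, 11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6])
print(p < 0.05)
```

A small p-value would correspond to a filled rectangle in the matrix, while a large one would correspond to an empty rectangle.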

Fig. 5

Circos plots of GLOBETROTTER (GT) results. The outer circle represents target groups for which GT inference was performed (wide segments) and additional surrogate populations, which were used to describe admixture in target populations (narrow segments). Geographic affiliation of target groups is colour-coded: blue—Europe (except populations from Volga-Ural region—Komis, Udmurts, Maris, Tatars, Chuvashes, Bashkirs); green—Volga-Ural region; and magenta—Western Siberia. Inner bar plots depict genetic composition of inferred sources of admixture in each of the target groups. A pair of sources is shown for a simple one-way admixture event between two populations, and an additional pair of sources for the less strongly signaled event is shown for a one-date multi-way admixture between more than two sources (marked as MW in the outer circle). In a simple one-date event, a pair of sources contributes 100% of the DNA of the target population. Surrogate populations in the inner bar plots are shaded according to the colour scheme given in the outer ring, and those contributing < 3% to mixing sources are coloured in grey. Point estimates and confidence intervals for the date of inferred admixture event are shown next to the cluster label. The details of the GT source groups are given in Additional file 3: Figure S5 and Additional file 11: Table S10. a Results of ‘full’ analysis, where each cluster was allowed to copy from every other cluster. b Results of ‘regional’ analysis, where no copying between samples from the same geographical region was allowed. For example, in the ‘full’ analysis of the ‘Europe 1’ cluster, a simple one-date admixture event was detected. The first source population contributes 85% of the total DNA, including 76% from the ‘Europe 2’ surrogate; the second source contributes 15% and is dominated by the ‘Finnic’ cluster. The admixture took place around 1211 CE (95% CI: 1213–1412 CE). 
Abbreviations: C-Central; Cauc-Caucasus; E-East; N-North; S-South; Sib-Siberia; W-West.

Fig. 6

Proportions of ancestral components in studied European and Siberian populations and the tested qpGraph model. a The qpGraph model fitting the data for the tested populations. Colour codes for the terminal nodes: pink for modern populations (‘Population X’ refers to the test population) and yellow for ancient populations (aDNA samples and their pools). Nodes coloured other than pink or yellow are hypothetical intermediate populations. Nodes used as admixture sources were putatively named after their main recipient among known populations. The colours of intermediate nodes on the qpGraph model match those on the admixture proportions panel. b Admixture proportions (%) of ancestral components. We calculated the admixture proportions by summing up the relative shares of a set of intermediate populations to explain the full spectrum of admixture components in the test population. We did the same for the intermediate node CWC’ and present the proportions of its three mixing components in the stacked column bar of CWC’. Colour codes for ancestral components are as follows: dark green for Western hunter-gatherer (WHG’); light green for Eastern hunter-gatherer (EHG’); grey for European early farmer (LBK’); dark blue for carriers of Corded Ware culture (CWC’); and dark grey for Siberian. CWC’ consists of three sub-components: blue for Caucasian hunter-gatherer in Yamnaya (CHGinY’); light blue for Eastern hunter-gatherer in Yamnaya (EHGinY’); and light grey for Neolithic Levant (NeolL’)
