Phylogeographic and genome-wide investigations of Vietnam ethnic groups reveal signatures of complex historical demographic movements

doi: 10.1038/s41598-017-12813-6.

S Pischedda, R Barral-Arca, A Gómez-Carballa, J Pardo-Seco, V Álvarez-Iglesias, J M Cárdenas, N D Nguyen, H H Ha, A T Le, F Martinón-Torres, C Vullo, A Salas

S Pischedda et al. Sci Rep. 2017.

Abstract

The territory of present-day Vietnam was the cradle of one of the world's earliest civilizations, and one of the first world regions to develop agriculture. We analyzed the mitochondrial DNA (mtDNA) complete control region of six ethnic groups and the mitogenomes from Vietnamese in The 1000 Genomes Project (1000G). Genome-wide data from 1000G (~55k SNPs) were also investigated to explore different demographic scenarios. All Vietnamese carry South East Asian (SEA) haplotypes, which show a moderate geographic and ethnic stratification, with the Mong constituting the most distinctive group. Two new mtDNA clades (M7b1a1f1 and F1f1) point to historical gene flow between the Vietnamese and other neighboring countries. Bayesian-based inferences indicate a time-deep and continuous population growth of Vietnamese, although with some exceptions. The dramatic population decrease experienced by the Cham 700 years ago (ya) fits well with the Nam tiến ("southern expansion") southwards from their original heartland in the Red River Delta. Autosomal SNPs consistently point to important historical gene flow within mainland SEA, and add support to a main admixture event occurring between Chinese and a southern Asian ancestral composite (mainly represented by the Malay). This admixture event occurred ~800 ya, again coinciding with the Nam tiến.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1

(A) Frequencies of the main haplogroups and sub-haplogroups by ethnic group. (B) Map showing the location of the main Vietnamese regions analyzed in the present study. The pie charts display the frequency values for the main haplogroup categories. Maps were generated using R Project for Statistical Computing v. 3.3.1 (https://www.r-project.org/) and the package automap v. 1.0–14 (https://cran.r-project.org/web/packages/automap/index.html). Packages sp v. 1.2–5, rgdal v. 1.2–8, gstat v. 1.1–5, raster v. 2.5–8 and latticeExtra v. 0.6–28 were also used to improve the visual appearance of the maps.

Figure 2

(A) Interpolated geographic maps of haplotype diversity (crosses indicate sample points) and nucleotide diversity values. (B) and (C) Interpolated maps of main haplogroup frequencies across the territory of Vietnam. Maps were created as in Fig. 1.
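The two diversity measures mapped in Figure 2 can be illustrated with a minimal sketch. It assumes the common Nei-style estimators (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, nucleotide diversity as the mean pairwise difference per site); the source does not state which software or estimator the authors used. The function names and the toy sequences are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei-style haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site among aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / (n_pairs * length)


# Hypothetical toy alignment (assumption: not real control-region data).
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(haplotype_diversity(toy))
print(nucleotide_diversity(toy))
```

Per-population values computed this way would be the inputs that a spatial interpolation step turns into maps such as those in this figure.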

Figure 3

Maximum parsimony trees based on mitogenomes representing haplogroups M7b1a1f (A) and F1f (B). The revised Cambridge reference sequence (rCRS) is shown as the reference for nomenclature. Genetic variants are indicated along the branches of the phylogeny as follows: all are transitions unless a suffix A, C, G, or T indicates a transversion, and a prefix ‘@’ indicates a back mutation. As per common practice, the trees do not consider hotspot mutations at positions 16182, 16183, and 16519, nor variation around position 310 and length or point heteroplasmies. The ID numbers at the tips of the phylogeny identify mitogenomes as indicated in Table S2; this table also shows details of the geographic or ethnic origin of all the samples. (C) EBSPs of haplogroups F1f and M7b1a1f obtained from complete mitogenomes. EBSPs with 95% HPD (highest posterior density) intervals are provided in Figure S3.

Figure 4

PCA of Vietnamese populations analyzed in the present article versus other Asian populations (A) and versus SEA/Southern China populations (B). Haplogroup frequencies from the reference populations were taken from Zhang et al. Note that there are two Vietnam_Kinh samples in the plot: one represents our Kinh sample and the other was taken from the literature.

Figure 5

Analyses carried out on autosomal SNPs. (A) MDS of population samples from the Indochinese Peninsula and neighboring samples. Both plots were built using the same sample sets, but the one on the right highlights the center of each population's sample points to ease interpretation. (B) Admixture analysis including reference samples from Europe (CEU) and Africa (YRI). (C) Analysis of _f3_-statistics of Vietnamese (KHV) versus different neighboring population samples. (D) _D_-statistics of Vietnamese built as follows: D(CHS, KHV; Y, OUTGROUP) and D(Y, KHV; CHS, OUTGROUP). (E) Estimates of admixture between Chinese and Malay, using the samples CHS and Malay as surrogates for the populations that contributed to the genomic architecture of present-day Vietnamese. Estimates were statistically significant according to the ad hoc z test from ALDER.
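As a rough illustration of the statistics in panels C and D, the sketch below computes simplified, frequency-based versions of f3 and D from per-SNP allele frequencies. It is an assumption-laden toy, not the authors' pipeline: it omits the finite-sample bias correction and the block-jackknife standard errors that dedicated population-genetics software applies, and the function names and frequency vectors are hypothetical.

```python
def f3_statistic(target, source_a, source_b):
    """Simplified f3(target; A, B): mean of (c - a) * (c - b) across SNPs.

    Markedly negative values are the classic signal that the target is
    admixed between populations related to A and B.
    """
    terms = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(terms) / len(terms)


def d_statistic(p1, p2, p3, outgroup):
    """Simplified frequency-based D(P1, P2; P3, Outgroup).

    Numerator: sum of (p1 - p2) * (p3 - o) across SNPs.
    Denominator: sum of (p1 + p2 - 2*p1*p2) * (p3 + o - 2*p3*o) across SNPs.
    Sign conventions differ between software packages.
    """
    rows = list(zip(p1, p2, p3, outgroup))
    num = sum((w - x) * (y - z) for w, x, y, z in rows)
    den = sum((w + x - 2 * w * x) * (y + z - 2 * y * z) for w, x, y, z in rows)
    return num / den


# Hypothetical allele frequencies at four SNPs (assumption: toy values only).
source_a = [0.1, 0.9, 0.2, 0.8]
source_b = [0.9, 0.1, 0.8, 0.2]
target = [0.5, 0.5, 0.5, 0.5]      # behaves like an even mix of the two sources
outgroup = [0.0, 1.0, 0.0, 1.0]

print(f3_statistic(target, source_a, source_b))        # negative for this toy mix
print(d_statistic(target, source_a, source_b, outgroup))
```

In the figure's notation, the target would play the role of KHV and the sources the roles of CHS and a neighboring population Y; significance would in practice be judged from z-scores rather than from the raw values shown here.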
