Characterization of Greater Middle Eastern genetic variation for enhanced disease gene discovery

The influence of admixture and consanguinity on population genetic diversity in Middle East

Journal of Human Genetics, 2014

The Middle East (ME) is an important crossroad where modern humans migrated 'out of Africa' and spread into Europe and Asia. After the initial peopling and long-term isolation leading to well-differentiated populations, the ME also had a crucial role in subsequent human migrations among Africa, Europe and Asia; thus, recent population admixture has been common in the ME. On the other hand, consanguinity, a well-known practice in the ME, often reduces genetic diversity and works in opposition to admixture. Here, we explored the degree to which admixture and consanguinity jointly affected genetic diversity in ME populations. Genome-wide single-nucleotide polymorphism data were generated in two representative ME populations (Arabian and Iranian), with comparisons made with populations worldwide. Our results revealed an overall higher genetic diversity in both ME populations relative to other non-African populations. We identified a much larger number of long runs of homozygosity in ME populations than in any other populations, which was most likely attributed to high levels of consanguineous marriages that significantly decreased both individual and population heterozygosity. Additionally, we were able to distinguish African, European and Asian ancestries in ME populations and quantify the impact of admixture and consanguinity with statistical approaches. Interestingly, genomic regions with significantly excessive ancestry from individual source populations are functionally enriched in olfactory pathways, which were suspected to be under natural selection. Our findings suggest that genetic admixture, consanguinity and natural selection have collectively shaped the genetic diversity of ME populations, which has important implications in both evolutionary studies and medical practices.

Fine-scale population structure reveals high genetic heterogeneity of the Kuwaiti population in the Arabian Peninsula

Recent studies have shown the diverse genetic architecture of the highly consanguineous populations inhabiting the Arabian Peninsula. Consanguinity coupled with heterogeneity is complex and makes it difficult to understand the bases of population-specific genetic diseases in the region. Therefore, comprehensive genetic characterization of the populations at the finest scale is warranted. Here, we revisit the genetic structure of the Kuwait population by analyzing genome-wide single nucleotide polymorphism data from 583 Kuwaiti individuals sorted into three subgroups. We envisage a diverse demographic genetic history among the three subgroups based on drift and allelic sharing with modern and ancient individuals. Furthermore, our comprehensive haplotype-based analyses disclose a high genetic heterogeneity among the Kuwaiti populations. We infer the major sources of ancestry within the newly defined groups; one with an obvious predominance of sub-Saharan/Western Africa mostly compri...

Thousands of Qatari genomes inform human migration history and improve imputation of Arab haplotypes

Nature Communications

Arab populations are largely understudied, notably their genetic structure and history. Here we present an in-depth analysis of 6,218 whole genomes from Qatar, revealing extensive diversity as well as genetic ancestries representing the main founding Arab genealogical lineages of Qahtanite (Peninsular Arabs) and Adnanite (General Arabs and West Eurasian Arabs). We find that Peninsular Arabs are the closest relatives of ancient hunter-gatherers and Neolithic farmers from the Levant, and that founder Arab populations experienced multiple splitting events 12–20 kya, consistent with the aridification of Arabia and farming in the Levant, giving rise to settler and nomadic communities. In terms of recent genetic flow, we show that these ancestries contributed significantly to European, South Asian as well as South American populations, likely as a result of Islamic expansion over the past 1400 years. Notably, we characterize a large cohort of men with the ChrY J1a2b haplogroup (n = 1,491)...

Arab gene geography: From population diversities to personalized medical genomics

Global cardiology science & practice, 2014

Genetic disorders are not equally distributed over the geography of the Arab region. While a number of disorders have a wide geographical presence encompassing 10 or more Arab countries, almost half of these disorders occur in a single Arab country or population. Nearly one-third of the genetic disorders in Arabs result from congenital malformations and chromosomal abnormalities, which are also responsible for a significant proportion of neonatal and perinatal deaths in Arab populations. Strikingly, about two-thirds of these diseases in Arab patients follow an autosomal recessive mode of inheritance. High fertility rates together with increased consanguineous marriages, generally noticed in Arab populations, tend to increase the rates of genetic and congenital abnormalities. Many of the nearly 500 genes studied in Arab people revealed striking spectra of heterogeneity with many novel and rare mutations causing large arrays of clinical outcomes. In this review we provided an overvie...

Genomic Patterns of Homozygosity in Worldwide Human Populations

The American Journal of Human Genetics, 2012

Genome-wide patterns of homozygosity runs and their variation across individuals provide a valuable and often untapped resource for studying human genetic diversity and evolutionary history. Using genotype data at 577,489 autosomal SNPs, we employed a likelihood-based approach to identify runs of homozygosity (ROH) in 1,839 individuals representing 64 worldwide populations, classifying them by length into three classes (short, intermediate, and long) with a model-based clustering algorithm. For each class, the number and total length of ROH per individual show considerable variation across individuals and populations. The total lengths of short and intermediate ROH per individual increase with the distance of a population from East Africa, in agreement with similar patterns previously observed for locus-wise homozygosity and linkage disequilibrium. By contrast, total lengths of long ROH show large interindividual variations that probably reflect recent inbreeding patterns, with higher values occurring more often in populations with known high frequencies of consanguineous unions. Across the genome, distributions of ROH are not uniform, and they have distinctive continental patterns. ROH frequencies across the genome are correlated with local genomic variables such as recombination rate, as well as with signals of recent positive selection. In addition, long ROH are more frequent in genomic regions harboring genes associated with autosomal-dominant diseases than in regions not implicated in Mendelian diseases. These results provide insight into the way in which homozygosity patterns are produced, and they generate baseline homozygosity patterns that can be used to aid homozygosity mapping of genes associated with recessive diseases.
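
To make the notion of a run of homozygosity concrete, the following is a minimal sketch that scans one individual's genotype calls for stretches of consecutive homozygous SNPs. It is a naive scan, not the likelihood-based method described in the abstract, and the function name `find_roh`, the 0/1/2 genotype coding, and the `min_snps` threshold are all assumptions made for illustration.

```python
from typing import List, Tuple


def find_roh(
    genotypes: List[int],
    positions: List[int],
    min_snps: int = 3,
) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of homozygous calls.

    Assumed coding (hypothetical): 0 or 2 = homozygous, 1 = heterozygous.
    Runs shorter than min_snps are discarded.
    """
    runs = []
    start = None  # index of the first SNP in the current run, if any
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current run.
            if start is not None and i - start >= min_snps:
                runs.append((positions[start], positions[i - 1], i - start))
            start = None
    # Close a run that extends to the last SNP.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions[start], positions[-1], len(genotypes) - start))
    return runs


# Toy data: ten SNPs spaced 100 bp apart.
toy_genotypes = [0, 2, 2, 1, 0, 0, 0, 0, 1, 2]
toy_positions = list(range(100, 1100, 100))
toy_runs = find_roh(toy_genotypes, toy_positions)
```

Real analyses typically tolerate occasional genotyping errors and account for SNP density and allele frequencies, which this sketch deliberately ignores.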

Genome-wide inbreeding estimation within Lebanese communities using SNP arrays

European Journal of Human Genetics, 2014

Consanguineous marriages have been widely practiced in several global communities with varying rates depending on religion, culture, and geography. In consanguineous marriages, parents pass to their children autozygous segments known as homozygous by descent segments. In this study, single-nucleotide polymorphisms were analyzed in 165 unrelated Lebanese people from Greek Orthodox, Maronite, Shiite and Sunni communities. Runs of homozygosity, total inbreeding levels, remote consanguinity, and population admixture and structure were estimated. The inbreeding coefficient value was estimated to be 1.61% in offspring of unrelated parents over three generations and 8.33% in offspring of first cousins. From these values, remote consanguinity values, resulting from genetic drift or recurrent consanguineous unions, were estimated in offspring of unrelated and first-cousin parents to be 0.61 and 1.2%, respectively. This remote consanguinity value suggests that for any unrelated marriages in Lebanon, the mates could be related as third cousins or as second cousins once removed. Under the assumption that 25% of marriages occur between first cousins, the mean inbreeding value of 2.3% may explain the increased incidence of recessive disease in offspring. Our analysis reveals a common ancestral population in the four Lebanese communities we studied.
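
The abstract does not spell out how the 2.3% mean is obtained. One reading that reproduces it, offered here only as an assumed reconstruction, is a weighted average in which offspring of first cousins carry the pedigree expectation of 1/16 plus their remote consanguinity (1.2%), while offspring of unrelated parents carry only their remote consanguinity (0.61%). The function name and this decomposition are illustrative assumptions, not the authors' stated method.

```python
# Pedigree-expected inbreeding coefficient for offspring of first cousins.
F_FIRST_COUSIN_PEDIGREE = 1 / 16  # 6.25%


def mean_inbreeding(
    p_first_cousin: float,
    remote_unrelated: float,
    remote_first_cousin: float,
) -> float:
    """Weighted mean inbreeding across two marriage types.

    Assumption (not from the source): total F for first-cousin offspring
    is the pedigree value plus the remote component; total F for offspring
    of unrelated parents is the remote component alone.
    """
    f_cousin = F_FIRST_COUSIN_PEDIGREE + remote_first_cousin
    f_unrelated = remote_unrelated
    return p_first_cousin * f_cousin + (1 - p_first_cousin) * f_unrelated


# Values quoted in the abstract: 25% first-cousin marriages,
# remote consanguinity of 0.61% (unrelated) and 1.2% (first cousins).
mean_f = mean_inbreeding(0.25, 0.0061, 0.012)
print(f"mean inbreeding = {mean_f:.2%}")
```

Under these assumptions the weighted mean comes to roughly 2.3%, in line with the figure quoted in the abstract.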

Distinct genetic variation and heterogeneity of the Iranian population

PLOS Genetics

Iran, despite its size, geographic location and past cultural influence, has largely been a blind spot for human population genetic studies. With only sparse genetic information on the Iranian population available, we pursued its genome-wide and geographic characterization based on 1021 samples from eleven ethnic groups. We show that Iranians, while close to neighboring populations, present distinct genetic variation consistent with long-standing genetic continuity, harbor high heterogeneity and different levels of consanguinity, fall apart into a cluster of similar groups and several admixed ones and have experienced numerous language adoption events in the past. Our findings render Iran an important source for human genetic variation in Western and Central Asia, will guide adequate study sampling and assist the interpretation of putative disease-implicated genetic variation. Given Iran's internal genetic heterogeneity, future studies will have to consider ethnic affiliations and possible admixture.

Middle Eastern Genetic Variation Improves Clinical Annotation of the Human Genome

2021

Genetic variation in populations of Middle Eastern origin remains highly underrepresented in most comprehensive genomic databases. This underrepresentation hampers the functional annotation of the human genome and challenges accurate clinical variant interpretation. To highlight the importance of capturing genetic variation in the Middle East, we aggregated whole exome and genome sequencing data from 2116 individuals in the Middle East and established the Middle East Variation (MEV) database. Of the high-impact coding (missense and loss of function) variants in this database, 53% were absent from the most comprehensive Genome Aggregation Database (gnomAD), thus representing a unique Middle Eastern variation dataset which might directly impact clinical variant interpretation. We highlight 39 variants with minor allele frequency >1% in the MEV database that were previously reported as rare disease variants in ClinVar and the Human Gene Mutation Database (HGMD). Furthermore, the MEV...

Y-Chromosome and mtDNA Genetics Reveal Significant Contrasts in Affinities of Modern Middle Eastern Populations with European and African Populations

PLoS ONE, 2013

The Middle East was a funnel of human expansion out of Africa, a staging area for the Neolithic Agricultural Revolution, and the home to some of the earliest world empires. Post-LGM expansions into the region and subsequent population movements created a striking genetic mosaic with distinct sex-based genetic differentiation. While prior studies have examined the mtDNA and Y-chromosome contrast in focal populations in the Middle East, none have undertaken a broad-spectrum survey including North and sub-Saharan Africa, Europe, and Middle Eastern populations. In this study 5,174 mtDNA and 4,658 Y-chromosome samples were investigated using PCA, MDS, mean-linkage clustering, AMOVA, and Fisher exact tests of FST values, RST values, and haplogroup frequencies. Geographic differentiation in affinities of Middle Eastern populations with Africa and Europe showed distinct contrasts between mtDNA and Y-chromosome data. Specifically, Lebanon's mtDNA shows a very strong association to Europe, while Yemen shows very strong affinity with Egypt and North and East Africa. Previous Y-chromosome results showed a Levantine coastal-inland contrast marked by J1 and J2, and a very strong North African component was evident throughout the Middle East. Neither of these patterns was observed in the mtDNA. While J2 has penetrated into Europe, the pattern of Y-chromosome diversity in Lebanon does not show the widespread affinities with Europe indicated by the mtDNA data. Lastly, while each population shows evidence of connections with expansions that now define the Middle East, Africa, and Europe, many of the populations in the Middle East show distinctive mtDNA and Y-haplogroup characteristics that indicate long-standing settlement with relatively little impact from and movement into other populations.

Gene flow from North Africa contributes to differential human genetic diversity in southern Europe

Proceedings of the National Academy of Sciences, 2013

Human genetic diversity in southern Europe is higher than in other regions of the continent. This difference has been attributed to postglacial expansions, the demic diffusion of agriculture from the Near East, and gene flow from Africa. Using SNP data from 2,099 individuals in 43 populations, we show that estimates of recent shared ancestry between Europe and Africa are substantially increased when gene flow from North Africans, rather than Sub-Saharan Africans, is considered. The gradient of North African ancestry accounts for previous observations of low levels of sharing with Sub-Saharan Africa and is independent of recent gene flow from the Near East. The source of genetic diversity in southern Europe has important biomedical implications; we find that most disease risk alleles from genome-wide association studies follow expected patterns of divergence between Europe and North Africa, with the principal exception of multiple sclerosis.