Major factors influencing linkage disequilibrium by analysis of different chromosome regions in distinct populations: demography, chromosome recombination frequency and selection

The linkage disequilibrium maps of three human chromosomes across four populations reflect their demographic history and a common underlying recombination pattern

Genome Research, 2005

The extent and patterns of linkage disequilibrium (LD) determine the feasibility of association studies to map genes that underlie complex traits. Here we present a comparison of the patterns of LD across four major human populations (African-American, Caucasian, Chinese, and Japanese) with a high-resolution single-nucleotide polymorphism (SNP) map covering almost the entire length of chromosomes 6, 21, and 22. We constructed metric LD maps formulated such that the units measure the extent of useful LD for association mapping. LD reaches almost twice as far in chromosome 6 as in chromosomes 21 or 22, in agreement with their differences in recombination rates. By all measures used, out-of-Africa populations showed over a third more LD than African-Americans, highlighting the role of the population's demography in shaping the patterns of LD. Despite those differences, the long-range contour of the LD maps is remarkably similar across the four populations, presumably reflecting common localization of recombination hot spots. Our results have practical implications for the rational design and selection of SNPs for disease association studies.

Linkage disequilibrium patterns vary substantially among populations

European Journal of Human Genetics, 2005

A major initiative to create a global human haplotype map has recently been launched as a tool to improve the efficiency of disease gene mapping. The 'HapMap' project will study common variants in depth in four (and to a lesser degree in up to 12) populations to catalogue haplotypes that are expected to be common to all populations. A hope of the 'HapMap' project is that much of the genome occurs in regions of limited diversity such that only a few of the SNPs in each region will capture the diversity and be relevant around the world. In order to explore the implications of studying only a limited number of populations, we have analyzed linkage disequilibrium (LD) patterns of three 175-320 kb genomic regions in 16 diverse populations with an emphasis on African and European populations. Analyses of these three genomic regions provide empiric demonstration of marked differences in frequencies of the same few haplotypes, resulting in differences in the amount of LD and very different sets of haplotype frequencies. These results highlight the distinction between the statistical concept of LD and the biological reality of haplotypes and their frequencies. The significant quantitative and qualitative variation in LD among populations, even for populations within a geographic region, emphasizes the importance of studying diverse populations in the HapMap project to assure broad applicability of the results.

Extent of linkage disequilibrium in a Sardinian sub-isolate: sampling and methodological considerations

Human Molecular Genetics, 2003

The extent of linkage disequilibrium (LD) is an important factor when designing experiments for mapping disease or trait loci using LD mapping methods. It depends on the population history and hence is a characteristic of each population. Here, we have assessed the extent of LD in a sub-isolate of the general Sardinian population (775 members of one village) using 22 polymorphic markers on chromosome 19. We found high levels of disequilibrium that extended to 8 cM, when based on D′, and 11 cM when based on the significance level of the allelic association. The fact that conclusions based on both methods are similar suggests that the estimates are quite robust. We have also shown, through a simple resampling technique, that small sample sizes can overestimate both the mean value of D′ and its variance up to a factor of about 2 and 16, respectively, when the number of diplotypes (the pair of haplotypes that compose the genotype) decreased from 186 to 26. We evaluated the effect on D′ of the depth of the pedigree available when using phased founders, and compared the estimates with those obtained when using unphased founders, and also the effect of grouping alleles on the value of D′ and the significance level. Owing to the high sampling variance of LD, we recommend the use of at least 200 unrelated individuals when characterizing the extent of LD.
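The small-sample inflation of D′ described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes two biallelic loci (the study used multiallelic markers), directly observed haplotypes rather than diplotypes, and a hypothetical population in linkage equilibrium with all allele frequencies at 0.5. The function names `d_prime` and `mean_abs_d_prime` are invented for this illustration.

```python
import random

def d_prime(n_ab, n_aB, n_Ab, n_AB):
    """Return (D, |D'|) for two biallelic loci from the four haplotype counts."""
    total = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / total
    p_a = (n_ab + n_aB) / total   # frequency of allele a at locus 1
    p_b = (n_ab + n_Ab) / total   # frequency of allele b at locus 2
    d = p_ab - p_a * p_b          # raw disequilibrium coefficient D
    # Lewontin's normalisation: divide by the largest |D| the allele
    # frequencies allow, which depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return d, 0.0             # monomorphic sample: D' undefined, report 0
    return d, abs(d) / d_max

def mean_abs_d_prime(n_haplotypes, replicates=2000, seed=0):
    """Mean |D'| over repeated samples from an equilibrium population (assumed)."""
    rng = random.Random(seed)
    haplotypes = ["ab", "aB", "Ab", "AB"]   # equally likely, so the true D is 0
    total = 0.0
    for _ in range(replicates):
        sample = rng.choices(haplotypes, k=n_haplotypes)
        counts = [sample.count(h) for h in haplotypes]
        total += d_prime(*counts)[1]
    return total / replicates

if __name__ == "__main__":
    for n in (26, 186):
        print(n, round(mean_abs_d_prime(n), 3))
```

Because the true D is zero in this toy population, any non-zero mean |D′| is pure sampling noise, and it comes out larger for the smaller sample. The sample sizes 26 and 186 are borrowed from the abstract only as convenient values; the factors of 2 and 16 reported there are not expected to be reproduced by this simplified setup.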

X-chromosome as a marker for population history: linkage disequilibrium and haplotype study in Eurasian populations

European Journal of Human Genetics, 2004

Linkage disequilibrium (LD) structure is still unpredictable because the interplay of regional recombination rate and demographic history is poorly understood. We have compared the distribution of LD across two genomic regions differing in crossing-over activity, Xq13 (0.166 cM/Mb) and Xp22 (1.3 cM/Mb), in 15 Eurasian populations. Demographic events predicted to increase the LD level (genetic drift, bottleneck and admixture) had a very strong impact on the extent and patterns of regional LD across Xq13 compared to Xp22. The haplotype distribution of the DXS1225-DXS8082 microsatellites from Xq13, which exhibit strong association in all populations, was remarkably influenced by population history. European populations shared one common haplotype with a frequency of 25-40%. The Volga-Ural populations studied, living at the geographic borderline of Europe, showed elevated LD and harbored a significant fraction of haplotypes originating from East Asia, thus reflecting their past migrations and admixture. In the young Kuusamo isolate from Finland, a bottleneck has led to allelic associations between loci and shifted the haplotype distribution, but has affected single microsatellite allele frequencies much less compared to the main Finnish population. The data show that the footprint of a demographic event is preserved longer in the haplotype distribution within a region of low crossing-over rate than in the information content of a single marker, or between actively recombining markers. As the knowledge of LD patterns is often chosen to assist association mapping of common disease, our conclusions emphasize the importance of understanding the history, structure and variation of a study population.

Linkage disequilibrium in young genetically isolated Dutch population

European Journal of Human Genetics, 2004

The design and feasibility of genetic studies of complex diseases are critically dependent on the extent and distribution of linkage disequilibrium (LD) across the genome and between different populations. We have examined genomewide and region-specific LD in a young genetically isolated population identified in the Netherlands by genotyping approximately 800 Short Tandem Repeat markers distributed genomewide across 58 individuals. Several regions were analyzed further using a denser marker map. The permutation-corrected measure of LD was used for analysis. A significant (P < 0.0004) relation between LD and genetic distance on a genomewide scale was found. Distance explained 4% of the total LD variation. For fine-mapping data, distance accounted for a larger proportion of LD variation (up to 39%). A notable similarity in the genomewide distribution of LD was revealed between this population and other young genetically isolated populations from Micronesia and Costa Rica. Our study population and experiment were simulated in silico to confirm our knowledge of the history of the population. High agreement was observed between results of analysis of simulated and empirical data. We conclude that our population shows a high level of LD similar to that demonstrated previously in other young genetic isolates. In Europe, there may be a large number of young genetically isolated populations that are similar in history to ours. In these populations, a similar degree of LD is expected and thus they may be effectively used for linkage or LD mapping.

Linkage disequilibrium in isolated populations: Finland and a young sub-population of Kuusamo

European Journal of Human Genetics, 2000

Linkage disequilibrium (LD), non-random association of alleles at closely linked chromosomal loci, has been used as a tool in the identification of disease alleles, and this has led to an improved understanding of pathology in many monogenic Mendelian human diseases. We are currently moving from the mapping and identification of monogenic disease loci to attempts at identifying loci involved in predisposition to multifactorial diseases. In the selection of ascertainment strategies in the studies of these complex diseases, the extent of background LD in different populations is an important consideration. Here, we compare the extent of LD among the alleles of linked loci in a randomly ascertained sample of individuals from the Finnish population and a set of individuals ascertained from the region of Kuusamo, a small sub-population, founded some 13 generations ago, which has experienced very little subsequent immigration. Thirty-three microsatellite loci were genotyped in chromosomal regions on 13q, 19q, 21q, Xq, and Xp. The genetic diversity of these loci was determined separately in the general Finnish sample and in the Kuusamo sample. The X-chromosomal loci are characterised by higher levels of LD in the samples from Kuusamo than in the much larger (and older) general population of Finland, whereas in alleles of autosomal loci very little LD was seen in either of these two samples.

The Extent of Linkage Disequilibrium in Four Populations with Distinct Demographic Histories

The American Journal of Human Genetics, 2000

The design and feasibility of whole-genome-association studies are critically dependent on the extent of linkage disequilibrium (LD) between markers. Although there has been extensive theoretical discussion of this, few empirical data exist. The authors have determined the extent of LD among 38 biallelic markers with minor allele frequencies >.1, since these are most comparable to the common disease-susceptibility polymorphisms that association studies aim to detect. The markers come from three chromosomal regions (1,335 kb on chromosome 13q12-13, 380 kb on chromosome 19q13.2, and 120 kb on chromosome 22q13.3), which have been extensively mapped. These markers were examined in ∼1,600 individuals from four populations, all of European origin but with different demographic histories: Afrikaners, Ashkenazim, Finns, and East Anglian British. There are few differences, either in allele frequencies or in LD, among the populations studied. A similar inverse relationship was found between LD and distance in each genomic region and in each population. Mean D′ is .68 for marker pairs <5 kb apart and is .24 for pairs separated by 10-20 kb, and the level of LD is not different from that seen in unlinked marker pairs separated by >500 kb. However, only 50% of marker pairs at distances <5 kb display sufficient LD (D > .3) to be useful in association studies. Results of the present study, if representative of the whole genome, suggest that a whole-genome scan searching for common disease-susceptibility alleles would require markers spaced ≤5 kb apart.
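The kind of summary reported in this abstract (mean LD per distance class, and the share of marker pairs exceeding a usefulness threshold) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function name `summarize_ld_by_distance`, the bin edges, and the five marker pairs in the usage example are all hypothetical, and the threshold of 0.3 is taken from the abstract without assuming which LD measure it applies to.

```python
from bisect import bisect_right

def summarize_ld_by_distance(pairs, bin_edges_kb=(0, 5, 10, 20, 500), threshold=0.3):
    """Summarise LD by distance class.

    pairs: iterable of (distance_kb, ld_value) for marker pairs.
    Returns {(lower_kb, upper_kb_or_None): (mean_ld, fraction_above, n_pairs)},
    where each bin is half-open, [lower, upper), and the last bin is unbounded.
    """
    binned = {}
    for distance_kb, ld in pairs:
        i = bisect_right(bin_edges_kb, distance_kb) - 1
        if i < 0:
            continue  # distance below the first edge: ignore
        lower = bin_edges_kb[i]
        upper = bin_edges_kb[i + 1] if i + 1 < len(bin_edges_kb) else None
        binned.setdefault((lower, upper), []).append(ld)

    summary = {}
    for key in sorted(binned, key=lambda k: k[0]):
        values = binned[key]
        mean_ld = sum(values) / len(values)
        fraction_above = sum(v > threshold for v in values) / len(values)
        summary[key] = (mean_ld, fraction_above, len(values))
    return summary

if __name__ == "__main__":
    # Hypothetical toy values, NOT data from the study.
    toy_pairs = [(2, 0.9), (4, 0.2), (12, 0.3), (15, 0.1), (800, 0.05)]
    for (lower, upper), stats in summarize_ld_by_distance(toy_pairs).items():
        print(lower, upper, stats)
```

Reporting the fraction of pairs above the threshold alongside the mean matters because a high mean can hide many pairs with too little LD to be useful, which is the point the abstract makes for pairs less than 5 kb apart.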

The interval of linkage disequilibrium (LD) detected with microsatellite and SNP markers in chromosomes of Finnish populations with different histories

Human Molecular Genetics, 2003

Linkage disequilibrium (LD) has been an efficient tool for fine mapping of monogenic disease genes in population isolates. Its usefulness for identification of predisposing loci for common, polygenic diseases has been challenged on the basis of anticipated allelic and locus heterogeneity. We compared the extent of LD among marker loci in Finnish subpopulations with divergent but well-characterized histories. One study sample represents the early settlement Finnish population, descended from two immigration events 4000 and 2000 years ago. The second sample represents the geographically large late settlement region, populated 15 generations ago by several small immigrant groups from the early settlement region. The third is a restricted regional subpopulation in northeastern Finland which was founded 12 generations ago by 39 immigrant families from the late settlement region. We genotyped 243 microsatellite markers and 68 single nucleotide polymorphisms (SNPs) on chromosomes 1q and 5q. The genealogy of the families from the early (n = 16) and late settlements (n = 54) and the isolated settlement (n = 54) was studied in detail back to the 1800s. Microsatellite data revealed greater LD in the young, founder subpopulation than was seen in either of the older populations. Observed linkage disequilibrium correlated not only with physical distance between markers but also with the information content of the markers. Using biallelic SNP markers, significant LD could only be detected up to 0.1 cM. Our results demonstrate the complexity of the concept of 'detectable LD' and emphasize the importance of understanding population history when designing a strategy for disease gene mapping.