The value of genomic relationship matrices to estimate levels of inbreeding

Improved estimation of inbreeding and kinship in pigs using optimized SNP panels

BMC Genetics, 2013

Background: Traditional breeding programs consider an average pairwise kinship between sibs. Based on pedigree information, the relationship matrix is used for genetic evaluations disregarding variation due to Mendelian sampling. Therefore, inbreeding and kinship coefficients are either over- or underestimated, which reduces both the accuracy of genetic evaluations and genetic progress. Single nucleotide polymorphisms (SNPs) can be used to estimate pairwise kinship and individual inbreeding more accurately. The aim of this study was to optimize the selection of markers and determine the required number of SNPs for estimation of kinship and inbreeding. Results: A total of 1,565 animals from three commercial pig populations were analyzed for 28,740 SNPs from the PorcineSNP60 BeadChip. Mean genomic inbreeding was higher than pedigree-based estimates in lines 2 and 3, but lower in line 1. As expected, a larger variation of genomic kinship estimates was observed for half and full sibs than for pedigree-based kinship, reflecting Mendelian sampling. Genomic kinship between father-offspring pairs was lower (0.23) than the estimate based on pedigree (0.26). Bootstrap analyses using six reduced SNP panels (n = 500, 1000, 1500, 2000, 2500 and 3000) showed that 2,000 SNPs reproduced results very close to those obtained using the full set of unlinked markers (n = 7,984-10,235), with high correlations (inbreeding r > 0.82 and kinship r > 0.96) and low variation between different sets with the same number of SNPs. Conclusions: Variation of kinship between sibs due to Mendelian sampling is better captured using genomic information than the pedigree-based method. Therefore, the reduced sets of SNPs could generate more accurate kinship coefficients between sibs than the pedigree-based method. Variation of genomic kinship of father-offspring pairs is recommended as a parameter to determine accuracy of the method rather than correlation with pedigree-based estimates. Inbreeding and kinship coefficients can be estimated with high accuracy using ≥2,000 unlinked SNPs within all three commercial pig lines evaluated. However, a larger number of SNPs might be necessary in other populations or across lines.
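The reduced-panel comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say which inbreeding estimator was used, so observed homozygosity serves as a stand-in, the genotypes are simulated, and all function names (`homozygosity`, `pearson`, `panel_correlations`, `simulate_animal`) are hypothetical. The idea is only to show how estimates from random SNP subsets of a given size can be correlated with estimates from the full marker set.

```python
import random


def homozygosity(individual, snp_indices):
    """Share of the chosen SNPs at which the animal is homozygous (coded 0 or 2)."""
    return sum(1 for j in snp_indices if individual[j] != 1) / len(snp_indices)


def pearson(x, y):
    """Plain Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def panel_correlations(genos, panel_size, n_reps, rng):
    """Correlate full-set estimates with estimates from random reduced panels."""
    m = len(genos[0])
    full = [homozygosity(ind, range(m)) for ind in genos]
    correlations = []
    for _ in range(n_reps):
        panel = rng.sample(range(m), panel_size)  # one random reduced SNP panel
        reduced = [homozygosity(ind, panel) for ind in genos]
        correlations.append(pearson(full, reduced))
    return correlations


def simulate_animal(f, m, rng):
    """Simulated genotypes (assumption): allele frequency 0.5, inbreeding level f."""
    return [rng.choice((0, 2)) if rng.random() < f else rng.choice((0, 1, 1, 2))
            for _ in range(m)]


rng = random.Random(42)
animals = [simulate_animal(rng.uniform(0.0, 0.3), 3000, rng) for _ in range(40)]
for size in (500, 1000, 2000):
    rs = panel_correlations(animals, size, 10, rng)
    print(size, "SNPs: min r =", round(min(rs), 3), "max r =", round(max(rs), 3))
```

Reporting the minimum and maximum correlation over replicates mirrors the abstract's interest in both the level of agreement and the variation between different panels of the same size.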

Runs of homozygosity provide a genome landscape picture of inbreeding and genetic history of European autochthonous and commercial pig breeds

Animal Genetics, 2021

Summary: ROHs are long stretches of DNA homozygous at each polymorphic position. The proportion of genome covered by ROHs and their length are indicators of the level and origin of inbreeding. Frequent common ROHs within the same population define ROH islands and indicate hotspots of selection. In this work, we investigated ROHs in a total of 1131 pigs from 20 European local pig breeds and in three cosmopolitan breeds, genotyped with the GGP Porcine HD Genomic Profiler. PLINK software was used to identify ROHs. Size classes and genomic inbreeding parameters were evaluated. ROH islands were defined by evaluating different thresholds of homozygous SNP frequency. A functional overview of breed‐specific ROH islands was obtained via over‐representation analyses of GO biological processes. Mora Romagnola and Turopolje breeds had the largest proportions of genome covered with ROH (~1003 and ~955 Mb respectively), whereas Nero Siciliano and Sarda breeds had the lowest proportions (~207 and 24...

Quantitative genetics model as the unifying model for defining genomic relationship and inbreeding coefficient

PloS one, 2014

The traditional quantitative genetics model was used as the unifying approach to derive six existing and new definitions of genomic additive and dominance relationships. The theoretical differences of these definitions were in the assumptions of equal SNP effects (equivalent to across-SNP standardization), equal SNP variances (equivalent to within-SNP standardization), and expected or sample SNP additive and dominance variances. The six definitions of genomic additive and dominance relationships on average were consistent with the pedigree relationships, but had individual genomic specificity and large variations not observed from pedigree relationships. These large variations may allow finding least related genomes even within the same family for minimizing genomic relatedness among breeding individuals. The six definitions of genomic relationships generally had similar numerical results in genomic best linear unbiased predictions of additive effects (GBLUP) and similar genomic REM...
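The contrast between across-SNP and within-SNP standardization mentioned in this abstract can be sketched in Python. The abstract does not list the six formulas, so the code assumes two widely used additive forms (often attributed to VanRaden 2008 and Yang et al. 2010) with sample allele frequencies; dominance relationships are not covered. The names `allele_freqs` and `grm` are hypothetical.

```python
def allele_freqs(genos):
    """Sample frequency of the counted allele at each SNP (genotypes coded 0/1/2)."""
    n = len(genos)
    return [sum(ind[j] for ind in genos) / (2 * n) for j in range(len(genos[0]))]


def grm(genos, within_snp=False):
    """Additive genomic relationship matrix under two standardizations.

    within_snp=False: sum_j z_ij * z_kj / sum_j 2 p_j (1 - p_j)   (across-SNP)
    within_snp=True : mean_j of z_ij * z_kj / (2 p_j (1 - p_j))   (within-SNP)
    where z_ij = x_ij - 2 p_j. Monomorphic SNPs are skipped.
    """
    p = allele_freqs(genos)
    keep = [j for j, pj in enumerate(p) if 0.0 < pj < 1.0]
    var = {j: 2.0 * p[j] * (1.0 - p[j]) for j in keep}
    total_var = sum(var.values())
    n = len(genos)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(i, n):
            if within_snp:
                s = sum((genos[i][j] - 2 * p[j]) * (genos[k][j] - 2 * p[j]) / var[j]
                        for j in keep) / len(keep)
            else:
                s = sum((genos[i][j] - 2 * p[j]) * (genos[k][j] - 2 * p[j])
                        for j in keep) / total_var
            G[i][k] = G[k][i] = s
    return G


# Made-up genotypes for four individuals at four SNPs (illustration only).
demo = [[0, 1, 2, 0], [1, 1, 0, 0], [2, 0, 1, 1], [1, 2, 1, 0]]
for label, flag in (("across-SNP", False), ("within-SNP", True)):
    G = grm(demo, within_snp=flag)
    # Diagonal minus 1 is a common genomic inbreeding estimate.
    print(label, [round(G[i][i] - 1.0, 3) for i in range(len(demo))])
```

When all SNPs have the same allele frequency the two forms coincide; they diverge as frequencies differ, because the within-SNP form gives rare alleles more weight.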

Genomic structure of a crossbred Landrace pig population

PLOS ONE

Single nucleotide polymorphism (SNP) markers are used to study population structure and conservation genetics, permitting the assessment of linkage disequilibrium patterns and of relationships among individuals. To investigate the population genomic structure of 300 females and 25 males from a commercial maternal pig line, we analyzed the extent of linkage disequilibrium, inbreeding coefficients using genomic and conventional pedigree data, and population stratification. The average linkage disequilibrium (r²) was 0.291 ± 0.312 for all adjacent SNPs less than 100 kb (kilobases) apart. The average inbreeding coefficients obtained from runs of homozygosity (ROH) and pedigree analyses were 0.119 and 0.0001, respectively. Low correlation was observed between the two inbreeding coefficients, possibly because the ROH estimates account for the effects of genetic recombination or because of pedigree identification errors. A large number of long ROHs might indicate recent inbreeding events in the studied population. A total of 36 homozygous segments were found in more than 30% of the population, and these ROHs harbor genes associated with reproductive traits. The population stratification analysis indicated that this population possibly originated from two distinct populations, a result of crossings between the eastern and western breeds used in the formation of the line. Our findings help to understand the genetic structure of swine populations and may assist breeding companies in avoiding high levels of inbreeding in order to maintain genetic diversity, showing the effectiveness of using genome-wide SNP information for quantifying inbreeding when the pedigree is incomplete or incorrect.
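An ROH-based inbreeding coefficient of the kind reported here is commonly computed as the total length of an animal's ROH segments divided by the length of the autosomal genome covered by markers. A minimal Python sketch follows, assuming the ROH segments have already been called by some external tool; the function name `f_roh`, the segment coordinates and the genome length are made up for illustration.

```python
def f_roh(roh_segments_bp, autosome_length_bp):
    """F_ROH = summed ROH length / autosomal genome length (both in base pairs)."""
    if autosome_length_bp <= 0:
        raise ValueError("autosome length must be positive")
    total_roh = sum(end - start for start, end in roh_segments_bp)
    return total_roh / autosome_length_bp


# Hypothetical animal: two ROH segments of 5 Mb and 7 Mb on a 100 Mb genome.
segments = [(1_000_000, 6_000_000), (20_000_000, 27_000_000)]
print(f_roh(segments, 100_000_000))
```

Because the numerator counts only long homozygous stretches, the value depends on the minimum ROH length and the marker density used when calling segments, which is one reason ROH and pedigree estimates can differ.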

Genome-Wide Estimates of Coancestry and Inbreeding in a Closed Herd of Ancient Iberian Pigs

PLoS ONE, 2013

Maintaining genetic variation and controlling the increase in inbreeding are crucial requirements in animal conservation programs. The most widely accepted strategy for achieving these objectives is to maximize the effective population size by minimizing the global coancestry obtained from a particular pedigree. However, for most natural or captive populations genealogical information is absent. In this situation, microsatellites have traditionally been the markers of choice to characterize genetic variation, and several estimators of genealogical coefficients have been developed using marker data, with unsatisfactory results. The development of high-throughput genotyping techniques calls for a review of the paradigm that genealogical coancestry is the best parameter for measuring genetic diversity. In this study, the Illumina PorcineSNP60 BeadChip was used to obtain genome-wide estimates of rates of coancestry and inbreeding and effective population size for an ancient strain of Iberian pigs that is now in serious danger of extinction and for which very accurate genealogical information is available (the Guadyerbas strain). Genome-wide estimates were compared with those obtained from microsatellite and from pedigree data. Estimates of coancestry and inbreeding computed from the SNP chip were strongly correlated with genealogical estimates and these correlations were substantially higher than those between microsatellite and genealogical coefficients. Also, molecular coancestry computed from SNP information was a better predictor of genealogical coancestry than coancestry computed from microsatellites. Rates of change in coancestry and inbreeding and effective population size estimated from molecular data were very similar to those estimated from genealogical data. However, estimates of effective population size obtained from changes in coancestry or inbreeding differed. Our results indicate that genome-wide information represents a useful alternative to genealogical information for measuring and maintaining genetic diversity.

Estimation of the Inbreeding Coefficient through Use of Genomic Data

The American Journal of Human Genetics, 2003

Many linkage studies are performed in inbred populations, either small isolated populations or large populations with a long tradition of marriages between relatives. In such populations, there exist very complex genealogies with unknown loops. Therefore, the true inbreeding coefficient of an individual is often unknown. Good estimators of the inbreeding coefficient (f) are important, since it has been shown that underestimation of f may lead to false linkage conclusions. When an individual is genotyped for markers spanning the whole genome, it should be possible to use this genomic information to estimate that individual's f. To do so, we propose a maximum-likelihood method that takes marker dependencies into account through a hidden Markov model. This methodology also allows us to infer the full probability distribution of the identity-by-descent (IBD) status of the two alleles of an individual at each marker along the genome (posterior IBD probabilities) and provides a variance for the estimates. We simulate a full genome scan mimicking the true autosomal genome for (1) a first-cousin pedigree and (2) a quadruple-second-cousin pedigree. In both cases, we find that our method accurately estimates f for different marker maps. We also find that the proportion of genome IBD in an individual with a given genealogy is very variable. The approach is illustrated with data from a study of demyelinating autosomal recessive Charcot-Marie-Tooth disease.

A Comparison of Approaches to Estimate the Inbreeding Coefficient and Pairwise Relatedness Using Genomic and Pedigree Data in a Sheep Population

PLoS ONE, 2011

Genome-wide SNP data provide a powerful tool to estimate pairwise relatedness among individuals and individual inbreeding coefficient. The aim of this study was to compare methods for estimating the two parameters in a Finnsheep population based on genome-wide SNPs and genealogies, separately. This study included ninety-nine Finnsheep in Finland that differed in coat colours (white, black, brown, grey, and black/white spotted) and were from a large pedigree comprising 319 119 animals. All the individuals were genotyped with the Illumina Ovine SNP50K BeadChip by the International Sheep Genomics Consortium. We identified three genetic subpopulations that corresponded approximately with the coat colours (grey, white, and black and brown) of the sheep. We detected a significant subdivision among the colour types (F_ST = 5.4%, P < 0.05). We applied robust algorithms for the genomic estimation of individual inbreeding (F_SNP) and pairwise relatedness (W_SNP) as implemented in the programs KING and PLINK, respectively. Estimates of the two parameters from pedigrees (F_PED and W_PED) were computed using the RelaX2 program. Values of the two parameters estimated from genomic and genealogical data were mostly consistent, in particular for the highly inbred animals (e.g. inbreeding coefficient F > 0.0625) and pairs of closely related animals (e.g. the full- or half-sibs). Nevertheless, we also detected differences in the two parameters between the approaches, particularly with respect to the grey Finnsheep. This could be due to the smaller sample size and relative incompleteness of the pedigree for them. We conclude that the genome-wide genomic data will provide useful information on a per sample or pairwise-samples basis in cases of complex genealogies or in the absence of genealogical data. Citation: Li M-H, Strandén I, Tiirikka T, Sevón-Aimonen M-L, Kantanen J (2011) A Comparison of Approaches to Estimate the Inbreeding Coefficient and Pairwise

Estimation of inbreeding using pedigree, 50k SNP chip genotypes and full sequence data in three cattle breeds

BMC Genetics, 2015

Background: Levels of inbreeding in cattle populations have increased in the past due to the use of a limited number of bulls for artificial insemination. High levels of inbreeding lead to reduced genetic diversity and inbreeding depression. Various estimators based on different sources, e.g., pedigree or genomic data, have been used to estimate inbreeding coefficients in cattle populations. However, the comparative advantage of using full sequence data to assess inbreeding is unknown. We used pedigree and genomic data at different densities from 50k to full sequence variants to compare how different methods performed for the estimation of inbreeding levels in three different cattle breeds. Results: Five different estimates for inbreeding were calculated and compared in this study: pedigree-based inbreeding coefficient (F_PED); run of homozygosity (ROH)-based inbreeding coefficients (F_ROH); genomic relationship matrix (GRM)-based inbreeding coefficients (F_GRM); inbreeding coefficients based on excess of homozygosity (F_HOM) and correlation of uniting gametes (F_UNI). Estimates using ROH provided direct estimates of the levels of autozygosity in the current populations and are free of the effects of allele frequencies and incomplete pedigrees, which may increase inaccuracy in the estimation of inbreeding. The highest correlations were observed between F_ROH estimated from the full sequence variants and the F_ROH estimated from 50k SNP (single nucleotide polymorphism) genotypes. The estimator based on the correlation between uniting gametes (F_UNI) using full genome sequences was also strongly correlated with F_ROH detected from sequence data. Conclusions: Estimates based on ROH directly reflected levels of homozygosity and were not influenced by allele frequencies, unlike the three other estimates evaluated (F_GRM, F_HOM and F_UNI), which depended on estimated allele frequencies. F_PED suffered from limited pedigree depth. Marker density affects ROH estimation. Detecting ROH based on 50k chip data was observed to give estimates similar to ROH from sequence data. In the absence of full sequence data, ROH based on 50k can be used to assess homozygosity levels in individuals. However, genotypes denser than 50k are required to accurately detect short ROH that are most likely identical by descent (IBD).
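The dependence of F_GRM, F_HOM and F_UNI on allele frequencies can be seen directly in their formulas. The Python sketch below uses per-SNP forms that are widely used in the literature; whether the study used exactly these variants is an assumption, since the abstract does not give them. The function name `inbreeding_estimates` and the input values are hypothetical.

```python
def inbreeding_estimates(genotype, freqs):
    """Three frequency-dependent inbreeding estimates for one individual.

    genotype: counts (0/1/2) of a reference allele at each SNP
    freqs:    frequency p of that same allele at each SNP
    F_HOM = (observed hom. - expected hom.) / (n SNPs - expected hom.)
    F_GRM = mean of (x - 2p)^2 / (2p(1-p)) - 1
    F_UNI = mean of (x^2 - (1 + 2p) x + 2p^2) / (2p(1-p))
    Monomorphic SNPs (p = 0 or 1) are skipped.
    """
    m = 0
    obs_hom = 0
    exp_hom = 0.0
    grm_sum = 0.0
    uni_sum = 0.0
    for x, p in zip(genotype, freqs):
        h = 2.0 * p * (1.0 - p)  # expected heterozygosity at this SNP
        if h == 0.0:
            continue
        m += 1
        obs_hom += 1 if x != 1 else 0
        exp_hom += 1.0 - h
        grm_sum += (x - 2.0 * p) ** 2 / h - 1.0
        uni_sum += (x * x - (1.0 + 2.0 * p) * x + 2.0 * p * p) / h
    return {
        "F_HOM": (obs_hom - exp_hom) / (m - exp_hom),
        "F_GRM": grm_sum / m,
        "F_UNI": uni_sum / m,
    }


# Made-up example: same genotypes evaluated under two sets of assumed frequencies.
genotype = [0, 2, 1, 2, 0, 1]
print(inbreeding_estimates(genotype, [0.5] * 6))
print(inbreeding_estimates(genotype, [0.1, 0.9, 0.5, 0.8, 0.2, 0.5]))
```

Every term divides by 2p(1-p), so changing the assumed frequencies changes all three estimates for the same genotypes, whereas an ROH-based estimate uses only segment lengths.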