Predicted genetic burden and frequency of phenotype-associated variants in the horse
S A Durward-Akhurst et al. Sci Rep. 2024.
Abstract
Disease-causing variants have been identified for fewer than 20% of suspected equine genetic diseases. Whole genome sequencing (WGS) allows rapid identification of rare disease causal variants. However, interpreting the clinical consequence of a variant is confounded by the number of predicted deleterious variants that healthy individuals carry (the predicted genetic burden). The predicted genetic burden and the baseline frequencies of known deleterious or phenotype-associated variants within and across the major horse breeds have not previously been estimated. We used WGS of 605 horses across 48 breeds to identify 32,818,945 variants, demonstrate a high predicted genetic burden (median 730 variants/horse, interquartile range: 613-829), show breed differences in predicted genetic burden across 12 target breeds, and estimate the high frequencies of some previously reported disease variants. This large-scale variant catalog for a major and highly athletic domestic animal species will enhance the horse's ability to serve as a model for human phenotypes and improve our ability to discover the bases of important equine phenotypes.
© 2024. The Author(s).
Conflict of interest statement
S.A. Durward-Akhurst, J.L. Marlowe, R.J. Schaefer, K. Springer, M.E. McCue and J.R. Mickelson declare no competing interests. B. Grantham and W.K. Carey own IntervalBio LLC, the computational company that was paid to map and perform the variant calling on the original 534 horses. R.R. Bellone is affiliated with the UC Davis Veterinary Genetics Laboratory, which provides genetic diagnostic tests in horses and other species.
Figures
Figure 1
Relationship between the number of variants identified and the depth of coverage. (a) The correlation between the number of variants identified in the 605 horses and the WGS depth of coverage (DOC). The blue line represents the non-linear correlation between the number of variants identified and DOC, with grey shading representing the 95% confidence interval around the mean. The breed Estimated Marginal Mean (EMMEAN) is represented by a colored line. Breeds comprised: Arabian (Arab), Belgian (Belg), Clydesdale (Clyd), Franches-Montagnes (FM), Icelandic (Ice), Morgan (Morg), Other breeds (Oth), Quarter Horse (QH), Shetland (Shet), Standardbred (STB), Thoroughbred (TB), Warmblood (WB), and Welsh Pony (WP). (b) EMMEAN for the linear regression between DOC and breed. (c) EMMEAN for the linear regression between the number of variants and breed, accounting for DOC. For (b) and (c), the vertical colored lines represent the EMMEAN and the purple horizontal bands represent the 95% confidence limits around the EMMEAN.
Figure 2
Estimated Marginal Means (EMMEANS) of the predicted genetic burden, the LOF burden, and the homozygous predicted genetic burden and LOF burden. EMMEANs (black circle) and 95% confidence interval (purple shaded line) for all predicted genetic burden variants (a), homozygous predicted genetic burden variants (b), all LOF variants (c), and homozygous LOF variants (d) in the 12 target breeds and other horses.
Figure 3
Pearson’s correlation coefficient estimates between the predicted genetic burden and LOF variants and estimates of breed effective population size (Ne). The round points represent the Pearson’s correlation coefficient estimates, with the 95% confidence intervals represented by the error bars. Orange represents Ne estimates based on 54K array data (JP) and teal represents Ne estimates based on 2 million array data (SB). The variant types are the predicted genetic burden (GB), homozygous predicted genetic burden (hom_GB), homozygous LOF (hom_LOF), and loss of function (LOF) variants. An asterisk (*) indicates a significant correlation between the variant type and the estimated Ne.
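As an illustration of the kind of estimate plotted in Figure 3, the following is a minimal Python sketch of a Pearson correlation coefficient with an approximate 95% confidence interval obtained from the Fisher z-transformation. The function name `pearson_with_ci` and the toy input values are hypothetical and are not taken from the study; the caption does not state how the authors computed their intervals, so the Fisher approach is an assumption.

```python
import math


def pearson_with_ci(x, y, z_crit=1.959964):
    """Return (r, lower, upper) for paired samples x and y.

    The interval uses the Fisher z-transformation, which is an
    assumption here; the source does not state the method used.
    z_crit is the standard normal quantile for a two-sided 95% interval.
    """
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 pairs")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")

    r = sxy / math.sqrt(sxx * syy)
    r = max(-1.0, min(1.0, r))  # guard against floating-point overshoot
    if abs(r) == 1.0:
        return r, r, r  # atanh is undefined at +/-1; interval collapses

    z = math.atanh(r)              # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)    # approximate standard error of z
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return r, lower, upper


# Hypothetical toy values, one pair per breed (NOT data from the paper).
toy_ne = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_burden = [2.0, 1.0, 4.0, 3.0, 5.0]
r, lower, upper = pearson_with_ci(toy_ne, toy_burden)
# One common convention treats the correlation as significant
# when the interval excludes zero.
significant = lower > 0 or upper < 0
```

With only a handful of breeds the interval is wide, so a sizeable point estimate can still include zero.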
Figure 4
Known variants identified in the 605-horse population. Genotype counts, with shading indicating whether each variant is present in the heterozygous (regular shading) or homozygous (diagonal striped shading) state, for known disease and non-coat color trait causing variants (a), coat color associated and causative variants (b), disease associated variants (c), and non-disease and non-coat color trait associated variants (d) for each of the 12 target breeds and the other breed group. The phenotype abbreviations are detailed in Supplementary Table 6.
Grants and funding
- 2017-67015-26296/USDA NIFA-AFRI
- 5T320D010993-12/Office of Research Infrastructure Programs, National Institutes of Health
- D20EQ-403/Morris Animal Foundation