Recombination rate estimation in the presence of hotspots
Adam Auton et al. Genome Res. 2007 Aug.
Abstract
Fine-scale estimation of recombination rates remains a challenging problem. Experimental techniques can provide accurate estimates at fine scales but are technically challenging and cannot be applied on a genome-wide scale. An alternative source of information comes from patterns of genetic variation. Several statistical methods have been developed to estimate recombination rates from randomly sampled chromosomes. However, most such methods either make poor assumptions about recombination rate variation, or simply assume that there is no rate variation. Since the discovery of recombination hotspots, it is clear that recombination rates can vary over many orders of magnitude at the fine scale. We present a method for the estimation of recombination rates in the presence of recombination hotspots. We demonstrate that the method is able to detect and accurately quantify recombination rate heterogeneity, and is a substantial improvement over a commonly used method. We then use the method to reanalyze genetic variation data from the HLA and MS32 regions of the human genome and demonstrate that the method is able to provide accurate rate estimates and simultaneously detect hotspots.
Figures
Figure 1.
Illustration of the priors of LDhat (A) and rhomap (B). Shown here are individual realizations of the priors for a 200-kb region. Note the difference in the _Y_-axis scales.
Figure 2.
Deviation of the estimated total ρ from the simulated value. Rate estimates from the constant rate simulations (Simulation Study A) using LDhat and rhomap are shown in A and B, respectively. Rate estimates from the variable rate simulations (Simulation Study B) using LDhat and rhomap are shown in C and D, respectively.
Figure 3.
Average recombination rate estimates from 100 simulated data sets. (A) Results from Simulation Study A with a constant recombination rate. (B) Results from Simulation Study C with an active central hotspot. Rate estimates from LDhat and rhomap are shown as thick red and blue lines, respectively. The simulated recombination profile is shown in black. The 2.5th and 97.5th percentiles of the estimated rates are shown in faded colors. Note that, for clarity, the constant rate simulation estimates are shown on a linear scale, whereas the hotspot simulation estimates are shown on a logarithmic scale.
Figure 4.
Results from Simulation Study B. Scatter plot of simulated rate versus estimated rate for LDhat (A) and rhomap (B). Each point represents an estimate of recombination rate between two adjacent SNPs. A 250-point moving average is also shown.
Figure 5.
Results from Simulation Study B. (A) Correlation coefficient between the log10 estimated rate and the log10 simulated rate for 100 data sets, as measured over SNP intervals. The correlation coefficients obtained using rate estimates from LDhat are shown on the vertical axis, and the coefficients obtained using rhomap are shown on the horizontal axis. (B) Using rhomap as a hotspot detection tool in the variable rate simulation study. This plot shows the power of rhomap to detect recombination hotspots (thick line) and the false-discovery rate (thin line). A hotspot was called wherever the average number of hotspots per sample per kb reached a local maximum above the threshold shown on the horizontal axis. A called hotspot was counted as correctly detected if it lay within 1.5 kb of a simulated hotspot, and as a false positive otherwise.
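The calling rule described in the Figure 5B caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the rhomap implementation: the function names `call_hotspots` and `power_and_fdr`, the treatment of "within 1.5 kb" as an inclusive bound, the skipping of the two end points when looking for local maxima, and the toy input values are all assumptions made here.

```python
def call_hotspots(positions_kb, density, threshold):
    """Return positions where `density` has a local maximum above `threshold`.

    `density` stands in for the average number of hotspots per sample per kb
    (hypothetical input; the first and last points are never called).
    """
    calls = []
    for i in range(1, len(density) - 1):
        is_peak = density[i] > density[i - 1] and density[i] >= density[i + 1]
        if is_peak and density[i] > threshold:
            calls.append(positions_kb[i])
    return calls


def power_and_fdr(calls_kb, true_hotspots_kb, tolerance_kb=1.5):
    """Power: fraction of simulated hotspots with a call within the tolerance.
    False-discovery rate: fraction of calls with no simulated hotspot nearby."""
    detected = sum(
        any(abs(c - t) <= tolerance_kb for c in calls_kb)
        for t in true_hotspots_kb
    )
    false_calls = sum(
        all(abs(c - t) > tolerance_kb for t in true_hotspots_kb)
        for c in calls_kb
    )
    power = detected / len(true_hotspots_kb) if true_hotspots_kb else float("nan")
    fdr = false_calls / len(calls_kb) if calls_kb else 0.0
    return power, fdr


# Toy example with made-up numbers (not data from the paper).
positions = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
density = [0.0, 0.1, 0.5, 0.1, 0.0, 0.0, 0.05, 0.3, 0.05, 0.0, 0.0]
calls = call_hotspots(positions, density, threshold=0.2)
power, fdr = power_and_fdr(calls, true_hotspots_kb=[2.5, 9.0])
print(calls, power, fdr)
```

Sweeping `threshold` over a range of values and recording `power` and `fdr` at each step would trace out curves of the kind the caption describes.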
Figure 6.
Output of rhomap for the HLA and MS32 regions. Plots A and C show the recombination rate estimates of the HLA and MS32 regions respectively, with the estimated rate in blue, and (sex-averaged) sperm typing rate in red. SNP locations are shown as red marks. Estimates from rhomap were converted to cM/Mb by assuming _N_e = 10,000. Also shown in plot C is the detail of the NID2a/b and MSTM1a/b estimates. Plots B and D show the average number of hotspots per sample per kb for the same regions.
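The unit conversion mentioned in the Figure 6 caption can be illustrated as follows. This sketch assumes the standard diploid relation ρ = 4·N_e·r and assumes that the population-scaled rate is expressed per kb; neither detail is stated in the caption, and the function name `rho_per_kb_to_cm_per_mb` is hypothetical.

```python
def rho_per_kb_to_cm_per_mb(rho_per_kb, n_e=10_000):
    """Convert a population-scaled rate (rho per kb) to cM/Mb.

    Assumes rho = 4 * N_e * r, with r in Morgans per kb per generation.
    """
    r_morgans_per_kb = rho_per_kb / (4 * n_e)
    # 100 cM per Morgan and 1000 kb per Mb.
    return r_morgans_per_kb * 100 * 1000


# Under these assumptions, each unit of rho per kb corresponds to 2.5 cM/Mb.
print(rho_per_kb_to_cm_per_mb(0.4))
```

Under these assumptions, a different choice of N_e rescales every estimate by the same factor, so the shape of the estimated profile does not depend on it.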
Similar articles
- Comparison of fine-scale recombination rates in humans and chimpanzees.
Winckler W, Myers SR, Richter DJ, Onofrio RC, McDonald GJ, Bontrop RE, McVean GA, Gabriel SB, Reich D, Donnelly P, Altshuler D. Science. 2005 Apr 1;308(5718):107-11. doi: 10.1126/science.1105322. Epub 2005 Feb 10. PMID: 15705809
- Application of coalescent methods to reveal fine-scale rate variation and recombination hotspots.
Fearnhead P, Harding RM, Schneider JA, Myers S, Donnelly P. Genetics. 2004 Aug;167(4):2067-81. doi: 10.1534/genetics.103.021584. PMID: 15342541 Free PMC article.
- Evidence for substantial fine-scale variation in recombination rates across the human genome.
Crawford DC, Bhangale T, Li N, Hellenthal G, Rieder MJ, Nickerson DA, Stephens M. Nat Genet. 2004 Jul;36(7):700-6. doi: 10.1038/ng1376. Epub 2004 Jun 6. PMID: 15184900
- Estimating recombination rates from population-genetic data.
Stumpf MP, McVean GA. Nat Rev Genet. 2003 Dec;4(12):959-68. doi: 10.1038/nrg1227. PMID: 14631356 Review.
- Insights into recombination from population genetic variation.
Hellenthal G, Stephens M. Curr Opin Genet Dev. 2006 Dec;16(6):565-72. doi: 10.1016/j.gde.2006.10.001. Epub 2006 Oct 16. PMID: 17049225 Review.
Cited by
- Tree sequences as a general-purpose tool for population genetic inference.
Whitehouse LS, Ray D, Schrider DR. bioRxiv [Preprint]. 2024 Oct 5:2024.02.20.581288. doi: 10.1101/2024.02.20.581288. PMID: 39185244 Free PMC article. Updated. Preprint.
- The Unreasonable Effectiveness of Convolutional Neural Networks in Population Genetic Inference.
Flagel L, Brandvain Y, Schrider DR. Mol Biol Evol. 2019 Feb 1;36(2):220-238. doi: 10.1093/molbev/msy224. PMID: 30517664 Free PMC article.
- Coalescence and Linkage Disequilibrium in Facultatively Sexual Diploids.
Hartfield M, Wright SI, Agrawal AF. Genetics. 2018 Oct;210(2):683-701. doi: 10.1534/genetics.118.301244. Epub 2018 Aug 10. PMID: 30097538 Free PMC article.
- Fine human genetic map based on UK10K data set.
Hao Z, Du P, Pan YH, Li H. Hum Genet. 2022 Feb;141(2):273-281. doi: 10.1007/s00439-021-02415-8. Epub 2022 Jan 20. PMID: 35048190
- The recombination landscapes of spiny lizards (genus Sceloporus).
Versoza CJ, Rivera JA, Rosenblum EB, Vital-García C, Hews DK, Pfeifer SP. G3 (Bethesda). 2022 Feb 4;12(2):jkab402. doi: 10.1093/g3journal/jkab402. PMID: 34878100 Free PMC article.