Recombination rate estimation in the presence of hotspots

Adam Auton et al. Genome Res. 2007 Aug.

Abstract

Fine-scale estimation of recombination rates remains a challenging problem. Experimental techniques can provide accurate estimates at fine scales but are technically challenging and cannot be applied on a genome-wide scale. An alternative source of information comes from patterns of genetic variation. Several statistical methods have been developed to estimate recombination rates from randomly sampled chromosomes. However, most such methods either make poor assumptions about recombination rate variation, or simply assume that there is no rate variation. Since the discovery of recombination hotspots, it is clear that recombination rates can vary over many orders of magnitude at the fine scale. We present a method for the estimation of recombination rates in the presence of recombination hotspots. We demonstrate that the method is able to detect and accurately quantify recombination rate heterogeneity, and is a substantial improvement over a commonly used method. We then use the method to reanalyze genetic variation data from the HLA and MS32 regions of the human genome and demonstrate that the method is able to provide accurate rate estimates and simultaneously detect hotspots.

Figures

Figure 1.

Illustration of the priors of LDhat (A) and rhomap (B). Shown here are individual realizations of the priors for a 200-kb region. Note the difference in the _Y_-axis scales.

Figure 2.

Deviation of the estimated total ρ from the simulated value. Rate estimates from the constant rate simulations (Simulation Study A) using LDhat and rhomap are shown in A and B, respectively. Rate estimates from the variable rate simulations (Simulation Study B) using LDhat and rhomap are shown in C and D, respectively.

Figure 3.

Average recombination rate estimates from 100 simulated data sets. (A) Results from Simulation Study A with a constant recombination rate. (B) Results from Simulation Study C with an active central hotspot. Rate estimates from LDhat and rhomap are shown as thick red and blue lines, respectively. The simulated recombination profile is shown in black. The 2.5th and 97.5th percentiles of the estimated rates are shown in faded colors. Note that, for clarity, the constant rate simulation estimates are shown on a linear scale, whereas the hotspot simulation estimates are shown on a logarithmic scale.

Figure 4.

Results from Simulation Study B. Scatter plot of simulated rate versus estimated rate for LDhat (A) and rhomap (B). Each point represents an estimate of recombination rate between two adjacent SNPs. A 250-point moving average is also shown.

Figure 5.

Results from Simulation Study B. (A) Correlation coefficient between the log10 estimated rate and the log10 simulated rate for 100 data sets, as measured over SNP intervals. The correlation coefficients obtained using rate estimates from LDhat are shown on the vertical axis, and the coefficients obtained using rhomap are shown on the horizontal axis. (B) Using rhomap as a hotspot detection tool in the variable rate simulation study. This plot shows the power of rhomap to detect recombination hotspots (thick line) and the false-discovery rate (thin line). A hotspot was called if the average number of hotspots per sample per kb at a local maximum was above the threshold shown on the horizontal axis. A called hotspot was considered correctly detected if it was within 1.5 kb of the location of a simulated hotspot; otherwise, it was considered a false positive.
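The calling rule in panel B can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' code: the function names `call_hotspots` and `score_calls`, the input layout (one density value per position), the strict `>` comparison against the threshold, and the inclusive 1.5-kb tolerance are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def call_hotspots(positions_kb: Sequence[float],
                  density: Sequence[float],
                  threshold: float) -> List[float]:
    """Return positions of local maxima of `density` that exceed `threshold`.

    `density` stands in for the average number of hotspots per sample per kb
    (hypothetical input layout: one value per entry of `positions_kb`).
    """
    calls = []
    for i in range(1, len(density) - 1):
        # A local maximum rises from the left and does not rise to the right.
        is_peak = density[i] > density[i - 1] and density[i] >= density[i + 1]
        if is_peak and density[i] > threshold:
            calls.append(positions_kb[i])
    return calls


def score_calls(calls_kb: Sequence[float],
                true_kb: Sequence[float],
                tol_kb: float = 1.5) -> Tuple[float, float]:
    """Return (power, false-discovery rate) for a set of hotspot calls.

    A call counts as correct if it lies within `tol_kb` of a simulated hotspot
    (inclusive bound is an assumption); every other call is a false positive.
    """
    correct_calls = [c for c in calls_kb
                     if any(abs(c - t) <= tol_kb for t in true_kb)]
    detected = [t for t in true_kb
                if any(abs(c - t) <= tol_kb for c in calls_kb)]
    power = len(detected) / len(true_kb) if true_kb else 0.0
    fdr = (len(calls_kb) - len(correct_calls)) / len(calls_kb) if calls_kb else 0.0
    return power, fdr


if __name__ == "__main__":
    positions = [float(p) for p in range(11)]          # toy grid, in kb
    density = [0.0, 0.1, 0.6, 0.1, 0.0, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0]
    simulated = [3.0, 20.0]                            # toy hotspot locations
    for thr in (0.2, 0.5):
        calls = call_hotspots(positions, density, thr)
        print(thr, calls, score_calls(calls, simulated))
```

Sweeping the threshold and plotting the two returned values against it gives curves of the kind described for panel B; raising the threshold trades power for a lower false-discovery rate.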

Figure 6.

Output of rhomap for the HLA and MS32 regions. Plots A and C show the recombination rate estimates of the HLA and MS32 regions respectively, with the estimated rate in blue, and (sex-averaged) sperm typing rate in red. SNP locations are shown as red marks. Estimates from rhomap were converted to cM/Mb by assuming _N_e = 10,000. Also shown in plot C is the detail of the NID2a/b and MSTM1a/b estimates. Plots B and D show the average number of hotspots per sample per kb for the same regions.
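The unit conversion mentioned in this caption can be made concrete with a short sketch. It assumes the standard diploid definition of the population-scaled rate, ρ = 4 N_e r, and that the input is expressed as ρ per kb; neither detail is stated in the caption, and the function name `rho_per_kb_to_cm_per_mb` is hypothetical.

```python
def rho_per_kb_to_cm_per_mb(rho_per_kb: float, n_e: float = 10_000) -> float:
    """Convert a population-scaled rate (rho per kb) to cM/Mb.

    Assumes rho = 4 * N_e * r (diploid population), where r is the
    per-generation recombination rate in Morgans over the same interval.
    """
    morgans_per_kb = rho_per_kb / (4.0 * n_e)
    # 100 cM per Morgan, 1000 kb per Mb.
    return morgans_per_kb * 100.0 * 1000.0


if __name__ == "__main__":
    # With N_e = 10,000, one unit of rho per kb is roughly 2.5 cM/Mb.
    print(rho_per_kb_to_cm_per_mb(1.0))
```

Because the conversion is linear in 1/N_e, a different assumed effective population size rescales every estimate by the same factor and leaves the shape of the rate profile unchanged.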
