Multiplexed shotgun genotyping for rapid and efficient genetic mapping

Peter Andolfatto et al. Genome Res. 2011 Apr.

Abstract

We present a new approach to genotyping based on multiplexed shotgun sequencing that can identify recombination breakpoints in a large number of individuals simultaneously at a resolution sufficient for most mapping purposes, such as quantitative trait locus (QTL) mapping and mapping of induced mutations. We first describe a simple library construction protocol that uses just 10 ng of genomic DNA per individual and makes the approach accessible to any laboratory with standard molecular biology equipment. Sequencing this library results in a large number of sequence reads widely distributed across the genomes of multiplexed bar-coded individuals. We develop a Hidden Markov Model to estimate ancestry at all genomic locations in all individuals using these data. We demonstrate the utility of the approach by mapping a dominant marker allele in D. simulans to within 105 kb of its true position using 96 F1-backcross individuals genotyped in a single lane on an Illumina Genome Analyzer. We further demonstrate the utility of our method by genetically mapping more than 400 previously unassembled D. simulans contigs to linkage groups and by evaluating the quality of targeted introgression lines. At this level of multiplexing and divergence between strains, our method allows estimation of recombination breakpoints to a median of 38-kb intervals. Our analysis suggests that higher levels of multiplexing and/or use of strains with lower levels of divergence are practicable.
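To make the ancestry-estimation step concrete, here is a minimal sketch of a two-state Hidden Markov Model for one chromosome of a backcross individual, solved with the forward-backward algorithm. It is not the authors' implementation: the function name `posterior_het`, the two-state design (homozygous for the recurrent parent versus heterozygous), the per-site switch probability, and the read error rate are all assumptions chosen for illustration. Each observation is a single read at an informative site, coded `A` if it carries the recurrent parent's allele and `B` otherwise.

```python
def posterior_het(reads, switch=0.01, err=0.01):
    """Posterior probability of heterozygous ancestry at each informative site.

    reads  : sequence of 'A' (recurrent-parent allele) or 'B' (other allele)
    switch : assumed probability of an ancestry change between adjacent sites
    err    : assumed probability that a read reports the wrong allele
    """
    states = ("hom", "het")
    # A heterozygous region yields either allele with equal probability;
    # a homozygous region yields 'B' only through error.
    emit = {
        "hom": {"A": 1.0 - err, "B": err},
        "het": {"A": 0.5, "B": 0.5},
    }

    def trans(s, t):
        return 1.0 - switch if s == t else switch

    def normalize(d):
        total = sum(d.values())
        return {k: v / total for k, v in d.items()}

    if not reads:
        return []

    # Forward pass, rescaled at every site to avoid numerical underflow.
    fwd = [normalize({s: 0.5 * emit[s][reads[0]] for s in states})]
    for obs in reads[1:]:
        prev = fwd[-1]
        fwd.append(normalize({
            t: emit[t][obs] * sum(prev[s] * trans(s, t) for s in states)
            for t in states
        }))

    # Backward pass, built from the last site towards the first.
    bwd = [{s: 1.0 for s in states}]
    for obs in reversed(reads[1:]):
        nxt = bwd[0]
        bwd.insert(0, normalize({
            s: sum(trans(s, t) * emit[t][obs] * nxt[t] for t in states)
            for s in states
        }))

    # Combine both passes into a per-site posterior.
    return [normalize({s: f[s] * b[s] for s in states})["het"]
            for f, b in zip(fwd, bwd)]


# Toy chromosome: 30 sites of pure 'A' reads, then 30 sites alternating A/B.
toy_posterior = posterior_het("A" * 30 + "AB" * 15)
```

In this toy input the posterior should stay near 0 across the first block and rise towards 1 in the second, with the crossover located somewhere around the boundary. A real analysis would also need physical distances between sites and more than one read per site, which this sketch ignores.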

Figures

Figure 1.

The experimental and bioinformatic pipeline for MSG. (1) Genomic DNA is fragmented with a restriction enzyme (RE) that leaves “sticky ends.” (2) Individual bar-coded adaptors are ligated to these restriction fragments. (3) Samples are pooled, and (4) the ligation products are size selected, PCR-amplified, and (5) sequenced on an Illumina Genome Analyzer. (6) Reads from the sequencing run are parsed by barcode. (7) Each read is mapped to each of two parental genomes (indicated as red and blue, respectively). (8) Ancestry of chromosomal segments (blue: homozygous for parent 1; red: homozygous for parent 2; no color: heterozygous) is estimated using a Hidden Markov Model (HMM). (9) Genotypes and recombination breakpoints are used in downstream analyses, such as QTL mapping.
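Step 6 of the pipeline, parsing reads by barcode, can be sketched as follows. This is an illustration only: the function name `demultiplex`, the assumption that the barcode occupies the first bases of each read, the exact-match rule, and the sample names are all hypothetical and are not taken from the MSG software.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Assign reads to samples by an exact match on a leading barcode.

    reads             : iterable of read sequences (strings)
    barcode_to_sample : dict mapping barcode sequence -> sample name
    Returns (bins, unassigned): bins maps sample name -> list of reads
    with the barcode trimmed off; unassigned holds reads with no match.
    """
    lengths = {len(b) for b in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes equal-length barcodes")
    k = lengths.pop()

    bins = defaultdict(list)
    unassigned = []
    for read in reads:
        sample = barcode_to_sample.get(read[:k])
        if sample is None:
            unassigned.append(read)          # unknown or mis-read barcode
        else:
            bins[sample].append(read[k:])    # keep only the genomic part
    return dict(bins), unassigned


# Hypothetical barcodes and reads, for illustration only.
bins, unassigned = demultiplex(
    ["ACGTGGGG", "TGCAAAAA", "NNNNCCCC", "ACGTTTTT"],
    {"ACGT": "fly01", "TGCA": "fly02"},
)
```

Exact matching keeps the example short. A production parser would typically also tolerate sequencing errors in the barcode, which this sketch does not attempt.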

Figure 2.

Genome-wide ancestry assignment for a representative individual. (A) The ancestry states are shown for all major chromosome arms for a representative (male) individual progeny from a (D. sechellia/D. simulans) F1 × D. sechellia backcross experiment. The posterior probability that a region is homozygous for D. simulans (red) or for D. sechellia (blue) ancestry is plotted along the y axis. A high probability of heterozygous ancestry is indicated as a solid black line across the center of the plot. (B–D) Closer examination of three breakpoints illustrates typical variation in breakpoint resolution. Gold shading represents the 95% confidence bounds on the position of crossovers and the coordinates for each of these bounds are shown at the bottom.

Figure 3.

Resolution of recombination breakpoints. (A) A histogram of 373 inferred recombination breakpoint intervals (coordinates where posterior probability of a given ancestry switches from ≥95% to ≤5%) for our backcross experiment. The red asterisks indicate the median breakpoint resolution in our experiment (96-plex, 2% divergence). (B,C) Box and whisker plots of the medians of 100 subsamples of the data to examine effects of (B) increased multiplexing and (C) decreased divergence between the strains being crossed.
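The interval definition in panel A (the coordinates between which the posterior probability of a given ancestry moves from ≥95% to ≤5%) can be expressed as a short routine. The function name `breakpoint_intervals` and the input layout are assumptions for illustration. The sketch also reports switches in the opposite direction (≤5% to ≥95%), which is an added assumption rather than something the caption states.

```python
from statistics import median


def breakpoint_intervals(positions, posterior, hi=0.95, lo=0.05):
    """Return (start, end) coordinate pairs bracketing each ancestry switch.

    positions : site coordinates in ascending order
    posterior : posterior probability of one ancestry state at each site
    A breakpoint interval runs from the last confident site on one side
    of a switch to the first confident site on the other side.
    """
    intervals = []
    last_state = None
    last_pos = None
    for pos, p in zip(positions, posterior):
        if p >= hi:
            state = "hi"
        elif p <= lo:
            state = "lo"
        else:
            continue  # ambiguous site: falls inside a candidate interval
        if last_state is not None and state != last_state:
            intervals.append((last_pos, pos))
        last_state, last_pos = state, pos
    return intervals


# Toy data: one switch, with two ambiguous sites in between.
toy = breakpoint_intervals(
    [100, 200, 300, 400, 500, 600],
    [0.99, 0.97, 0.60, 0.20, 0.03, 0.01],
)
toy_median_width = median(end - start for start, end in toy)
```

Breakpoint resolution as summarized in the figure would then correspond to the median of `end - start` across all intervals in all individuals.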

Figure 4.

QTL map of the location of a dominant marker segregating in the reported backcross experiment. The inset shows the LOD profile (above) and the estimated ancestries for all individuals between genomic locations 5.5 Mb and 8.5 Mb on the X chromosome (below). Individual ancestry estimates are sorted so that individuals without EYFP appear above and individuals with EYFP appear below. Regions with posterior probabilities close to 1 of homozygous D. simulans are coded blue, and homozygous D. sechellia are coded red. Posterior probabilities between 0 and 1 are coded with colors intermediate between blue and red.
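As a rough illustration of what a LOD score measures at a single genomic position, the sketch below compares two models for a binary trait such as presence or absence of a marker: one in which trait frequency differs by genotype class, and one in which it is the same for everyone. The caption does not say how the LOD profile was computed, so this contingency-table form, the function name `binary_lod`, and the hard genotype calls it takes as input are all assumptions.

```python
import math


def binary_lod(genotypes, phenotypes):
    """log10 likelihood ratio for genotype-trait association at one position.

    genotypes  : one genotype label per individual (e.g. 'hom' or 'het')
    phenotypes : one binary trait value per individual (e.g. 0 or 1)
    Compares genotype-specific trait frequencies (alternative) with a
    single shared trait frequency (null), both fitted by maximum likelihood.
    """
    counts = {}
    for g, y in zip(genotypes, phenotypes):
        counts[(g, y)] = counts.get((g, y), 0) + 1

    n = sum(counts.values())
    row = {}  # individuals per genotype class
    col = {}  # individuals per trait value
    for (g, y), c in counts.items():
        row[g] = row.get(g, 0) + c
        col[y] = col.get(y, 0) + c

    # Only observed cells are stored, so every logarithm is well defined.
    return sum(c * math.log10(c * n / (row[g] * col[y]))
               for (g, y), c in counts.items())


# Toy backcross of 96 individuals with a perfectly linked marker.
linked = binary_lod(["hom"] * 48 + ["het"] * 48, [0] * 48 + [1] * 48)
# Same individuals at an unlinked position: trait independent of genotype.
unlinked = binary_lod(["hom"] * 48 + ["het"] * 48, ([0] * 24 + [1] * 24) * 2)
```

Evaluating such a score at many positions along a chromosome gives a profile whose peak marks the most likely location of the marker. Using posterior ancestry probabilities instead of hard calls would require a different likelihood than the one shown here.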

Figure 5.

Three representative individuals from an experiment involving targeted introgression of D. simulans genomic regions into a D. sechellia genomic background. The color scheme is the same as in Figure 2. Individuals in A and B carry an introgression of a genomic region on 3R that was targeted with a dominant marker located at 9,390,800 bp on chromosome 3R (DL Stern, unpubl.), but also carry regions with D. simulans ancestry on the X and the tip of 2L, respectively (arrows). (C) Introgression of a region on 2L that was targeted with a dominant marker located at 5,926,416 bp on chromosome 2L (DL Stern, unpubl.) with no residual regions of D. simulans ancestry.
