Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing

Comparative Study

Robert Pinard et al. BMC Genomics. 2006.

Abstract

Background: Whole genome amplification is an increasingly common technique through which minute amounts of DNA can be multiplied to generate quantities suitable for genetic testing and analysis. Questions of amplification-induced error and template bias generated by these methods have previously been addressed through either small scale (SNPs) or large scale (CGH array, FISH) methodologies. Here we utilized whole genome sequencing to assess amplification-induced bias in both coding and non-coding regions of two bacterial genomes. Halobacterium species NRC-1 DNA and Campylobacter jejuni were amplified by several common, commercially available protocols: multiple displacement amplification, primer extension pre-amplification and degenerate oligonucleotide primed PCR. The amplification-induced bias of each method was assessed by sequencing both genomes in their entirety using the 454 Sequencing System technology and comparing the results with those obtained from unamplified controls.

Results: All amplification methodologies induced statistically significant bias relative to the unamplified control. For the Halobacterium species NRC-1 genome, assessed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 119 times greater than those from unamplified material, 164.0 times greater for Repli-G, 165.0 times greater for PEP-PCR and 252.0 times greater than the unamplified controls for DOP-PCR. For Campylobacter jejuni, also analyzed at 100 base resolution, the D-statistics from GenomiPhi-amplified material were 15 times greater than those from unamplified material, 19.8 times greater for Repli-G, 61.8 times greater for PEP-PCR and 220.5 times greater than the unamplified controls for DOP-PCR.

Conclusion: Of the amplification methodologies examined in this paper, the multiple displacement amplification products generated the least bias, and produced significantly higher yields of amplified DNA.

Figures

Figure 1

A – D: Homopolymer coverage in Halobacterium sequence reads. A. A log2 plot illustrating the total number of adenosine homopolymers sequenced in reads generated by control and amplified populations of Halobacterium species NRC-1 DNA. The data from the unamplified replicate population are shown in red, GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. B. As Figure 1A, but for cytosine homopolymers. C. As Figure 1A, but for guanine homopolymers. D. As Figure 1A, but for thymine homopolymers.

Figure 2

A – D: Homopolymer coverage in Campylobacter jejuni sequence reads. A. A log2 plot illustrating the total number of adenosine homopolymers sequenced in reads generated by control and amplified populations of Campylobacter jejuni DNA. The data from the unamplified replicate population are shown in red, GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. B. As Figure 2A, but for cytosine homopolymers. C. As Figure 2A, but for guanine homopolymers. D. As Figure 2A, but for thymine homopolymers.
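The homopolymer tallies plotted in Figures 1 and 2 could be produced along the following lines. This is a minimal sketch, not the authors' pipeline: the function name `homopolymer_counts`, the `min_length` threshold of 2 and the toy reads are all assumptions made for illustration.

```python
from collections import Counter
from itertools import groupby


def homopolymer_counts(reads, min_length=2):
    """Tally homopolymer runs across reads, keyed by (base, run length).

    min_length is an assumed cut-off: runs shorter than this are not
    counted as homopolymers.
    """
    tally = Counter()
    for read in reads:
        # groupby splits the read into maximal runs of identical bases
        for base, run in groupby(read.upper()):
            length = sum(1 for _ in run)
            if length >= min_length:
                tally[(base, length)] += 1
    return tally


# Hypothetical toy reads, not data from the paper
toy_reads = ["AAACCGT", "TTTTAAA"]
toy_tally = homopolymer_counts(toy_reads)
for (base, length), count in sorted(toy_tally.items()):
    print(base, length, count)
```

Running the same tally on the control and on each amplified population, then plotting the counts for one base on a log2 scale, would give a comparison in the spirit of panels A to D.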

Figure 3

A & B: Comparison of control sequence coverage versus repeat region location. Distribution of sequence coverage from the unamplified control Halobacterium species NRC-1 population, as determined by sequence-based karyotyping at 100 base resolution, relative to repeat region and chromosome location. A. Counts per bin are displayed above the X axis in red; repeat regions are shown below the X axis in black. The X axis displays the length of the genome in bases. The relative locations of the three Halobacterium chromosomes are shown by the horizontal bars below the X axis: NRC1 is green, pNRC100 is gold, and pNRC200 is purple. B. As for Figure 3A, but for Campylobacter jejuni. No chromosome bars are included because C. jejuni has a single chromosome.
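The "counts per bin at 100 base resolution" used in this and the following figures can be sketched as below. The abstract does not say how a read is assigned to a bin, so assigning each read by its 0-based start position is an assumption, as are the function name `counts_per_bin` and the toy coordinates.

```python
def counts_per_bin(read_starts, genome_length, bin_size=100):
    """Count reads per fixed-width bin along a genome.

    Assumption: each read is assigned to the bin containing its
    0-based start position; the final bin may be shorter than bin_size.
    """
    n_bins = (genome_length + bin_size - 1) // bin_size  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < genome_length:  # ignore positions off the genome
            counts[pos // bin_size] += 1
    return counts


# Hypothetical start positions on a 350-base toy genome
print(counts_per_bin([0, 5, 99, 100, 250, 349], genome_length=350))
```

The resulting list has one entry per 100-base bin and can be plotted against genome position or compared between populations.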

Figure 4

A & B: Empirical cumulative distribution functions. A. Empirical cumulative distribution function (ECDF) depicting the distributions of counts per bin for various control and amplified populations of Halobacterium species NRC-1 DNA. The ECDF represents the cumulative distribution of the number of counts per bin, reporting the cumulative proportion of bins with counts less than or equal to the value on the X axis. The control cumulative fraction (black) is plotted against all WGA approaches: GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. Inset. Generic ECDFs of two different distributions (red and blue); the D statistic is shown as a black vertical line labelled "D". B. As for Figure 4A, but derived from Campylobacter jejuni.
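The ECDF and the D statistic described in this caption can be illustrated with a short sketch. It assumes that D is the largest vertical gap between two ECDFs, as in the two-sample Kolmogorov-Smirnov test; the caption's inset is consistent with that reading, but the abstract does not state it. The function names and the toy counts are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F, where F(x) is the fraction of values in sample <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        return bisect_right(ordered, x) / n

    return F


def d_statistic(sample_a, sample_b):
    """Largest vertical gap between the two ECDFs.

    Assumed to correspond to the two-sample Kolmogorov-Smirnov D.
    Both ECDFs are step functions that only change at observed values,
    so checking those values is sufficient.
    """
    F_a, F_b = ecdf(sample_a), ecdf(sample_b)
    points = set(sample_a) | set(sample_b)
    return max(abs(F_a(x) - F_b(x)) for x in points)


# Hypothetical counts per bin, not data from the paper
control_counts = [3, 4, 4, 5, 5, 5, 6, 6, 7, 8]
amplified_counts = [0, 0, 1, 1, 2, 5, 9, 12, 15, 20]
print(d_statistic(control_counts, amplified_counts))
```

A population whose counts per bin match the reference closely gives a D near 0, while an uneven, over-dispersed population gives a larger D, which is the sense in which the Results compare amplified material with unamplified material.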

Figure 5

A & B: Comparison of Halobacterium sequence coverage. Distribution of sequence coverage from unamplified reference, unamplified replicate and whole genome amplified Halobacterium species NRC-1 populations as determined by sequence-based karyotyping at 100 base resolution. A. Distribution of the counts per bin (Y axis) across the length of the genome (X axis) of test populations relative to the unamplified reference population. The counts per bin from the unamplified reference population are depicted in black above the X axis; counts from the various test populations are inverted and shown below the X axis. The data from the unamplified replicate population are shown in red, GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. B. Log-log plot of the counts per bin from the unamplified reference population versus the various test populations. The comparison between the reference and the unamplified replicate population is shown in red, GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. A 45 degree black line is shown for comparison.

Figure 6

A & B: Comparison of Campylobacter sequence coverage. Distribution of sequence coverage from unamplified reference, unamplified replicate and whole genome amplified Campylobacter jejuni populations as determined by sequence-based karyotyping at 100 base resolution. A. Distribution of the counts per bin (Y axis) across the length of the genome (X axis) of test populations relative to the unamplified reference population. The counts per bin from the unamplified reference population are depicted in black above the X axis; counts from the various test populations are inverted and shown below the X axis. The data from the unamplified replicate population are shown in red, GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. B. Log-log plot of the counts per bin from the unamplified reference population versus the various test populations. The comparison between the reference and the unamplified replicate population is shown in red, GenomiPhi in blue, Repli-G in orange, PEP in green and DOP in purple. A 45 degree black line is shown for comparison.
