From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing

Comparative performance of the BGISEQ-500 versus Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing

Background: Ancient DNA research has been revolutionised following development of ‘Next Generation’ Sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output is comparable with that of the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform, on DNA extracted from eight historic and ancient dog and wolf samples. Results: The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (p=0.371) and average sequence length (p=0.718) of endogenous nuclear DNA, sequence GC content (p=0.311), double-stranded DNA damage rate (p=0.309), and sequence clonality (p=0.093). Small significant differences were found in single-strand DNA damage rate (δS, slightly lower for the BGISEQ-500, p=0.011) and the background rate of difference from the reference genome (θ, slightly higher for the BGISEQ-500, p=0.012). This may result from the differences in the number of amplification cycles used to PCR-amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (p=0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from three of the samples with overall very low levels of endogenous DNA.
Conclusions: Although we acknowledge our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA.

Massive influence of DNA isolation and library preparation approaches on palaeogenomic sequencing data

2016

The ability to access genomic information from ancient samples has provided many important biological insights. Generating such palaeogenomic data requires specialised methodologies, and a variety of procedures for all stages of sample preparation have been proposed. However, the specific effects and biases introduced by alternative laboratory procedures are insufficiently understood. Here, we investigate the effects of three DNA isolation and two library preparation protocols on palaeogenomic data obtained from four Pleistocene subfossil bones. We find that alternative methodologies can significantly and substantially affect total DNA yield, the mean length and length distribution of recovered fragments, nucleotide composition, and the total amount of usable data generated. Furthermore, we also detect significant interaction effects between these stages of sample preparation on many of these factors. Effects and biases introduced in the laboratory can be sufficient to confou...

Comparative performance of the BGISEQ-500 vs Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing

GigaScience, 2017

Ancient DNA research has been revolutionized following development of next-generation sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output is comparable with that of the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform on DNA extracted from 8 historic and ancient dog and wolf samples. The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (P = 0.371) and average sequence length (P = 0.718) of endogenous nuclear DNA, sequence GC content (P = 0.311), double-stranded DNA damage rate (P = 0.309), and sequence clonality (P = 0.093). Small significant differences were found in single-strand DNA damage rate (δS; slightly lower for the BGISEQ-500, P = 0.011) and the background rate of difference from the reference genome (θ; slightly higher for the BGISEQ-500, P = 0.012). This may result from the differences in the number of amplification cycles used to polymerase chain reaction-amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (P = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from 3 of the samples with overall very low levels of endogenous DNA.
Although we acknowledge that our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable alternative platform for palaeogenomic data generation that is worthy of future exploration by those interested in the sequencing and analysis of degraded DNA.

Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA

Science (New York, N.Y.), 2006

We sequenced 28 million base pairs of DNA in a metagenomics approach, using a woolly mammoth (Mammuthus primigenius) sample from Siberia. As a result of exceptional sample preservation and the use of a recently developed emulsion polymerase chain reaction and pyrosequencing technique, 13 million base pairs (45.4%) of the sequencing reads were identified as mammoth DNA. Sequence identity between our data and African elephant (Loxodonta africana) was 98.55%, consistent with a paleontologically based divergence date of 5 to 6 million years. The sample includes a surprisingly small diversity of environmental DNAs. The high percentage of endogenous DNA recoverable from this single mammoth would allow for completion of its genome, unleashing the field of paleogenomics.

Sequencing the nuclear genome of the extinct woolly mammoth

Nature, 2008

In 1994, two independent groups extracted DNA from several Pleistocene epoch mammoths and noted differences among individual specimens [1,2]. Subsequently, DNA sequences have been published for a number of extinct species. However, such ancient DNA is often fragmented and damaged [3], and studies to date have typically focused on short mitochondrial sequences, never yielding more than a fraction of a per cent of any nuclear genome. Here we describe 4.17 billion bases (Gb) of sequence from several mammoth specimens, 3.3 billion (80%) of which are from the woolly mammoth (Mammuthus primigenius) genome and thus comprise an extensive set of genome-wide sequence from an extinct species. Our data support earlier reports [4] that elephantid genomes exceed 4 Gb. The estimated divergence rate between mammoth and African elephant is half of that between human and chimpanzee. The observed number of nucleotide differences between two particular mammoths was approximately one-eighth of that between one of them and the African elephant, corresponding to a separation between the mammoths of 1.5-2.0 Myr. The estimated probability that orthologous elephant and mammoth amino acids differ is 0.002, corresponding to about one residue per protein.

Pulling out the 1%: Whole-Genome Capture for the Targeted Enrichment of Ancient DNA Sequencing Libraries

The American Journal of Human Genetics, 2013

Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062-147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217-73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples.
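
The fold-enrichment figures quoted in this abstract compare the fraction of reads mapping to the target genome before and after capture. The following is a minimal sketch of that ratio, assuming enrichment is defined as the post-capture mapped fraction divided by the pre-capture mapped fraction for the same library; the function name `fold_enrichment` and the read counts are hypothetical illustrations, not values or code from the paper.

```python
def fold_enrichment(pre_mapped, pre_total, post_mapped, post_total):
    """Ratio of post-capture to pre-capture mapped-read fraction.

    Assumption: enrichment is a simple ratio of mapped fractions
    for one library; the paper may define or filter it differently.
    """
    if pre_total <= 0 or post_total <= 0:
        raise ValueError("total read counts must be positive")
    pre_fraction = pre_mapped / pre_total
    post_fraction = post_mapped / post_total
    if pre_fraction == 0:
        raise ValueError("pre-capture fraction is zero; ratio undefined")
    return post_fraction / pre_fraction


# Hypothetical library: 1.2% of reads map before capture, 24% after.
example_fold = fold_enrichment(12_000, 1_000_000, 240_000, 1_000_000)
print(f"{example_fold:.1f}-fold enrichment")
```

Because the ratio is computed per library, the averages quoted in the abstract cannot be divided to recover the reported per-library range.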

Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

Genes, 2010

The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention, next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

Direct multiplex sequencing (DMPS) - a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA

Genome Research, 2009

Although the emergence of high-throughput sequencing technologies has enabled whole-genome sequencing from extinct organisms, little progress has been made in accelerating targeted sequencing from highly degraded DNA. Here, we present a novel and highly sensitive method for targeted sequencing of ancient and degraded DNA, which couples multiplex PCR directly with sample barcoding and high-throughput sequencing. Using this approach, we obtained a 96% complete mitochondrial genome data set from 31 cave bear (Ursus spelaeus) samples using only two 454 Life Sciences (Roche) GS FLX runs. In contrast to previous studies relying only on short sequence fragments, the overlapping portion of our data comprises almost 10 kb of replicated mitochondrial genome sequence, allowing for the unambiguous differentiation of three major cave bear clades. Our method opens up the opportunity to simultaneously generate many kilobases of overlapping sequence data from large sets of difficult samples, such as museum specimens, medical collections, or forensic samples. Embedded in our approach, we present a new protocol for the construction of barcoded sequencing libraries, which is compatible with all current high-throughput technologies and can be performed entirely in plate setup.
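
Sample barcoding, as described in this abstract, lets reads from many samples be pooled in one run and separated afterwards. The sketch below shows one simple way such reads could be sorted back into samples. It assumes each read begins with a fixed-length barcode and that only exact matches are accepted; the barcodes, sample names, and the function `demultiplex` are hypothetical and do not reflect the DMPS protocol's actual barcode design.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by an exact-match barcode at the start of each read.

    Assumptions: all barcodes have the same length and sit at the
    5' end of the read; no mismatch tolerance is applied.
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes barcodes of equal length")
    k = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:k])
        if sample is None:
            # Keep unmatched reads intact so they can be inspected later.
            bins["unassigned"].append(read)
        else:
            # Strip the barcode before storing the insert sequence.
            bins[sample].append(read[k:])
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGT": "bear_01", "TGCA": "bear_02"}
reads = ["ACGTTTGACC", "TGCAGGATCA", "NNNNACGTAC"]
by_sample = demultiplex(reads, barcodes)
```

Real pipelines usually tolerate a small number of barcode mismatches and check barcodes at both read ends; exact matching is used here only to keep the example short.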