Review

The Neandertal genome and ancient DNA authenticity

Richard E Green et al. EMBO J. 2009.

Abstract

Recent advances in high-throughput DNA sequencing have made genome-scale analyses of genomes of extinct organisms possible. With these new opportunities come new difficulties in assessing the authenticity of the DNA sequences retrieved. We discuss how these difficulties can be addressed, particularly with regard to analyses of the Neandertal genome. We argue that only direct assays of DNA sequence positions in which Neandertals differ from all contemporary humans can serve as a reliable means to estimate human contamination. Indirect measures, such as the extent of DNA fragmentation, nucleotide misincorporations, or comparison of derived allele frequencies in different fragment size classes, are unreliable. Fortunately, interim approaches based on mtDNA differences between Neandertals and current humans, detection of male contamination through Y chromosomal sequences, and repeated sequencing from the same fossil to detect autosomal contamination allow initial large-scale sequencing of Neandertal genomes. This will result in the discovery of fixed differences in the nuclear genome between Neandertals and current humans that can serve as future direct assays for contamination. For analyses of other fossil hominins, which may become possible in the future, we suggest a similar 'boot-strap' approach in which interim approaches are applied until sufficient data for more definitive direct assays are acquired.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1

Estimates of human mtDNA contamination in Neandertal extracts. DNA extracts of Neandertal bones contain a large excess of microbial DNA (brown), at most a few percent of Neandertal DNA (blue) and generally variable amounts of contaminating DNA from current humans (red). Traditionally, contamination has been assayed through PCR directly from DNA extract from fossil bone (left lower panel). Accumulation of large numbers of reads from high-throughput sequencing allows a direct estimate of mtDNA contamination in the sequencing library (right lower panel). Once human/Neandertal diagnostic nuclear genome positions are learned, this strategy can be extended to nuclear DNA sequences.
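The direct estimate described in this caption amounts to counting reads at diagnostic positions. The following is a minimal sketch, assuming that each read overlapping a diagnostic mtDNA position has already been classified as carrying either the human or the Neandertal state. The function name, the choice of a Wilson score interval, and the read counts are illustrative assumptions and are not taken from the paper.

```python
import math

def contamination_estimate(human_reads, neandertal_reads, z=1.96):
    """Return (point estimate, lower, upper) for the contaminating fraction.

    The interval is a Wilson score interval, chosen here only because it
    behaves well when the contaminating fraction is close to zero.
    """
    n = human_reads + neandertal_reads
    if n == 0:
        raise ValueError("no reads overlap a diagnostic position")
    p = human_reads / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts for one sequencing library (not data from the paper).
est, low, high = contamination_estimate(human_reads=3, neandertal_reads=597)
print(f"contamination: {est:.2%} (95% interval {low:.2%} to {high:.2%})")
```

The same counting would apply to nuclear positions once fixed human/Neandertal differences are known; only the list of diagnostic positions changes.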

Figure 2

Lengths of Neandertal and human mtDNA fragments. Distributions of mtDNA fragments carrying Neandertal diagnostic positions are shown in blue for three Neandertal fossils. Each red dot represents a single contaminating human mtDNA fragment of the indicated length (data from Briggs et al, 2009).

Figure 3

Neandertal/human divergence estimated from sequences of increasing length and score filtering. (A) Sequences in each length bin were used to calculate human/Neandertal divergence, given as the percentage of the human lineage back to the human/chimpanzee common ancestor in which the Neandertal sequences diverged. Sequences were filtered for uniqueness in the human and chimpanzee genomes by comparing the best alignment score to the second best score. In red are sequences whose best alignments are at least 1 bit better than the second best, in green those with a difference of 5 bits or more. Bars show the 95% confidence interval from 1000 bootstrap replicates of the sequences in each bin. (B) Percentage of the sequences in each bin removed when increasing the alignment score filter from 1 to 5 bits. Shorter sequences are more likely to be removed by stricter filtering as they carry less information to place them uniquely in the human and chimpanzee genomes.
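The filtering and bootstrap steps in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: each sequence is assumed to carry its best and second-best alignment scores in bits, a count of human-specific substitutions, and a count of all substitutions on the human lineage since the chimpanzee split, and divergence is taken as the ratio of the two summed counts. The record layout, the function names, and the toy values are hypothetical.

```python
import random
from typing import NamedTuple

class Seq(NamedTuple):
    best_bits: float      # best alignment score
    second_bits: float    # second-best alignment score
    human_specific: int   # substitutions where human differs from Neandertal and chimpanzee
    lineage_total: int    # all substitutions on the human lineage since the chimpanzee split

def filter_unique(seqs, min_bits):
    """Keep sequences whose best alignment beats the second best by min_bits."""
    return [s for s in seqs if s.best_bits - s.second_bits >= min_bits]

def divergence_pct(seqs):
    """Divergence as a percentage of the human lineage (assumed definition)."""
    total = sum(s.lineage_total for s in seqs)
    return 100.0 * sum(s.human_specific for s in seqs) / total

def bootstrap_ci(seqs, replicates=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling the sequences with replacement."""
    rng = random.Random(seed)
    n = len(seqs)
    stats = sorted(
        divergence_pct([seqs[rng.randrange(n)] for _ in range(n)])
        for _ in range(replicates)
    )
    lower = stats[round(alpha / 2 * replicates)]
    upper = stats[round((1 - alpha / 2) * replicates) - 1]
    return lower, upper

# Toy length bin (hypothetical values, not data from the paper).
bin_seqs = [
    Seq(60, 58, 0, 3), Seq(62, 50, 1, 4), Seq(55, 54, 0, 2), Seq(70, 60, 1, 5),
    Seq(66, 61, 0, 4), Seq(64, 40, 1, 6), Seq(59, 57, 0, 3), Seq(63, 55, 0, 5),
] * 25

for min_bits in (1, 5):
    kept = filter_unique(bin_seqs, min_bits)
    removed_pct = 100.0 * (len(bin_seqs) - len(kept)) / len(bin_seqs)
    lower, upper = bootstrap_ci(kept)
    print(min_bits, round(divergence_pct(kept), 2), round(lower, 2),
          round(upper, 2), round(removed_pct, 1))
```

Resampling whole sequences rather than individual sites keeps sites from the same fragment together, which matches the caption's description of bootstrapping "the sequences in each bin".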

Figure 4

Fraction of human polymorphic positions carrying derived alleles in Neandertal and human DNA sequences. (A) Neandertal sequences of increasing length that overlap human polymorphic positions were assessed for having the derived or ancestral (chimpanzee-like) allele. Blue points are for Neandertal data, red points for the corresponding sequences in the human reference genome (hg18). (B) Sequences of length 60–78 nucleotides were split in half and re-analysed ('30–39'). Derived alleles are preferentially lost when fragment size is reduced.
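The quantity plotted in panel A can be illustrated with a short sketch, assuming each observation is a pair of fragment length and a flag saying whether the fragment carries the derived allele at a human polymorphic position. The bin width of 10 nucleotides, the function name, and the toy observations are assumptions made for illustration only.

```python
from collections import defaultdict

def derived_fraction_by_length(observations, bin_width=10):
    """Map each length bin (by its start) to the fraction of derived alleles."""
    counts = defaultdict(lambda: [0, 0])  # bin start -> [derived, total]
    for length, is_derived in observations:
        start = (length // bin_width) * bin_width
        counts[start][0] += int(is_derived)
        counts[start][1] += 1
    return {start: d / t for start, (d, t) in sorted(counts.items())}

# Hypothetical observations (fragment length, carries derived allele).
obs = [(32, True), (35, False), (38, False), (39, False),
       (61, True), (64, False), (70, True), (77, False)]
print(derived_fraction_by_length(obs))
```

Running the same function on the original fragments and on fragments split in half, as in panel B, would show whether the derived fraction changes with fragment size alone.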

References

Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, Boutell JM, Bryant J, Carter RJ, Keira Cheetham R, Cox AJ, Ellis DJ, Flatbush MR, Gormley NA, Humphray SJ, Irving LJ et al. (2008) Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456: 53–59
Briggs AW, Good JM, Green RE, Krause J, Maricic T, Stenzel U, Lalueza-Fox C, Rudan P, Brajkovic D, Kucan Z, Gusic I, Schmitz RW, Doronichev VB, Golovanova LV, de la Rasilla M, Fortea J, Rosas A, Pääbo S (2009) Targeted retrieval and analysis of five Neandertal mtDNA genomes. Science 325: 318–321
Briggs AW, Stenzel U, Johnson PL, Green RE, Kelso J, Prufer K, Meyer M, Krause J, Ronan MT, Lachmann M, Pääbo S (2007) Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci USA 104: 14616–14621
Brotherton P, Endicott P, Sanchez JJ, Beaumont M, Barnett R, Austin J, Cooper A (2007) Novel high-resolution characterization of ancient DNA reveals C > U-type base modification events as the sole cause of post mortem miscoding lesions. Nucleic Acids Res 35: 5717–5728
Brown P, Sutikna T, Morwood MJ, Soejono RP, Jatmiko, Saptomo EW, Due RA (2004) A new small-bodied hominin from the Late Pleistocene of Flores, Indonesia. Nature 431: 1055–1061
