Finishing genomes with limited resources: lessons from an ensemble of microbial genomes

Niranjan Nagarajan et al. BMC Genomics. 2010.

Abstract

While new sequencing technologies have ushered in an era where microbial genomes can be easily sequenced, the goal of routinely producing high-quality draft and finished genomes in a cost-effective fashion has remained elusive. Due to shorter read lengths and limitations in library construction protocols, shotgun sequencing and assembly based on these technologies often result in fragmented assemblies. Correspondingly, while draft assemblies can be obtained in days, finishing can take many months, and hence the time and effort can only be justified for high-priority genomes and in large sequencing centers. In this work, we revisit this issue in light of our own experience in producing finished and nearly-finished genomes for a range of microbial species in a small-lab setting. These genomes were finished with surprisingly little investment in terms of time, computational effort and lab work, suggesting that the increased access to sequencing might also eventually lead to a greater proportion of finished genomes from small labs and genomics cores.

Figures

Figure 1

Summary of the finishing effort for A. aphrophilus. As can be seen from the figure, much of the finishing effort (PCR experiments indicated by bars in the outermost ring) was devoted to disambiguating the neighborhood of the rRNA operon.

Figure 2

R. prowazekii Contig Graph. Note that the comments for Figure 3 are also valid here. This graph can be resolved into a unique in silico reconstruction of the genome.

Figure 3

Partial Contig Graph of Y. kristensenii. The pointed boxes represent contigs while the edges mark the presence of reads that span the corresponding contigs. The arrows on both ends of an edge indicate the orientation of the adjacent contigs. An arrow "out" of a contig indicates that the end of the contig is adjacent and an arrow "in" indicates that the beginning of the contig is adjacent.
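The graph convention described in this caption can be sketched in a few lines of Python. This is only an illustrative data structure, not the representation used by the authors' tools: the contig names, read counts, and function names below are hypothetical. Each edge records which end of each contig ("begin" or "end") is adjacent to the other, which is what the "in" and "out" arrows encode.

```python
from collections import defaultdict

# Hypothetical contigs and spanning-read counts, for illustration only.
# Each edge: (contig_a, end_of_a, contig_b, end_of_b, spanning_reads)
EDGES = [
    ("contig1", "end", "contig2", "begin", 12),
    ("contig2", "end", "contig3", "end", 7),
    ("contig2", "end", "contig4", "begin", 5),
]


def build_graph(edges):
    """Map each (contig, end) to the list of (contig, end, reads) adjacent to it."""
    graph = defaultdict(list)
    for a, a_end, b, b_end, reads in edges:
        graph[(a, a_end)].append((b, b_end, reads))
        graph[(b, b_end)].append((a, a_end, reads))
    return graph


def relative_orientation(a_end, b_end):
    """Contigs joined end-to-begin share an orientation; end-to-end or
    begin-to-begin joins imply one contig is reverse-complemented."""
    return "same" if a_end != b_end else "opposite"


def ambiguous_ends(graph):
    """Contig ends with more than one neighbor, e.g. at repeat boundaries."""
    return sorted(node for node, nbrs in graph.items() if len(nbrs) > 1)


graph = build_graph(EDGES)
print(ambiguous_ends(graph))
```

In this toy input, the end of `contig2` has two candidate neighbors, which is the kind of ambiguity that would have to be resolved either in silico or with a targeted experiment.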

Figure 4

AMOS-Hybrid pipeline. Circles are used to represent input/output and intermediate datasets. Names in parentheses refer to the programs used to perform the corresponding tasks in the boxes.

Figure 5

The optical mapping process. To generate a whole-genome optical map, DNA is sheared into fragments that are stretched and fixed onto an optical mapping surface and then digested using a restriction enzyme. The resulting pieces are optically analyzed and assembled into a genome-wide map.
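To relate sequence contigs to a map of this kind, one generally needs the restriction fragment sizes that a contig would be expected to produce. The sketch below computes such an in silico digest for a linear sequence. It is a minimal illustration under stated assumptions, not the method used in the paper: the function name and the toy sequence are hypothetical, and BamHI (recognition site GGATCC, cutting after the first G) is chosen purely as an example enzyme.

```python
def in_silico_digest(sequence, site, cut_offset=0):
    """Return ordered fragment lengths from cutting a linear `sequence`
    at every occurrence of the recognition `site`.

    `cut_offset` is the position of the cut within the site (an assumption
    the caller supplies per enzyme). Only the given strand is scanned,
    which is sufficient for palindromic sites such as GGATCC.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    boundaries = [0] + cuts + [len(sequence)]
    # Drop zero-length fragments that would arise from a cut at either end.
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Toy 19-base sequence containing two BamHI sites (illustrative only).
fragments = in_silico_digest("AAGGATCCTTTTGGATCCA", "GGATCC", cut_offset=1)
print(fragments)  # [3, 10, 6]
```

The ordered list of fragment lengths is the quantity that would be compared against the fragment sizes in an optical map; real comparisons also have to tolerate sizing error and missed or spurious cuts, which this sketch ignores.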
