Review
PacBio Sequencing and Its Applications
Anthony Rhoads et al. Genomics Proteomics Bioinformatics. 2015 Oct.
Abstract
Single-molecule, real-time sequencing developed by Pacific Biosciences offers longer read lengths than second-generation sequencing (SGS) technologies, making it well suited to unsolved problems in genome, transcriptome, and epigenetics research. The highly contiguous de novo assemblies produced with PacBio sequencing can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, because it can sequence full-length transcripts or fragments of substantial length, PacBio transcriptome sequencing is advantageous for identifying gene isoforms and facilitates reliable discovery of novel genes and of novel isoforms of annotated genes. Additionally, PacBio's sequencing technique provides information that is useful for the direct detection of base modifications, such as methylation. Besides the use of PacBio sequencing alone, many hybrid sequencing strategies have been developed that combine more accurate short reads with PacBio long reads. In general, hybrid sequencing strategies are more affordable and scalable than PacBio sequencing alone, especially for small laboratories. The advent of PacBio sequencing has made available much information that could not be obtained via SGS alone.
Keywords: De novo assembly; Gene isoform detection; Hybrid sequencing; Methylation; Third-generation sequencing.
Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Figures
Figure 1
SMRTbell template. Hairpin adaptors (green) are ligated to both ends of a double-stranded DNA molecule (yellow and purple), forming a closed circle. The polymerase (gray) is anchored to the bottom of a ZMW and incorporates bases into the read strand (orange). The image is adapted with permission from Oxford University Press.
Figure 2
A single SMRT cell. Each SMRT cell contains 150,000 ZMWs. Approximately 35,000–75,000 of these wells produce a read in a run lasting 0.5–4 h, resulting in 0.5–1 Gb of sequence. The image is adapted with permission from Pacific Biosciences. ZMW, zero-mode waveguide.
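The figures quoted in this caption imply a range for the fraction of productive wells and for the average read length. The following is a rough back-of-the-envelope sketch, assuming one read per productive ZMW and treating the caption's range endpoints as independent bounds; the function and variable names are illustrative and do not come from the article.

```python
# Back-of-the-envelope arithmetic from the Figure 2 caption.
# Assumption (not from the article): each productive ZMW yields exactly one read.

ZMWS_PER_CELL = 150_000              # wells per SMRT cell
PRODUCTIVE_ZMWS = (35_000, 75_000)   # wells that produce a read (low, high)
YIELD_BASES = (0.5e9, 1.0e9)         # sequence per cell in bases (low, high)


def productive_fraction(productive, total=ZMWS_PER_CELL):
    """Fraction of wells on the cell that produce a read."""
    return productive / total


def mean_read_length(yield_bases, reads):
    """Average read length implied by a total yield and a read count."""
    return yield_bases / reads


low_frac, high_frac = (productive_fraction(n) for n in PRODUCTIVE_ZMWS)

# Loosest bounds: lowest yield over most reads, highest yield over fewest reads.
shortest = mean_read_length(YIELD_BASES[0], PRODUCTIVE_ZMWS[1])
longest = mean_read_length(YIELD_BASES[1], PRODUCTIVE_ZMWS[0])

print(f"Productive wells: {low_frac:.1%} to {high_frac:.1%}")
print(f"Implied mean read length: {shortest:,.0f} to {longest:,.0f} bases")
```

These bounds are deliberately loose, because the lowest yield need not coincide with the highest read count in any real run.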
Figure 3
Sequencing via light pulses. A. A SMRTbell (gray) diffuses into a ZMW, and the adaptor binds to a polymerase immobilized at the bottom. B. Each of the four nucleotides is labeled with a different fluorescent dye (indicated in red, yellow, green, and blue for G, C, T, and A, respectively) so that they have distinct emission spectra. As a nucleotide is held in the detection volume by the polymerase, a light pulse is produced that identifies the base. (1) A fluorescently labeled nucleotide associates with the template in the active site of the polymerase. (2) The fluorescence output of the color corresponding to the incorporated base (yellow for base C in this example) is elevated. (3) The dye-linker-pyrophosphate product is cleaved from the nucleotide and diffuses out of the ZMW, ending the fluorescence pulse. (4) The polymerase translocates to the next position. (5) The next nucleotide associates with the template in the active site of the polymerase, initiating the next fluorescence pulse, which corresponds to base A here. The figure is adapted with permission from the American Association for the Advancement of Science.
Figure 4
PacBio RS II read length distribution using P6-C4 chemistry. Data are based on a 20 kb size-selected E. coli library using a 4-h movie. Each SMRT cell produces 0.5–1 billion bases. The P6-C4 chemistry is currently the most advanced sequencing chemistry offered by PacBio. The figure is adapted with permission from Pacific Biosciences.
Figure 5
Detection of methylated bases using PacBio sequencing. PacBio sequencing can detect modified bases, including m6A (also known as 6mA), by analyzing variation in the time between base incorporations in the read strand. The figure is adapted with permission from Pacific Biosciences. a.u., arbitrary unit.
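The idea of reading base modifications from the timing between incorporations can be illustrated with a toy calculation. This is a minimal sketch, not PacBio's detection algorithm: it assumes that interpulse durations from a native sample are compared against an unmodified control at each template position, and that a fixed ratio threshold flags candidates. The control comparison, the threshold of 2.0, the function names, and all numbers are hypothetical.

```python
from statistics import mean


def ipd_ratios(native_ipds, control_ipds):
    """Ratio of mean native to mean control interpulse duration, per position."""
    return [mean(n) / mean(c) for n, c in zip(native_ipds, control_ipds)]


def flag_candidates(ratios, threshold=2.0):
    """Indices of positions whose ratio meets or exceeds the threshold."""
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Made-up interpulse durations (arbitrary units): three template positions,
# three observations each. Position 1 is slowed in the native sample.
native = [[1.0, 1.2, 0.8], [4.0, 5.0, 6.0], [0.9, 1.1, 1.0]]
control = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

ratios = ipd_ratios(native, control)
candidates = flag_candidates(ratios)
print("IPD ratios:", [round(r, 2) for r in ratios])
print("Candidate modified positions:", candidates)
```

A real analysis would need many more observations per position and a statistical test in place of a fixed cutoff.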
References
- Schadt E.E., Turner S., Kasarskis A. A window into third-generation sequencing. Hum Mol Genet. 2010;19:R227–R240.
- Pacific Biosciences. Media Kit, <http://www.pacb.com/company/news-events/media-resources/page/3/> (May 19, 2015, date last accessed).
- Eid J., Fehr A., Gray J., Luong K., Lyle J., Otto G., et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133–138.
- AllSeq. Pacific Biosciences, <http://allseq.com/knowledgebank/sequencing-platforms/pacific-biosciences> (April 14, 2015, date last accessed).