An optimized protocol for analysis of EST sequences

F Liang et al. Nucleic Acids Res. 2000.

Abstract

The vast body of Expressed Sequence Tag (EST) data in the public databases provides an important resource for comparative and functional genomics studies and an invaluable tool for the annotation of genomic sequences. We have developed a rigorous protocol for reconstructing the sequences of transcribed genes from EST and gene sequence fragments. A key element in developing this protocol has been the evaluation of a number of sequence assembly programs to determine which most faithfully reproduce transcript sequences from EST data. The TIGR Gene Indices constructed using this protocol for human, mouse, rat and a variety of other plant and animal models have demonstrated their utility in a variety of applications and are freely available to the scientific research community.

Figures

Figure 1

DNA sequencing base call error probability. Error probability distribution adapted from Ewing and Green (12) used to simulate systematic base call errors.

Figure 2

CLUSTAL W (17) alignment of consensus sequence assemblies for the rat cytochrome c oxidase gene produced by Phrap, CAP3, TA-EST and TIGR Assembler.

Figure 3

Consensus sequence errors. Plot of A-scores for the best consensus assemblies produced by Phrap, CAP3, TA-EST and TIGR Assembler (TA) using simulated data for various error rates at 5× and 50× sequence coverage.

Figure 4

Error source distribution and normalized A-score for assemblies of 73 known genes. Consensus sequence error classification for Phrap, CAP3, TA-EST and TIGR Assembler using EST sequences containing 5% errors at various depths of coverage.

Figure 5

The total number of errors, classified by type, in the best assembly produced by the four assemblers and the normalized A-score for 73 known genes.
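
The captions above refer to simulated reads carrying base call errors at several coverage depths. A minimal sketch of how such test data might be generated is given here in Python. It is an illustration under stated assumptions, not the authors' simulation: the U-shaped `error_probability` profile, its `p_min` and `p_max` parameters, and the equal mix of substitutions, insertions and deletions are all hypothetical stand-ins for the distribution adapted from Ewing and Green (12), which is not reproduced in this record.

```python
import random

BASES = "ACGT"


def error_probability(pos, read_len, p_min=0.01, p_max=0.10):
    """Hypothetical per-base error probability (an assumption, not the
    published distribution): lowest mid-read, highest at both ends."""
    if read_len < 2:
        return p_max
    x = 2.0 * pos / (read_len - 1) - 1.0  # rescale position to [-1, 1]
    return p_min + (p_max - p_min) * x * x


def simulate_read(template, rng, p_min=0.01, p_max=0.10):
    """Copy `template`, introducing substitution, insertion and deletion
    errors with a position-dependent probability."""
    out = []
    n = len(template)
    for i, base in enumerate(template):
        if rng.random() >= error_probability(i, n, p_min, p_max):
            out.append(base)  # base called correctly
            continue
        kind = rng.choice(("substitution", "insertion", "deletion"))
        if kind == "substitution":
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "insertion":
            out.append(base)
            out.append(rng.choice(BASES))  # spurious extra base
        # deletion: the base is simply skipped
    return "".join(out)


# Five error-containing copies of one random 500-base template,
# i.e. roughly 5x coverage of that template.
rng = random.Random(0)
template = "".join(rng.choice(BASES) for _ in range(500))
reads = [simulate_read(template, rng) for _ in range(5)]
```

Reads produced this way could be passed to each assembler and the resulting consensus compared with `template`. The fixed seed only makes the sketch repeatable.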

References

    1. Adams,M.D., Kelley,J.M., Gocayne,J.D., Dubnick,M., Polymeropoulos,M.H., Xiao,H., Merril,C.R., Wu,A., Olde,B., Moreno,R.F. et al. (1991) Science, 252, 1651–1661.
    2. Adams,M.D., Kerlavage,A.R., Fleischmann,R.D., Fuldner,R.A., Bult,C.J., Lee,N.H., Kirkness,E.F., Weinstock,K.G., Gocayne,J.D., White,O. et al. (1995) Nature, 377, 3–174.
    3. Hudson,T.J., Stein,L.D., Gerety,S.S., Ma,J., Castle,A.B., Silva,J., Slonim,D.K., Baptista,R., Kruglyak,L., Xu,S.H. et al. (1995) Science, 270, 1945–1954.
    4. Schuler,G.D., Boguski,M.S., Stewart,E.A., Stein,L.D., Gyapay,G., Rice,K., White,R.E., Rodriguez-Tome,P., Aggarwal,A., Bajorek,E. et al. (1996) Science, 274, 540–546.
    5. Bouck,J., Yu,W., Gibbs,R. and Worley,K. (1999) Trends Genet., 15, 159–162.
