Using the FASTA Program to Search Protein and DNA Sequence Databases

Abstract

As this volume illustrates, computers have become an integral tool in the analysis of DNA and protein sequence data. One of the most popular applications of computers in modern molecular biology is the characterization of newly determined sequences by searching DNA and protein sequence databases. The FASTA program (1,2) is widely used for such searches because it is fast, sensitive, and readily available. FASTA is distributed as part of a package of programs that construct local and global sequence alignments. This chapter describes a number of simple applications of FASTA and other programs in the FASTA package, focusing on the steps required to run the programs rather than on the interpretation of the results of a FASTA search. For a more complete description of FASTA and of related programs for identifying distantly related DNA and protein sequences, for evaluating the statistical significance of sequence similarities, and for identifying similar structures in DNA and protein sequences, see ref. 2.

References

Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.
Pearson, W. R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA, in Methods in Enzymology, vol. 183 (Doolittle, R. F., ed.), Academic, New York, pp. 63–98.
Lipman, D. J. and Pearson, W. R. (1985) Rapid and sensitive protein similarity searches. Science 227, 1435–1441.
Pearson, W. R. (1991) Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11, 635–650.
Dayhoff, M., Schwartz, R. M., and Orcutt, B. C. (1978) A model of evolutionary change in proteins, in Atlas of Protein Sequence and Structure, vol. 5, supplement 3 (Dayhoff, M., ed.), National Biomedical Research Foundation, Silver Spring, MD, pp. 345–352.
Doolittle, R. F., Feng, D. F., Johnson, M. S., and McClure, M. A. (1986) Relationships of human protein sequences to those of other organisms. Cold Spring Harb. Symp. Quant. Biol. 51, 447–455.
Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) A basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Waterman, M. S. and Eggert, M. (1987) A new algorithm for best subsequences alignment with application to tRNA-rRNA comparisons. J. Mol. Biol. 197, 723–728.
Huang, X., Hardison, R. C., and Miller, W. (1990) A space-efficient algorithm for local similarities. CABIOS 6, 373–381.
Huang, X. and Miller, W. (1991) A time-efficient, linear-space local similarity algorithm. Adv. Appl. Math. 12, 337–357.

Author information

Authors and Affiliations

Department of Biochemistry, University of Virginia, Charlottesville, VA
William R. Pearson

Cite this protocol

Pearson, W.R. (1994). Using the FASTA Program to Search Protein and DNA Sequence Databases. In: Computer Analysis of Sequence Data. Methods in Molecular Biology, vol 25. Springer, Totowa, NJ. https://doi.org/10.1385/0-89603-276-0:365

DOI: https://doi.org/10.1385/0-89603-276-0:365
Publisher Name: Springer, Totowa, NJ
Print ISBN: 978-0-89603-276-7
Online ISBN: 978-1-59259-512-9
eBook Packages: Springer Protocols