Improved tools for biological sequence comparison - PubMed
Comparative Study
Improved tools for biological sequence comparison
W R Pearson et al. Proc Natl Acad Sci U S A. 1988 Apr.
Abstract
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an additional step in the calculation of the initial pairwise similarity score that allows multiple regions of similarity to be joined to increase the score of related sequences. The RDF2 program can be used to evaluate the significance of similarity scores using a shuffling method that preserves local sequence composition. The LFASTA program can display all the regions of local similarity between two sequences with scores greater than a threshold, using the same scoring parameters and a similar alignment algorithm; these local similarities can be displayed as a "graphic matrix" plot or as individual alignments. In addition, these programs have been generalized to allow comparison of DNA or protein sequences based on a variety of alternative scoring matrices.
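The shuffling idea behind the RDF2 evaluation can be illustrated with a minimal Python sketch. This is not the RDF2 code: the window size of 10, the match/mismatch/gap values, the simple Smith-Waterman scorer, and the function names `local_shuffle`, `local_score`, and `shuffle_z_value` are all assumptions made for illustration. The abstract only states that the shuffle preserves local sequence composition; reporting the result as a z-value (observed score minus the shuffled mean, divided by the shuffled standard deviation) is likewise an assumed convention here.

```python
import random
import statistics


def local_shuffle(seq, window=10, rng=random):
    """Shuffle residues inside consecutive windows.

    Each window keeps its own residue counts, so local composition
    is preserved while residue order is destroyed.
    """
    out = []
    for start in range(0, len(seq), window):
        chunk = list(seq[start:start + window])
        rng.shuffle(chunk)
        out.extend(chunk)
    return "".join(out)


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap cost).

    A stand-in scorer with assumed parameters, not the FASTA score.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def shuffle_z_value(query, target, n_shuffles=100, window=10, seed=0):
    """Compare the observed score with scores against shuffled targets."""
    rng = random.Random(seed)
    observed = local_score(query, target)
    shuffled = [
        local_score(query, local_shuffle(target, window, rng))
        for _ in range(n_shuffles)
    ]
    mean = statistics.mean(shuffled)
    sd = statistics.stdev(shuffled)
    # A zero spread gives no usable z-value; report NaN in that case.
    z = (observed - mean) / sd if sd > 0 else float("nan")
    return observed, mean, sd, z
```

Calling `shuffle_z_value(query, target)` on two related sequences should give an observed score well above the shuffled mean, while unrelated sequences of similar composition should score close to it. Shuffling within windows rather than across the whole sequence keeps compositionally biased regions in place, so a high score caused only by such a region is not mistaken for significant similarity.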
Similar articles
- Rapid and sensitive sequence comparison with FASTP and FASTA.
Pearson WR. Methods Enzymol. 1990;183:63-98. doi: 10.1016/0076-6879(90)83007-v. PMID: 2156132
- BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson WR. Methods Mol Biol. 2014;1079:75-101. doi: 10.1007/978-1-62703-646-7_5. PMID: 24170396
- Profile analysis: detection of distantly related proteins.
Gribskov M, McLachlan AD, Eisenberg D. Proc Natl Acad Sci U S A. 1987 Jul;84(13):4355-8. doi: 10.1073/pnas.84.13.4355. PMID: 3474607 Free PMC article.
- Numerical characterization and similarity analysis of DNA sequences based on 2-D graphical representation of the characteristic sequences.
Li C, Wang J. Comb Chem High Throughput Screen. 2003 Dec;6(8):795-9. doi: 10.2174/138620703771826900. PMID: 14683485 Review.
Cited by
- Seven quick tips for gene-focused computational pangenomic analysis.
Bonnici V, Chicco D. BioData Min. 2024 Sep 3;17(1):28. doi: 10.1186/s13040-024-00380-2. PMID: 39227987 Free PMC article.
- Seqrutinator: scrutiny of large protein superfamily sequence datasets for the identification and elimination of non-functional homologues.
Amalfitano A, Stocchi N, Atencio HM, Villarreal F, Ten Have A. Genome Biol. 2024 Aug 26;25(1):230. doi: 10.1186/s13059-024-03371-y. PMID: 39187866 Free PMC article.
- Distinct evolutionary trajectories following loss of RNA interference in Cryptococcus neoformans.
Huang J, Larmore CJ, Priest SJ, Xu Z, Dietrich FS, Yadav V, Magwene PM, Sun S, Heitman J. bioRxiv [Preprint]. 2024 Aug 16:2024.08.15.608186. doi: 10.1101/2024.08.15.608186. PMID: 39185155 Free PMC article. Preprint.
- An ontology-based knowledge graph for representing interactions involving RNA molecules.
Cavalleri E, Cabri A, Soto-Gomez M, Bonfitto S, Perlasca P, Gliozzo J, Callahan TJ, Reese J, Robinson PN, Casiraghi E, Valentini G, Mesiti M. Sci Data. 2024 Aug 22;11(1):906. doi: 10.1038/s41597-024-03673-7. PMID: 39174566 Free PMC article.
- Impact of predictive selection of LbCas12a CRISPR RNAs upon on- and off-target editing rates in soybean.
Rymarquis L, Wu C, Hohorst D, Vega-Sanchez M, Mullen TE, Vemulapalli V, Smith DR. Plant Direct. 2024 Aug 16;8(8):e627. doi: 10.1002/pld3.627. eCollection 2024 Sep. PMID: 39157758 Free PMC article.