Improved sensitivity of biological sequence database searches
D L Brutlag et al. Comput Appl Biosci. 1990 Jul.
Abstract
We have increased the sensitivity of DNA and protein sequence database searches by allowing similar but non-identical amino acids or nucleotides to match. In addition, one can match k-tuples or words instead of matching individual residues in order to speed the search. A matching matrix specifies which k-tuples match each other. The matching matrix can be calculated from a similarity matrix of amino acids and a threshold of similarity required for matching. This permits amino acid similarity matrices or replacement matrices (PAM matrices) to be used in the first step of a sequence comparison rather than in a secondary scoring phase. The concept of matching non-identical k-tuples also increases the power of DNA database searches. For example, a matrix that specifies that any 3-tuple in a DNA sequence can match any other 3-tuple encoding the same amino acid permits a DNA database search using a DNA query sequence for regions that would encode a similar amino acid sequence.
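The 3-tuple example in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `matching_matrix`, `codon_similarity`, and `matching_positions` are hypothetical, the matching matrix is stored as a set of word pairs, and the similarity function is an assumed 0/1 score under the standard genetic code (with the three stop codons treated as one class). Substituting a function that scores k-tuples from an amino acid similarity matrix, together with a different threshold, would give the protein case described above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def matching_matrix(words, similarity, threshold):
    """Return the set of (a, b) word pairs whose similarity meets the threshold.

    The set plays the role of a boolean matching matrix over all k-tuples.
    """
    return {(a, b) for a in words for b in words if similarity(a, b) >= threshold}


def codon_similarity(a, b):
    """Assumed toy score: 1 if both 3-tuples encode the same amino acid, else 0."""
    return 1 if CODON_TABLE[a] == CODON_TABLE[b] else 0


# Any 3-tuple matches any other 3-tuple encoding the same amino acid.
CODON_MATCH = matching_matrix(list(CODON_TABLE), codon_similarity, threshold=1)


def matching_positions(query, target, match, k=3):
    """List (i, j) offsets where the k-tuple at query[i:] matches the one at target[j:].

    Brute-force scan for illustration; a real search would index the database.
    """
    hits = []
    for i in range(len(query) - k + 1):
        for j in range(len(target) - k + 1):
            if (query[i:i + k], target[j:j + k]) in match:
                hits.append((i, j))
    return hits


hits = matching_positions("CTGAAA", "TTAAAG", CODON_MATCH)
```

Here `CTG` and `TTA` differ at two of three positions but both encode leucine, so they match at offsets `(0, 0)`, where an identity-based 3-tuple comparison would report nothing.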