Embedding strategies for effective use of information from multiple sequence alignments

S Henikoff et al. Protein Sci. 1997 Mar.

Abstract

We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain.
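The consensus-embedding idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how consensus residues are derived, so this example assumes a simple unweighted majority vote per alignment column, and the function names (`block_consensus`, `embed_consensus`), the block representation, and the toy sequences are all hypothetical. Residues outside the conserved blocks are left exactly as they appear in the representative sequence, which mirrors the stated goal of retaining single-sequence information where alignment is uncertain.

```python
from collections import Counter


def block_consensus(segments):
    """Return the most frequent residue in each column of an ungapped block.

    Assumption of this sketch: unweighted majority vote per column.
    """
    width = len(segments[0])
    if any(len(s) != width for s in segments):
        raise ValueError("all segments in a block must have the same length")
    # Counter.most_common breaks ties by first occurrence in the column.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*segments)
    )


def embed_consensus(representative, blocks):
    """Overwrite each conserved region of `representative` with its consensus.

    `blocks` is a list of (start, segments) pairs, where `start` is the
    0-based offset of the block within the representative sequence.
    """
    residues = list(representative)
    for start, segments in blocks:
        consensus = block_consensus(segments)
        end = start + len(consensus)
        if start < 0 or end > len(residues):
            raise ValueError("block lies outside the representative sequence")
        residues[start:end] = consensus
    return "".join(residues)


# Toy data (hypothetical, not taken from the paper).
representative = "MKTAYIAKQRQISFVKSHFSRQ"
blocks = [
    (2, ["TAYIA", "TAFIA", "SAFIA"]),   # conserved region at offsets 2-6
    (12, ["SFVKS", "TFVRS", "TFVKS"]),  # conserved region at offsets 12-16
]
embedded = embed_consensus(representative, blocks)
print(embedded)  # MKTAFIAKQRQITFVKSHFSRQ
```

The resulting string is still an ordinary protein sequence, so it could be written to a FASTA file and used as the query for a single-sequence search program such as BLAST or FASTA. Embedding PSSMs instead would require a search program that accepts position-specific scores for part of the query, which this sketch does not attempt.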
