The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools
Abstract
CLUSTAL X is a new windows interface for the widely used progressive multiple sequence alignment program CLUSTAL W. It is easy to use and provides an integrated environment for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include the ability to cut and paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed, and low-scoring segments or exceptional residues can be highlighted. Together, quality analysis and realignment of selected residue ranges give the user a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.
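To make the idea of highlighting low-scoring segments concrete, here is a minimal Python sketch. It is not the quality measure CLUSTAL X implements, which the abstract does not specify. As a stated assumption, each column is scored by the fraction of sequences that share its most common residue, and runs of columns below a cut-off are reported. The function names `column_scores` and `low_scoring_segments`, the threshold, the minimum run length, and the toy alignment are all hypothetical.

```python
from collections import Counter


def column_scores(alignment):
    """Score each column as the fraction of sequences sharing its most common residue.

    Assumption: gaps ('-') count toward the column size but are never
    treated as the consensus residue. This is a simplified stand-in score.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # column made up entirely of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(alignment))
    return scores


def low_scoring_segments(scores, threshold=0.5, min_length=3):
    """Return half-open (start, end) column ranges whose scores stay below threshold."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i  # a low-scoring run begins here
        else:
            if start is not None and i - start >= min_length:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores)))
    return segments


# Toy alignment (invented for illustration): conserved ends, divergent middle.
alignment = [
    "MKTAYIAKQR",
    "MKTAWLGKQR",
    "MKT-FVSKQR",
    "MKTACMPKQR",
]
scores = column_scores(alignment)
segments = low_scoring_segments(scores)
print(segments)
```

A segment reported this way would be a natural candidate for the sub-range realignment described above: the flagged columns are realigned on their own and the result is placed back into the full alignment.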