SeaView Version 4: A Multiplatform Graphical User Interface for Sequence Alignment and Phylogenetic Tree Building

Journal Article

Manolo Gouy (1), Stéphane Guindon (2, 3), Olivier Gascuel (2)

1 Laboratoire de Biométrie et Biologie Evolutive, CNRS UMR 5558, Université Lyon 1, Université de Lyon, Villeurbanne, France

2 Méthodes et Algorithmes pour la Bioinformatique, LIRMM, CNRS UMR 5506, Université Montpellier II, Montpellier, France

3 Department of Statistics, University of Auckland, Auckland, New Zealand

Published:

23 October 2009

Manolo Gouy, Stéphane Guindon, Olivier Gascuel, SeaView Version 4: A Multiplatform Graphical User Interface for Sequence Alignment and Phylogenetic Tree Building, Molecular Biology and Evolution, Volume 27, Issue 2, February 2010, Pages 221–224, https://doi.org/10.1093/molbev/msp259
Abstract

We present SeaView version 4, a multiplatform program designed to facilitate multiple alignment and phylogenetic tree building from molecular sequence data through the use of a graphical user interface. SeaView version 4 combines all the functions of the widely used programs SeaView (in its previous versions) and Phylo_win, and expands them by adding network access to sequence databases, alignment with arbitrary algorithms, maximum-likelihood tree building with PhyML, and display, printing, and copy-to-clipboard of rooted or unrooted, binary or multifurcating phylogenetic trees. Given the wide range of tools and algorithms currently available for phylogenetic analyses, SeaView is especially useful for teaching and for occasional users of such software. SeaView is freely available at http://pbil.univ-lyon1.fr/software/seaview.

Multiple alignment and phylogenetic tree reconstruction from molecular sequence data are key tasks for many molecular evolution analyses. They involve the sequential use of several programs that each perform part of the complete procedure and often require a series of tedious and error-prone data reformatting steps to transfer sequences and trees between these programs. The computer programs SeaView and Phylo_win pioneered the use of graphical user interfaces for performing multiple sequence alignment and phylogenetic tree reconstruction (Galtier et al. 1996). These programs have been widely used but lacked access to recently developed methods for maximum-likelihood tree estimation. We present here SeaView version 4, a program that allows its users to perform the complete phylogenetic analysis of a set of homologous DNA or protein sequences, from network-based sequence extraction from public databases to tree building and display using up-to-date alignment and maximum-likelihood tree-building algorithms (fig. 1).

FIG. 1. The two major SeaView window types display sequence data and phylogenetic trees. Two of the displayed menus pilot multiple alignment algorithms and tree-building methods. Tree display tools allow printing, copy to clipboard, rerooting, zooming in and out, restricting display to a subtree, and the use of three alternative (squared, circular, and cladogram) tree-drawing formats.

SeaView can read and write the most widely used file formats defined for holding aligned or unaligned protein or nucleotide sequence data: Fasta (Pearson and Lipman 1988), interleaved Phylip (Felsenstein 1993), Clustal (Higgins et al. 1992), multiple sequence file of the Genetics Computer Group package, Nexus (Maddison et al. 1997), and Mase (Faulkner and Jurka 1988). The last two formats can hold much useful information besides sequences and names, namely trees, selections of species and sites, and sequence annotations. SeaView can also import sequence data from the major public sequence databases through network access (Gouy and Delmotte 2008) to databases that are updated daily (GenBank and EMBL) or weekly (UniProt/SwissProt). Imported sequences can be identified by name, accession number, or keyword and named either with their database identifier or with the species name of their organism of origin. SeaView can also directly import from nucleotide databases most feature table elements (e.g., coding sequence, rRNA, and noncoding RNA) and select those whose annotations contain a user-given character string (fig. 2).

FIG. 2. Database sequence import dialog. This example will import into SeaView the single coding sequence (CDS) from EMBL's entry AE000782 (Archaeoglobus fulgidus complete genome sequence) containing the string (gyrA) in its database annotation and will name it Archaeoglobus.

Nucleotide sequences can be translated to protein using any user- or database-assigned genetic code, so operations such as alignment and tree building can be performed at the nucleotide or the protein level. Unaligned protein-coding DNA sequences can be translated to protein, aligned, and displayed back as DNA sequences, a procedure that yields more realistic coding sequence alignments than would result from nucleotide-level alignment. Protein-coding DNA sequences can also be displayed by assigning the same color to all synonymous codons of the corresponding amino acid. Alignments can alternatively be displayed in reference mode, that is, where only residues that differ from the homologous one in a reference sequence are shown. Several sequence alignments can be handled simultaneously, and copy/paste and concatenation operations can be performed between them. As far as display is concerned, SeaView accepts large numbers of sequences (tens of thousands) and long sequences. SeaView is able to handle any number of sequence sets and site sets. Such sets can be named and saved in the Nexus or Mase file formats for subsequent use, by tree-building algorithms for instance.
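The translate, align, and display-back-as-DNA procedure described above can be illustrated with a minimal Python sketch. It is not SeaView's own code: the function name `thread_codons` and the toy sequences are hypothetical, and the protein alignment is assumed to have already been produced by an external aligner. The sketch covers only the last step, in which each aligned amino acid is replaced by its codon and each protein gap by a three-column gap.

```python
def thread_codons(aligned_protein: str, dna: str) -> str:
    """Map an unaligned coding DNA sequence onto its aligned translation.

    Each amino acid consumes one codon (3 nucleotides); each gap '-' in
    the protein alignment becomes a three-column gap '---' in the DNA.
    """
    residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(dna) < 3 * residues:
        raise ValueError("DNA is too short for the aligned protein")
    columns = []
    pos = 0  # index of the next unused nucleotide in the DNA sequence
    for aa in aligned_protein:
        if aa == "-":
            columns.append("---")
        else:
            columns.append(dna[pos:pos + 3])
            pos += 3
    return "".join(columns)


# Hypothetical toy data: a protein-level alignment and the coding DNA.
aligned = {"seq1": "MK-L", "seq2": "MKTL"}
coding = {"seq1": "ATGAAACTG", "seq2": "ATGAAGACCCTG"}

codon_alignment = {
    name: thread_codons(aligned[name], coding[name]) for name in aligned
}
for name, seq in codon_alignment.items():
    print(name, seq)
```

Because gaps are inserted only in multiples of three, the reading frame is preserved in the resulting nucleotide alignment, which is why this route tends to give more realistic coding sequence alignments.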

SeaView relies on external programs to perform multiple sequence alignments. Two programs are initially available: ClustalW version 2 (Larkin et al. 2007) and Muscle (Edgar 2004). These programs are run with their default parameter values, which have been chosen by their authors to perform well in most cases. When special parameter values are needed, they can be specified once using SeaView's user interface and reused for subsequent alignment operations. Alignment can be applied to all sequences, to selected sequences, or to parts of sequences. Profile alignment, which aims at adding more sequences to a preexisting alignment, can be done with both Muscle and ClustalW. SeaView is also able to drive any external sequence alignment program provided this program reads and outputs Fasta-formatted sequence data and can be run by a command line of the form “program_name arguments”. SeaView communicates with external alignment programs through a list of arguments that is initially defined by the user. This definition is made by entering once, in a dialog box, the list of arguments suitable for running this program, replacing the input file name by “%f.pir” and the output file name by “%f.out”. The external alignment algorithm becomes directly usable after that step. For example, SeaView's interface to T-Coffee (Notredame et al. 2000) corresponds to the following argument list

which contains the arguments expected by T_Coffee to align a Fasta-formatted file and to output its Fasta-formatted results without reordering sequences. Likewise, SeaView's interface to Probcons (Do et al. 2005) is straightforward:

SeaView is also a multiple sequence alignment editor that can be used to add or remove one or several gaps in one or several sequences simultaneously. A dot-plot analysis (Maizel and Lenk 1981) can be performed between any two sequences to visually check whether alignment algorithms missed regions with high sequence similarity.
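As an illustration of the dot-plot idea, here is a minimal Python sketch. It is not SeaView's implementation: the function names, the window length, and the match threshold are assumptions chosen for the example. A dot is placed at position (i, j) when the windows starting at i in the first sequence and at j in the second share at least a given number of identical residues, so that similar regions appear as diagonal runs.

```python
def dot_plot(seq_a: str, seq_b: str, window: int = 3, min_matches: int = 3):
    """Return the set of (i, j) window start positions that score a dot."""
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical residues between the two windows.
            matches = sum(
                seq_a[i + k] == seq_b[j + k] for k in range(window)
            )
            if matches >= min_matches:
                hits.add((i, j))
    return hits


def render(seq_a: str, seq_b: str, hits) -> str:
    """Draw the dot-plot as text: rows follow seq_a, columns follow seq_b."""
    rows = []
    for i in range(len(seq_a)):
        rows.append(
            "".join("*" if (i, j) in hits else "." for j in range(len(seq_b)))
        )
    return "\n".join(rows)


# Hypothetical toy sequences sharing the segment "TTAC".
a, b = "GATTACA", "CCTTACGG"
print(render(a, b, dot_plot(a, b, window=3, min_matches=3)))
```

Lowering `min_matches` below `window` tolerates mismatches inside a window, at the cost of more background dots.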

SeaView relies on PhyML version 3 (Guindon and Gascuel 2003) for maximum-likelihood phylogenetic tree reconstruction. Here again, PhyML is used as an independent program. Thus, future updates to PhyML will be accessible to SeaView users as soon as they have installed the revised program. Tree building can be applied to all or to selected sequences and to all or selected sequence sites. Most PhyML options can be set through the graphical interface, both for nucleotide and protein-level analyses (fig. 3). In particular, branch support can be estimated either with the approximate likelihood-ratio test (Anisimova and Gascuel 2006) or by bootstrap resampling (Felsenstein 1985).
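Bootstrap resampling, mentioned above as one way of estimating branch support, draws alignment columns with replacement to build pseudo-replicate data sets of the original length. The Python sketch below shows only that resampling step under simple assumptions: the function name and toy alignment are hypothetical, and it does not reproduce how PhyML or SeaView implement the procedure.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Return one bootstrap replicate of an alignment.

    Columns (sites) are sampled with replacement, and the same sampled
    columns are used for every sequence so that sites stay homologous.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    sampled = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in sampled) for n in names}


# Hypothetical toy alignment; a fixed seed keeps the run reproducible.
toy = {"A": "ACGTAC", "B": "ACGTAT", "C": "ACCTAT"}
replicate = bootstrap_alignment(toy, random.Random(42))
for name, seq in replicate.items():
    print(name, seq)
```

A tree is then built from each replicate, and the support of a branch is the fraction of replicate trees in which it appears.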

FIG. 3. SeaView dialog for setting PhyML options applied to nucleotide sequences.

SeaView includes two distance-based tree reconstruction methods: Neighbor-Joining (Saitou and Nei 1987; Studier and Keppler 1988) and BioNJ (Gascuel 1997). These can be applied to various nucleotide and protein sequence pairwise distances and combined with bootstrap resampling for branch-support estimation. Nucleotide-level distances are observed divergence, Jukes and Cantor, Kimura's two-parameter, Hasegawa–Kishino–Yano (see Rzhetsky and Nei 1995 for these first four distances), LogDet (Lake 1994), and Li's nonsynonymous (Ka) and synonymous (Ks) distances for protein-coding sequences (Li 1993). Protein-level distances are observed, Poisson, and Kimura's distances (Nei 1987). Gap-containing sites are by default excluded from pairwise distance computations. Alternatively, sites that are gap-free in two sequences can be used to compute the distance for this sequence pair. The branch lengths of any user-given tree topology can be computed by minimization of the sum of squared differences between evolutionary and patristic distances (Rzhetsky and Nei 1993).
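Three of the nucleotide-level distances listed above have simple closed forms, sketched here in Python. With p the observed proportion of differing sites, the Jukes and Cantor distance is d = -(3/4) ln(1 - 4p/3). With P and Q the proportions of transitions and transversions, Kimura's two-parameter distance is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function names are hypothetical, the code is not taken from SeaView, and it implements the pairwise gap-handling alternative: only sites that carry A, C, G, or T in both sequences are compared (skipping ambiguity codes is an assumption of this sketch).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def compared_sites(a: str, b: str):
    """Keep only site pairs where both sequences carry A, C, G, or T."""
    pairs = [
        (x, y) for x, y in zip(a.upper(), b.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return pairs


def observed_divergence(a: str, b: str) -> float:
    pairs = compared_sites(a, b)
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a: str, b: str) -> float:
    # Undefined (math domain error) when p >= 0.75, i.e. at saturation.
    p = observed_divergence(a, b)
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(a: str, b: str) -> float:
    pairs = compared_sites(a, b)
    diffs = [(x, y) for x, y in pairs if x != y]
    # A transition stays within purines or within pyrimidines.
    ts = sum(1 for x, y in diffs
             if {x, y} <= PURINES or {x, y} <= PYRIMIDINES)
    tv = len(diffs) - ts
    P, Q = ts / len(pairs), tv / len(pairs)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Hypothetical toy pair differing by one transition (C <-> T) in ten sites.
s1, s2 = "ACGTACGTAC", "ACGTACGTAT"
print(observed_divergence(s1, s2), jukes_cantor(s1, s2), kimura_2p(s1, s2))
```

Both corrected distances exceed the observed divergence because they account for multiple substitutions at the same site.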

SeaView can also reconstruct maximum-parsimony phylogenetic trees using code extracted from Dnapars and Protpars programs (Phylip version 3.52, Felsenstein 1993). Parsimony computation can be combined with bootstrap resampling of sites and can be repeated a user-chosen number of times after randomly changing the input order of sequences. The parsimony score of any user-given tree can also be computed. SeaView completes parsimony analyses by computing the strict consensus of all equally parsimonious trees found.
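Computing the parsimony score of a user-given tree can be illustrated with Fitch's small-parsimony algorithm. The Python sketch below is a generic textbook version, not the Dnapars or Protpars code that SeaView uses. It assumes a rooted binary tree written as nested tuples of leaf names, unordered and equally weighted character states, and treats a gap as just another state; all names are hypothetical.

```python
def fitch_site(tree, states):
    """Return (possible root states, minimum change count) for one site."""
    if isinstance(tree, str):          # a leaf: its observed state, no cost
        return {states[tree]}, 0
    left, right = tree                 # internal node with two children
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:                         # children agree: no extra change
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1   # disagreement: one change


def parsimony_score(tree, alignment: dict) -> int:
    """Sum the per-site Fitch costs over all alignment columns."""
    length = len(next(iter(alignment.values())))
    total = 0
    for site in range(length):
        states = {name: seq[site] for name, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total


# Hypothetical toy data: A and B share states, as do C and D.
toy = {"A": "AC", "B": "AC", "C": "GT", "D": "GT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, toy), parsimony_score(tree_2, toy))
```

The tree that groups A with B needs one change per site, whereas the alternative grouping needs two, so the first topology is the more parsimonious for these data.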

When tree building is completed, SeaView draws the resulting phylogenetic tree on the screen (fig. 1). Plotted trees can be displayed with or without branch lengths (as when computed by parsimony), with or without branch-support values (typically, bootstrap scores or approximate likelihood-ratio test probabilities), binary or multifurcating, rooted or unrooted. Phylogenetic trees are initially rooted at the point in the tree that minimizes the variance of root-to-tip distances, but they can also be plotted as unrooted trees using a circular display or as cladograms containing topological but no branch-length information. The user can change the tree root and exchange the order of the two child lineages of a node. Trees can be saved to Newick, PDF, or PostScript files, and, under the Microsoft Windows and Mac OS X environments, printed or copied to the clipboard for communication with graphical tools such as Office applications. Aligned sequences can be reordered following their corresponding tree, and sequences that belong to a subtree can be selected in the corresponding alignment. Several tools, such as subtree display, pattern matching in sequence names, and vertical zoom, help in dealing with large trees. A graphical tree editor allows topological changes by combining two basic operations: clade displacement and clade suppression.

SeaView version 4 is freely available at http://pbil.univ-lyon1.fr/software/seaview for four computer platforms (Microsoft Windows, Mac OS X, Linux, and SPARC/Solaris) and as source code.

Many software packages are available for multiple sequence alignment and phylogenetic tree reconstruction. SeaView is especially comparable with MEGA 4, which also provides an elaborate graphical user interface for multiple sequence alignment and distance or parsimony tree reconstruction and display (Tamura et al. 2007). SeaView is less versatile than MEGA for pairwise distance computations and lacks features such as neutrality or molecular clock tests but is unique in being available for all major computer platforms and in allowing maximum-likelihood tree reconstruction with PhyML. SeaView version 4 is especially valuable for teaching molecular phylogeny because of its availability at no fee for all users and because its user interface graphically expresses the conceptual steps involved in phylogenetic analyses. SeaView is also helpful for occasional users of phylogenetic tree reconstruction because it frees them from being confronted with many technical details concerning file formats and program options. SeaView thus pursues objectives similar to those of the phylogeny web server Phylogeny.fr (Dereeper et al. 2008) while exploiting the user's own computing resources. Because it performs, using PhyML, maximum-likelihood analyses at both nucleotide and protein levels, implements most current evolutionary models, and computes statistical branch support, SeaView is also expected to be useful to seasoned phylogeneticists.

We are grateful to the Fast Light Toolkit team for its wonderful cross-platform graphical user interface toolkit (http://www.fltk.org). We thank Nicolas Galtier for contributing code from the Phylo_win program.

References

Anisimova and Gascuel. 2006. Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst Biol. pg. 539–552.

Dereeper et al. (12 co-authors). 2008. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. pg. W465–W469.

Do et al. 2005. ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res. pg. 330–340.

Edgar. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. pg. 1792–1797.

Faulkner and Jurka. 1988. Multiple sequences alignment editor (MASE). Trends Biochem Sci. pg. 321–322.

Felsenstein. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. pg. 783–791.

Felsenstein. 1993. PHYLIP (phylogeny inference package) version 3.52. Distributed by the author. Seattle (WA): Department of Genome Sciences, University of Washington.

Galtier et al. 1996. SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput Appl Biosci. pg. 543–548.

Gascuel. 1997. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol Biol Evol. pg. 685–695.

Gouy and Delmotte. 2008. Remote access to ACNUC nucleotide and protein sequence databases at PBIL. Biochimie. pg. 555–562.

Guindon and Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. pg. 696–704.

Higgins et al. 1992. CLUSTAL V: improved software for multiple sequence alignment. Comput Appl Biosci. pg. 189–191.

Lake. 1994. Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances. Proc Natl Acad Sci USA. pg. 1455–1459.

Larkin et al. (13 co-authors). 2007. Clustal W and Clustal X version 2.0. Bioinformatics. pg. 2947–2948.

Li. 1993. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J Mol Evol.

Maddison et al. 1997. NEXUS: an extensible file format for systematic information. Syst Biol. pg. 590–621.

Maizel and Lenk. 1981. Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc Natl Acad Sci USA. pg. 7665–7669.

Nei. 1987. Molecular evolutionary genetics. New York: Columbia University Press.

Notredame et al. 2000. T-Coffee: a novel method for multiple sequence alignments. J Mol Biol. vol. 302, pg. 205–217.

Pearson and Lipman. 1988. Improved tools for biological sequence comparison. Proc Natl Acad Sci USA. pg. 2444–2448.

Rzhetsky and Nei. 1993. Theoretical foundation of the minimum-evolution method of phylogenetic inference. Mol Biol Evol. pg. 1073–1095.

Rzhetsky and Nei. 1995. Tests of applicability of several substitution models for DNA sequence data. Mol Biol Evol. pg. 131–151.

Saitou and Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. pg. 406–425.

Studier and Keppler. 1988. A note on the neighbor-joining algorithm of Saitou and Nei. Mol Biol Evol. pg. 729–731.

Tamura et al. 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) Software Version 4.0. Mol Biol Evol. pg. 1596–1599.

Author notes

Associate editor: Sudhir Kumar

© The Author 2009. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org
