PhyloProfile: dynamic visualization and exploration of multi-layered phylogenetic profiles

TreeDomViewer: a tool for the visualization of phylogeny and protein domain structure

Nucleic acids research, 2006

Phylogenetic analysis and examination of protein domains allow accurate genome annotation and are invaluable to study proteins and protein complex evolution. However, two sequences can be homologous without sharing statistically significant amino acid or nucleotide identity, presenting a challenging bioinformatics problem. We present TreeDomViewer, a visualization tool available as a web-based interface that combines phylogenetic tree description, multiple sequence alignment and InterProScan data of sequences and generates a phylogenetic tree projecting the corresponding protein domain information onto the multiple sequence alignment. Thereby it makes use of existing domain prediction tools such as InterProScan. TreeDomViewer adopts an evolutionary perspective on how domain structure of two or more sequences can be aligned and compared, to subsequently infer the function of an unknown homolog. This provides insight into the function assignment of, in terms of amino acid substitution...

PhyloGene server for identification and visualization of co-evolving proteins using normalized phylogenetic profiles

Nucleic acids research, 2015

Proteins that function in the same pathways, protein complexes or the same environmental conditions can show similar patterns of sequence conservation across phylogenetic clades. In species that no longer require a specific protein complex or pathway, these proteins, as a group, tend to be lost or diverge. Analysis of the similarity in patterns of sequence conservation across a large set of eukaryotes can predict functional associations between different proteins, identify new pathway members and reveal the function of previously uncharacterized proteins. We used normalized phylogenetic profiling to predict protein function and identify new pathway members and disease genes. The phylogenetic profiles of tens of thousands of conserved proteins in the human, mouse, Caenorhabditis elegans and Drosophila genomes can be queried on our new web server, PhyloGene. PhyloGene provides an intuitive and user-friendly platform to query the patterns of conservation across 86 animal, fungal, plant and p...
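The profile-comparison idea in this abstract can be illustrated with a minimal sketch. It assumes binary presence/absence profiles and uses plain Pearson correlation as a stand-in similarity measure; it is not PhyloGene's normalization procedure, and the protein names and profile values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A protein present (or absent) everywhere carries no signal.
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical profiles: one entry per species, 1 = homolog found, 0 = not found.
profiles = {
    "protA": [1, 1, 0, 1, 0, 0, 1, 1],
    "protB": [1, 1, 0, 1, 0, 0, 1, 1],  # same pattern as protA
    "protC": [0, 0, 1, 0, 1, 1, 0, 0],  # opposite pattern
    "protD": [1, 0, 1, 1, 0, 1, 0, 1],  # largely unrelated pattern
}

def rank_partners(query, profiles):
    """Rank all other proteins by profile similarity to the query."""
    q = profiles[query]
    scores = [(name, pearson(q, p)) for name, p in profiles.items() if name != query]
    return sorted(scores, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for name, score in rank_partners("protA", profiles):
        print(f"{name}\t{score:+.3f}")
```

Proteins that rank highest for a query are candidates for functional association; real profiles would typically hold continuous conservation scores rather than 0/1 values.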

ggtreeExtra: Compact Visualization of Richly Annotated Phylogenetic Data

Molecular Biology and Evolution, 2021

We present the ggtreeExtra package for visualizing heterogeneous data with a phylogenetic tree in a circular or rectangular layout (https://www.bioconductor.org/packages/ggtreeExtra). The package supports more data types and visualization methods than other tools. It supports using the grammar of graphics syntax to present data on a tree with richly annotated layers and allows evolutionary statistics inferred by commonly used software to be integrated and visualized with external data. ggtreeExtra is a universal tool for tree data visualization. It extends the applications of the phylogenetic tree in different disciplines by making more domain-specific data available for visualization and interpretation in an evolutionary context.

PhyloDome--visualization of taxonomic distributions of domains occurring in eukaryote protein sequence sets

Nucleic Acids Research, 2005

The analysis of taxonomic distribution and lineage-specific variation of domains and domain combinations is an important step in the assessment of their functional roles and potential interoperability. In the study of eukaryote sequence sets with many multidomain proteins, it can become laborious to evaluate the phylogenetic context of the many occurring domains and their mutual relationships. PhyloDome is an answer to that problem. It provides a fast overview of the taxonomic spread and potential interrelation of domains that are either given as a list of names and PFAM/SMART accessions or derived from a user-defined set of sequences. This taxonomic distribution analysis can be helpful in protein function and interaction assignment, as the comparative study of potential Hedgehog pathway members in C. elegans shows. An implementation of PhyloDome is accessible for public use as a WWW-Service at

Visualization of multiple alignments, phylogenies and gene family evolution

Nature Methods, 2010

Tree and sequence alignment visualizations have a long history. Evolutionary tree diagrams can be found in even the earliest descriptions of evolution, and their visualization still plays a key role in modern phylogenetics. However, although trees visualize an organism's evolutionary history, it is the biological data used in their construction that contains the information that distinguishes each organism. Sequence alignments are the most common data used in phylogenetic analysis, and their visualization assists in understanding the molecular mechanisms that differentiate each species, down to the level of the individual nucleotide bases and amino acids.

OrthoQuery: A Tripal Database Module to Assess and Visualize Gene Family Evolution

2018

Background: The abundance of transcriptomic resources for non-model organisms has enabled researchers to study comparative genomics on a larger scale. Generation of orthologous gene families facilitates the detection of genome duplication events and allows researchers to refine phylogenetic relationships and examine gene family evolution. Comparisons across orthogroups support analyzing selection pressure and novel gene families. Applications developed to study gene homology among species do not allow users to query data directly from external databases hosting resources not associated with a genome reference. In addition, real time computation of orthogroups for user selected subsets paired with interactive visualizations is lacking. Results: OrthoQuery, a web-based Tripal module, provides a semi-automated analytical framework to enable comparisons among curated proteins and interactive visualizations in the context of the resulting species tree. OrthoFinder, optimized with Diamond, is ...

SUPERFAMILY—sophisticated comparative genomics, data mining, visualization and phylogeny

Nucleic Acids Research, 2008

SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes; and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.

ProViz-a web-based visualization tool to investigate the functional and evolutionary features of protein sequences

Nucleic acids research, 2016

Low-throughput experiments and high-throughput proteomic and genomic analyses have created enormous quantities of data that can be used to explore protein function and evolution. The ability to consolidate these data into an informative and intuitive format is vital to our capacity to comprehend these distinct but complementary sources of information. However, existing tools to visualize protein-related data are restricted by their presentation, sources of information, functionality or accessibility. We introduce ProViz, a powerful browser-based tool to aid biologists in building hypotheses and designing experiments by simplifying the analysis of functional and evolutionary features of proteins. Feature information is retrieved in an automated manner from resources describing protein modular architecture, post-translational modification, structure, sequence variation and experimental characterization of functional regions. These features are mapped to evolutionary information from p...

Functional genome annotation through phylogenomic mapping

Nature Biotechnology, 2005

Accurate determination of functional interactions among proteins at the genome level remains a challenge for genomic research. Here we introduce a genome-scale approach to functional protein annotation (phylogenomic mapping) that requires only sequence data, can be applied equally well to both finished and unfinished genomes, and can be extended beyond single genomes to annotate multiple genomes simultaneously. We have developed and applied it to more than 200 sequenced bacterial genomes. Proteins with similar evolutionary histories were grouped together, placed on a three-dimensional map and visualized as a topographical landscape. The resulting phylogenomic maps display thousands of proteins clustered in mountains on the basis of coinheritance, a strong indicator of shared function. In addition to systematic computational validation, we have experimentally confirmed the ability of phylogenomic maps to predict both mutant phenotype and gene function in the delta proteobacterium Myxococcus xanthus.
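The grouping step mentioned in this abstract (proteins with similar evolutionary histories are grouped together) can be sketched roughly as follows. This is not the authors' mapping method: it swaps in Jaccard similarity over genome-occurrence sets and connected-component grouping as a simple stand-in, and the gene names, genome identifiers and threshold are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two sets of genome identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_by_coinheritance(occurrence, threshold=0.75):
    """Group proteins whose genome-occurrence sets overlap strongly.

    Two proteins are linked when their Jaccard similarity is at least
    `threshold`; groups are the connected components of those links.
    """
    parent = {protein: protein for protein in occurrence}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q in combinations(occurrence, 2):
        if jaccard(occurrence[p], occurrence[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for protein in occurrence:
        groups.setdefault(find(protein), []).append(protein)
    return sorted(sorted(members) for members in groups.values())

# Hypothetical input: the set of genomes in which each gene has a homolog.
occurrence = {
    "geneA": {"g1", "g2", "g3", "g4"},
    "geneB": {"g1", "g2", "g3", "g4"},
    "geneC": {"g1", "g2", "g3"},
    "geneD": {"g5", "g6"},
    "geneE": {"g5", "g6", "g7"},
}

if __name__ == "__main__":
    for group in group_by_coinheritance(occurrence):
        print(group)
```

Lowering the threshold merges more loosely coinherited genes into the same group; the published approach instead lays proteins out on a map, which this sketch does not attempt.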

Interactive web-based visualization of phylogenetic trees using Phylogeny.IO

2016

Traditional static publication formats make visualization, exploration and sharing of massive phylogenetic trees difficult. Web-based technologies, such as the Data Driven Document (D3) JavaScript library, exist to overcome such challenges by allowing interactive display of complex data sets. Here we present an open-source web-based application that applies the power of D3 to the visualization of phylogenetic trees. Phylogeny.IO (http://phylogeny.io) displays trees together with a range of static (e.g., shapes and colors) and dynamic (e.g., pop-up text and images) annotations. Annotated trees can be shared as IFrame HTML objects easily embeddable in any web page.