PredictProtein—an open resource for online prediction of protein structural and functional features

Journal Article

1Department of Informatics, Bioinformatics & Computational Biology i12, TUM (Technische Universität München), Garching/Munich 85748, Germany

2Biosof LLC, New York, NY 10001, USA

3TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), TUM (Technische Universität München), Garching/Munich 85748, Germany

4New York Consortium on Membrane Protein Structure (NYCOMPS), Columbia University, New York, NY 10032, USA

5Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising 85354, Germany

*To whom correspondence should be addressed. Tel: +49 (89) 289-17811; Fax: +49 (89) 289-19414; Email: gyachdav@rostlab.org

Received: 21 February 2014

Revision received: 04 April 2014

Citation: Guy Yachdav, Edda Kloppmann, Laszlo Kajan, Maximilian Hecht, Tatyana Goldberg, Tobias Hamp, Peter Hönigschmid, Andrea Schafferhans, Manfred Roos, Michael Bernhofer, Lothar Richter, Haim Ashkenazy, Marco Punta, Avner Schlessinger, Yana Bromberg, Reinhard Schneider, Gerrit Vriend, Chris Sander, Nir Ben-Tal, Burkhard Rost, PredictProtein—an open resource for online prediction of protein structural and functional features, Nucleic Acids Research, Volume 42, Issue W1, 1 July 2014, Pages W337–W343, https://doi.org/10.1093/nar/gku366

Abstract

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.

INTRODUCTION

Molecular biology is moving into high-throughput mode as the number of experiments needed to support a single hypothesis is rapidly growing. The line between experimental result and computational analysis is blurring; this also shifts what constitutes a reliable annotation. On top of that, the vast amount of life science data is growing faster than the available computing power. For example, less than 1% of the over 51 million sequences in UniProt (February 2014) (1) have some expert annotations in Swiss-Prot. This protein annotation gap widens every day (2). PredictProtein is one of the resources applicable to all proteins that contribute to closing this gap.

The PredictProtein (PP) server is an automatic service that searches up-to-date public sequence databases, creates alignments, and predicts aspects of protein structure and function. In 1992, PredictProtein went online as one of the first Internet servers in molecular biology at the EMBL (Heidelberg, Germany). From 1999 to 2009, the server operated from Columbia University (New York, NY) and in 2009 it moved to the TUM (Munich, Germany). PredictProtein was one of the first services to combine state-of-the-art protein sequence analysis with the prediction of structural and functional features in a single server. While many outstanding services (3) have expanded on some of those aspects, PredictProtein has remained one of the most comprehensive resources. The thousands of citations to PredictProtein and to our methods demonstrate the server's applicability and acceptance. Since 2009, for example, its website has been visited more than one million times by about 80 000 unique visitors per year from 139 countries. Furthermore, over 500 000 sequences have been submitted and processed by the service. About half of all submitted sequences were not in UniProt (1) at the time of submission. This suggests that the server's primary utility is in providing annotations for uncharacterized proteins. The following two central principles have guided the evolution of PredictProtein.

  1. Sustained quality with performance estimates. The performance of many tools is not sufficiently assessed and/or is not sustained over time. Two decades of Critical Assessment of protein Structure Prediction (CASP)-like experiments (4,5) have demonstrated this repeatedly. PredictProtein went online with a method for the prediction of protein secondary structure (PHD (6)) and 22 years later the performance estimates for that method continue to be valid: a unique achievement.
  2. Ease of use. From the beginning we have aspired to make the use of our tools intuitive for all users. Unfortunately, the growth in size and scope continues to challenge the realization of this guiding principle. In 1992, the service provided alignments and secondary structure prediction; in 2014, it includes over 30 complex tools. Creating a unified, natural interface for these tools is challenging. Furthermore, we need to invest more resources to sustain the increasing usage as the data flood surges on. For example, most of our CPU goes into running PSI-BLAST (7). Since 2009, databases grew 10-fold whereas the CPU speed has only tripled, i.e. we need at least three times the number of CPUs we currently have to achieve the same ease in handling each job.

METHODS

PredictProtein incorporates over 30 tools

Supplementary Table S1, Supporting Online Material provides a comprehensive list of all components.

Database searches: sequences similar to the query are identified by standard, pairwise BLAST (8) and iterated PSI-BLAST (7) searches (9,10) against a non-redundant combination of PDB (11), Swiss-Prot (12) and TrEMBL (1). In addition, functional motifs are taken from PROSITE (13) and domains from Pfam (14).

Prediction of structural features: predicted aspects of structure include PROFphd secondary structure and solvent accessibility (15,16), PROFtmb transmembrane strands (17), TMSEG transmembrane helices, COILS coiled-coil regions (18), DISULFIND disulfide bonds (19) and SEG low-complexity regions (20). Disordered regions are predicted by a set of tools: UCON (21), NORSnet (22), PROFbval (23,24) and Meta-Disorder (25).

Prediction of functional features: predicted aspects include ConSurf annotations and visualizations of functionally important sites (26,27), protein mutability landscape analysis showing the effect of point mutations on protein function predicted by SNAP2 (28), Gene Ontology (GO) terms from metastudent (29), LocTree3 predictions of subcellular localization (30), protein–protein interaction sites (ISIS2) and protein–DNA, protein–RNA binding sites (SomeNA).

Almost all prediction methods use evolutionary information obtained from PSI-BLAST searches; the more related protein sequences are found and the more divergent those are, the higher the gain in performance (10,15). However, none of the methods (with the exception of metastudent, see below) relies solely on profiles and the prediction without a profile is significantly better than random. For most prediction methods (e.g. LocTree3 and SNAP2) the prediction quality is estimated by a reliability score. In the following, we introduce some of the recent and upcoming additions since 2004 (31) in more detail.

New: TMSEG transmembrane helix predictions

TMSEG (Bernhofer, M. et al., in preparation) predicts alpha-helical transmembrane proteins, the position of transmembrane helices, and membrane topology. The method uses a novel segment-based neural network to refine the final prediction. TMSEG was developed and evaluated on 166 transmembrane proteins extracted from PDBTM (32) and OPM (33), and on 1441 proteins from the SignalP4.1 dataset (34). In our hands, TMSEG appears to complement and improve over the best existing methods (e.g. PolyPhobius (35) and Memsat3 (36)) predicting all membrane helices correctly for about 60% of all proteins. The method correctly identifies 98% of all transmembrane proteins with a false positive rate of less than 2%.

New: SNAP2 predicts the effect of mutations on function

SNAP2 predicts the effect of single amino acid substitutions on protein function (37). It improves over its predecessor SNAP (38) by using additional coarse-grained features that better classify samples with unclear evidence. With a two-state accuracy of 83% and an AUC of 0.91, SNAP2 performs on par with or better than other state-of-the-art methods on human variants while significantly outperforming these methods for other organisms. SNAP2 is the only available method predicting the effect of point mutations even without alignment information (if fewer than 10 related proteins are found, a specific method is applied with an expected accuracy of ∼70% instead of 83%). For each protein we also predict the entire protein mutability landscape (28,39), i.e. the functional effect of all possible point mutations. The results are displayed in a heatmap representation (40) of functional effects (Figure 1C).
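To give a sense of the size of such a mutability landscape: a protein of length L has 19 × L possible single amino acid substitutions, since each residue can be replaced by any of the other 19 standard amino acids. The following minimal Python sketch enumerates them. It illustrates only the enumeration, not the SNAP2 prediction itself, and the function names are hypothetical.

```python
# Standard 20 amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_point_mutations(sequence):
    """Yield every single amino acid substitution as (position, wild_type, mutant).

    Positions are 1-based. A non-standard residue (e.g. 'X') would yield
    20 substitutions instead of 19, because no standard residue matches it.
    """
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield position, wild_type, mutant


def landscape_shape(sequence):
    """Return (rows, columns) of a heatmap: one row per residue, one column per amino acid."""
    return len(sequence), len(AMINO_ACIDS)


if __name__ == "__main__":
    toy = "MKTAYIAK"  # hypothetical toy sequence, not a real protein
    mutations = list(enumerate_point_mutations(toy))
    print(len(mutations))        # number of substitutions for the toy sequence
    print(mutations[0])          # first substitution at position 1
    print(landscape_shape(toy))  # heatmap dimensions
```

For a protein of 183 residues, such as EMC4 in the use case, this enumeration yields 183 × 19 = 3477 substitutions, each of which corresponds to one cell of the heatmap.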

Figure 1.

Visual results from PredictProtein (PP). The PP Dashboard Viewer shows a schematic of all position-based predictions and sequence alignments. (A) Putative protein (UniProt AC E5A5U3). (B) ER membrane protein complex subunit 4 (EMC4, UniProt AC Q5J8M3). The protein sequence is represented by a scale on top of the predicted features. Features presented include protein–protein binding sites (ISIS2), disulfide bonds (DISULFIND), structural features such as secondary structure state and solvent accessibility (PROFphd), transmembrane helices (TMSEG) and disordered regions (MD). Proteins aligned by PSI-BLAST (7) are shown as thin lines colored by database origin (PDB (11), Swiss-Prot (12) and TrEMBL (1)). Clicking on each line links to the database entry of the hit. For all elements, tooltips disclose the annotated feature, its position in the sequence and its type (prediction versus database search). (C) A complete analysis of the functional effect of point mutations on EMC4 shown in a heatmap (SNAP2). (D) Predicted GO terms (metastudent) for EMC4 in tabular format. (E) The predicted cellular compartment, ER membrane, for EMC4 (LocTree3) is highlighted in green in a schematic of a eukaryotic cell.

New: LocTree3 subcellular localization for all domains of life

LocTree3 predicts subcellular localization for proteins in all domains of life (30). The method predicts the localization in 18 classes (8 classes for transmembrane and 10 classes for soluble proteins) for eukaryotes, in 6 for bacteria and in 3 for archaea. LocTree3 successfully combines de novo (41) and homology-based predictions (7), reaching an 18-state prediction accuracy over 80% for eukaryotes and a 6-state accuracy over 89% for bacteria. The high level of performance and the large number of predicted classes make LocTree3 the most comprehensive and most accurate tool for subcellular localization prediction.

New: metastudent infers GO terms by homology

The method metastudent (29) predicts GO (42) terms through homology inference. It first BLASTs queries against proteins with experimental GO annotations taken from Swiss-Prot (12), i.e. when no hit to any protein with experimentally annotated GO term is returned, no prediction is made. Then, three algorithms independently choose which GO terms to inherit. These differ in the amount and quality of alignment hits considered and how they assign a probability to each GO term. A meta-classifier combines the three through linear regression. metastudent achieves a maximum F1 score of 0.36 in the biological process ontology and of 0.48 in the molecular function ontology (29). Although this is slightly worse (within the error estimates (43)) than the best method for predicting GO terms (44), the advantage is that metastudent predictions can easily be traced back to the experimental annotations upon which they are based.
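The maximum F1 score quoted above is the best harmonic mean of precision and recall over all probability thresholds. The sketch below is a simplified, single-protein illustration under stated assumptions: published evaluations typically average precision and recall over many proteins, and the term labels, probabilities and function names here are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1(predicted, true_terms, thresholds):
    """Return the best F1 over the given probability thresholds.

    predicted  -- dict mapping a GO term label to its predicted probability
    true_terms -- set of experimentally annotated GO term labels
    thresholds -- iterable of probability cut-offs to try
    """
    best = 0.0
    for threshold in thresholds:
        selected = {term for term, prob in predicted.items() if prob >= threshold}
        if not selected or not true_terms:
            continue  # no prediction (or no annotation) at this threshold
        true_positives = len(selected & true_terms)
        precision = true_positives / len(selected)
        recall = true_positives / len(true_terms)
        best = max(best, f1_score(precision, recall))
    return best


if __name__ == "__main__":
    # Hypothetical term labels and probabilities, for illustration only.
    predicted = {"term_a": 0.9, "term_b": 0.6, "term_c": 0.2}
    annotated = {"term_a", "term_b"}
    print(max_f1(predicted, annotated, [0.1, 0.5, 0.8]))
```

In this toy case the threshold 0.5 selects exactly the two annotated terms, so precision and recall are both 1 at that cut-off; lower thresholds admit a false positive and higher ones miss an annotated term.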

Recent: Meta-Disorder prediction of protein disorder

Intrinsically disordered or unstructured regions in proteins do not fold into well-defined three-dimensional (3D) structures when in isolation, but may become structured upon binding to a substrate. Because of the heterogeneity of disordered regions, we have developed several methods predicting different types of disorder. UCON (21) combines protein-specific pairwise contacts predicted by PROFcon (45) with pairwise statistical potentials to predict long disordered regions that are rendered intrinsically unstructured by few internal connections. NORSnet (22) predicts disordered regions with NO Regular Secondary structure (NORS (46), i.e. long loops), separating very long disordered loops predicted by NORSp (47) from all other regions in the PDB (11). PROFbval (23,24), trained on B-values in X-ray structures, predicts flexible residues in short disordered regions. Meta-Disorder (25) is a neural-network-based meta-predictor that uses different sources of information, including the orthogonal disorder predictors mentioned above and others, e.g. IUPred (48) and DISOPRED (49). Meta-Disorder significantly outperforms its constituents (25,50). A comprehensive, independent study (50), on disordered regions from the PDB and DisProt (51), suggested Meta-Disorder to be one of the top two methods available.

Recent: protein–protein binding sites

Residues that can bind other proteins are now predicted by ISIS2 instead of ISIS (52). ISIS splits a query sequence into windows of nine consecutive residues, encoding each window as a vector of features (e.g. PSI-BLAST amino acid conservation frequencies or predicted secondary structure). A neural network, trained on existing protein–protein binding residue annotations, determines whether a query residue can bind other proteins. ISIS2 has been trained on a large dataset of PDB-annotated binding sites (53). A faster neural network implementation (53) and new methods for predicting residue features further improve the accuracy of ISIS2.
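The window-based encoding can be sketched as follows. The text states only that the sequence is split into windows of nine consecutive residues; centering each window on the residue to be classified and padding the termini with a placeholder character are assumptions made for this illustration, and the function name is hypothetical. Feature encoding and the neural network are not shown.

```python
def residue_windows(sequence, width=9, pad="-"):
    """Return one window of `width` residues per position, centered on that residue.

    Termini are padded with `pad` so that every residue gets a full-width window
    (an assumption; the original method may treat termini differently).
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so that each window has a central residue")
    half = width // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + width] for i in range(len(sequence))]


if __name__ == "__main__":
    toy = "ACDEFGHIKL"  # hypothetical toy sequence
    for residue, window in zip(toy, residue_windows(toy)):
        print(residue, window)
```

Each window would then be converted into a feature vector (for example, conservation frequencies and predicted secondary structure for the nine positions) before being passed to the classifier.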

Recent: protein–DNA, protein–RNA binding sites

Protein–polynucleotide binding underlies important processes such as replication and transcription. SomeNA (54) predicts protein–polynucleotide binding on three levels. First, it predicts which proteins bind nucleotides. Second, it predicts the type of binding (RNA or DNA or both). Third, it predicts the protein residues that bind DNA or RNA. The first step is performed best: 77% of the proteins are correctly predicted to bind DNA and RNA. The distinction between the type of nucleotide is slightly more difficult: 74% of the proteins predicted to bind DNA and 72% of the proteins predicted to bind RNA were correct. Slightly over 53% of the residues binding DNA and/or RNA were correctly predicted. These levels of performance are at least 3-fold higher than random.

Recent: ConSurf conservation of surfaces explains function

ConSurf (26,27) estimates the evolutionary rate in protein families. These rates are useful for protein structure and function prediction because they reflect constraints imposed on the general evolutionary drift (10,15,55). Queried with a protein sequence, ConSurf first finds related sequences in UniProt (1). Evolutionary rates of amino acids are estimated based on evolutionary relatedness between the protein and its homologues using either empirical Bayesian (56) or maximum likelihood (57) methods. The strength of these methods is that they rely on the phylogeny of the sequences and thus can accurately distinguish between conservation due to short evolutionary time and conservation resulting from importance for maintaining protein foldability and function. If a structure is available, ConSurf maps the patterns of conservation upon the 3D structure. These patterns reveal crucial details about protein function.

WEB SERVER—UPDATES AND SOFTWARE

Graphical front-end

The dashboard page of PredictProtein results uses the BioJS (58) FeatureViewer component to show protein features (Figure 1A and B). Along the protein sequence, features are indicated by color and single residue pins. Depending on the protein, the overview features may include predictions of secondary structure and solvent accessibility, transmembrane helices, disulfide bonds and disordered regions. Details are available by zooming-in on local regions. Other views present additional annotations and predictions, e.g. functional landscapes of the effect of point mutations (SNAP2, Figure 1C), predicted GO terms (metastudent, Figure 1D) or subcellular localization (LocTree3, Figure 1E). In the dashboard viewer, users can mouse over the different view landmarks to reveal more information on the annotations.

The website features a Help section that includes interactive and instructive presentations. Each result section also provides a Help tab with specific explanations. All result pages feature an interactive Export menu for the download of selected raw data, as well as of the compiled archive with all data generated by the server. Additionally, we provide machine-readable output in XML and JSON. Output formatted for web presentations is available (HTML link at top right corner of main result page). The HTML view—most familiar to long-time users—aggregates results from most of the integrated methods in one page. This page also contains information that has not been integrated into the graphical view—yet—including results generated by some component methods and prediction confidence values. While we are working on the integration of all results into the graphical view, we highly encourage users to inspect this ‘raw’ HTML view. Finally, output is also available in text format (TEXT link, top right corner of results).

PPcache: pre-calculated results versus interactive jobs

One of the most beneficial recent resources from PredictProtein is the PPcache—a database that currently holds pre-calculated results for 11.7 million unique proteins—including all proteins of model organisms. If pre-calculated results are available for a PredictProtein query in PPcache, these are immediately returned. For results older than three months, users are given the option to re-run the query, thereby updating the PPcache. If no result exists in the PPcache, the job is processed, and users are notified upon job completion. PPcache currently requires roughly 100TB of disk space. We plan to open this repository for public access through a specialized API.
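The cache behavior described above can be summarized in a small decision function. This is a sketch under stated assumptions, not the server's implementation: the function and constant names are hypothetical, "three months" is approximated as 90 days, and stale results are assumed to be returned together with the option to re-run.

```python
from datetime import datetime, timedelta

# "Three months" approximated as 90 days (assumption for this sketch).
MAX_AGE = timedelta(days=90)


def handle_query(sequence, cache, now):
    """Decide how a query is served.

    cache -- dict mapping a sequence to the datetime its result was computed
    now   -- current datetime
    Returns "return_cached", "return_cached_offer_rerun" or "run_job".
    """
    computed_at = cache.get(sequence)
    if computed_at is None:
        return "run_job"  # no pre-calculated result: process and notify the user
    if now - computed_at > MAX_AGE:
        return "return_cached_offer_rerun"  # stale result: user may refresh the cache
    return "return_cached"  # fresh result: return immediately


if __name__ == "__main__":
    now = datetime(2014, 4, 1)
    cache = {
        "MKTAYIAK": datetime(2014, 3, 1),   # hypothetical fresh entry
        "ACDEFGHIK": datetime(2013, 10, 1), # hypothetical stale entry
    }
    for query in ("MKTAYIAK", "ACDEFGHIK", "GGGGGGGG"):
        print(query, handle_query(query, cache, now))
```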

Downloadable software: packages and cloud-ready virtual machine

For full proteome analysis we make the full PredictProtein software suite available for download to be run either by installing the software packages on local machines or by deploying a virtual machine image in the cloud. Most methods from the PredictProtein pipeline are now available as open-source packages and are freely distributed through Debian (59) and Ubuntu. Following the Debian guidelines enforces best practices for software development and distribution and guarantees robustness, usability and maintainability of our software packages.

Users with access to cloud computing can download the PredictProtein Machine Image or PPMI (60), a disk image optimized for deployment in the cloud. The PPMI is bootable on server instances in cloud infrastructure services, or on locally installed virtualization software.

USE CASE

We demonstrate the usability and properties of PredictProtein through a simple example, the human endoplasmic reticulum (ER) membrane protein complex subunit 4 (EMC4, UniProt AC Q5J8M3; Figure 1B–E). EMC4 is a small alpha-helical transmembrane protein with 183 residues. It is relatively well annotated, localizes to the membrane of the ER and is implicated in apoptosis (61,62).

The dashboard view of PredictProtein reveals an N-terminal disordered region of ∼60 residues (Figure 1B) interrupted by a short beta-strand (residues 17–20). This mainly disordered region is followed by a region dominated by alpha-helices. In this region, two transmembrane helices are predicted. Note that mouse-over can reveal annotations. The lines below the predictions sketch proteins with similar sequence. EMC4 is highly conserved, and nearly identical proteins are found in several mammalian organisms. Interestingly, the heatmap of functional effects (SNAP2) shows that the beta-strand interrupting the N-terminal disordered region and the transmembrane helices are highly sensitive to point mutations (Figure 1C). LocTree3 and metastudent predictions, respectively, agree at high reliability with the experimental subcellular localization of EMC4 in the ER membrane and its function in apoptosis (61,62) (Figure 1D and E). Additionally, metastudent identifies ‘protein folding in endoplasmic reticulum’ as biological function (Figure 1D; directed graph of predicted GO terms in Supplementary Figure S1, Supporting Online Material). This has already been shown for the yeast EMC4 (63).

The EMC4 example shows how users could have suspected some of those findings that have been experimentally verified (transmembrane helices, apoptosis, ER localization). On the other hand, it also suggests additional insights that might trigger new experiments, e.g. the importance of the disordered N-terminus, and the importance of the beta-strand that breaks it. This may provide more detail on the suggested involvement in protein folding and in apoptosis (Figure 1D (62)).

CONCLUSION

Over its 22-year existence, the PredictProtein server has substantially expanded. What started as a service to annotate some aspects of protein structure (secondary structure, solvent accessibility and transmembrane helices) has evolved into a comprehensive suite of methods important for the prediction of protein structural and functional features. It provides single-point access to many original important results. Our focus on making reliable methods available and our technical focus on keeping our server useful to the community have been sustained through many challenges in an environment of low funding, growing use and an increasing data deluge. Yet we continue finding ways to present our results efficiently and without overloading users from a wide variety of backgrounds and needs. The results pages aspire to give visually intuitive, unified presentations for most of the structural and functional annotations. The PredictProtein web server can help when little is known about the protein in question. For medium-to-high throughput analyses, users will find the publicly available, downloadable software packages and the PPMI a suitable option. For approximately every second query, our PPcache repository provides results immediately.

We acknowledge all who have contributed ideas, methods and components, as well as those who tested and documented bugs and provided insight and advice. So many of you users out there: thanks! Please see the full list of contributors in Table S2, Supporting Online Material and on our website http://ppopen.rostlab.org/credits. Thanks also to the following ROSTLAB members for their help: Tim Karl for system maintenance, Milot Mirdita for helpful discussions, Marlena Drabik for handling administrative issues. Last but not least, thanks to all users who have been citing the usage of the service.

FUNDING

Alexander von Humboldt foundation through the German Ministry for Research and Education (BMBF: Bundesministerium fuer Bildung und Forschung).

Conflict of interest statement. None declared.

REFERENCES

  1. UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford), 2011, vol. 2011, bar009.
  2. New in protein structure and function annotation: hotspots, single nucleotide polymorphisms and the ‘Deep Web’. Curr. Opin. Drug Discov. Devel., 2009, vol. 12, pp. 408-419.
  3. A series of PDB related databases for everyday needs. Nucleic Acids Res., 2011, vol. 39, pp. D411-D419.
  4. Critical assessment of methods of protein structure prediction-Round VIII. Proteins, 2009, vol. 77, pp. 1-4.
  5. Progress of 1D protein structure prediction at last. Proteins: Struct. Funct. Genet., 1995, vol. 23, pp. 295-300.
  6. Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol., 1993, vol. 232, pp. 584-599.
  7. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 1997, vol. 25, pp. 3389-3402.
  8. Basic local alignment search tool. J. Mol. Biol., 1990, vol. 215, pp. 403-410.
  9. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 1999, vol. 292, pp. 195-202.
  10. Alignments grow, secondary structure prediction improves. Proteins, 2002, vol. 46, pp. 197-205.
  11. The Protein Data Bank. Nucleic Acids Res., 2000, vol. 28, pp. 235-242.
  12. Swiss-Prot: juggling between evolution and stability. Brief. Bioinform., 2004, vol. 5, pp. 39-55.
  13. New and continuing developments at PROSITE. Nucleic Acids Res., 2013, vol. 41, pp. D344-D347.
  14. et al. The Pfam protein families database. Nucleic Acids Res., 2012, vol. 40, pp. D290-D301.
  15. PHD: predicting one-dimensional protein structure by profile-based neural networks. Methods Enzymol., 1996, vol. 266, pp. 525-539.
  16. Review: protein secondary structure prediction continues to rise. J. Struct. Biol., 2001, vol. 134, pp. 204-218.
  17. PROFtmb: a web server for predicting bacterial transmembrane beta barrel proteins. Nucleic Acids Res., 2006, vol. 34, pp. W186-W188.
  18. Predicting coiled coils from protein sequences. Science, 1991, vol. 252, pp. 1162-1164.
  19. DISULFIND: a disulfide bonding state and cysteine connectivity prediction server. Nucleic Acids Res., 2006, vol. 34, pp. W177-W181.
  20. Analysis of compositionally biased regions in sequence databases. Methods Enzymol., 1996, vol. 266, pp. 554-571.
  21. Natively unstructured regions in proteins identified from contact predictions. Bioinformatics, 2007, vol. 23, pp. 2376-2384.
  22. Natively unstructured loops differ from other loops. PLoS Comput. Biol., 2007, vol. 3, e140.
  23. Protein flexibility and rigidity predicted from sequence. Proteins, 2005, vol. 61, pp. 115-126.
  24. PROFbval: predict flexible and rigid residues in proteins. Bioinformatics, 2006, vol. 22, pp. 891-893.
  25. Improved disorder prediction by combination of orthogonal approaches. PLoS One, 2009, vol. 4, e4433.
  26. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res., 2010, vol. 38, pp. W529-W533.
  27. ConSurf: using evolutionary data to raise testable hypotheses about protein function. Israel J. Chem., 2013, vol. 53, pp. 199-206.
  28. News from the protein mutability landscape. J. Mol. Biol., 2013, vol. 425, pp. 3937-3948.
  29. et al. Homology-based inference sets the bar high for protein function prediction. BMC Bioinformatics, 2013, vol. 14, Suppl. 3, S7, doi:10.1186/1471-2105-14-S3-S7.
  30. LocTree3 prediction of localization. Nucleic Acids Res., 2014, doi:10.1093/nar/gku396.
  31. The PredictProtein server. Nucleic Acids Res., 2004, vol. 32, pp. W321-W326.
  32. PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res., 2005, vol. 33, pp. D275-D278.
  33. OPM: orientations of proteins in membranes database. Bioinformatics, 2006, vol. 22, pp. 623-625.
  34. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods, 2011, vol. 8, pp. 785-786.
  35. An HMM posterior decoder for sequence feature prediction that includes homology information. Bioinformatics, 2005, vol. 21, Suppl. 1, pp. i251-i257.
  36. Improving the accuracy of transmembrane protein topology prediction using evolutionary information. Bioinformatics, 2007, vol. 23, pp. 538-544.
  37. Technische Universität Muenchen (TUM), Munich, Germany, 2011.
  38. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res., 2007, vol. 35, pp. 3823-3835.
  39. In silico mutagenesis: a case study of the melanocortin 4 receptor. FASEB J., 2009, vol. 23, pp. 3059-3069.
  40. HeatMapViewer: interactive display of 2D data in biology. F1000Research, 2014, vol. 3, doi:10.12688/f1000research.3-48.v1.
  41. LocTree2 predicts localization for all domains of life. Bioinformatics, 2012, vol. 28, pp. i458-i465.
  42. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 2000, vol. 25, pp. 25-29.
  43. et al. A large-scale evaluation of computational protein function prediction. Nat. Methods, 2013, vol. 10, pp. 221-227.
  44. FFPred 2.0: improved homology-independent prediction of gene ontology terms for eukaryotic protein sequences. PLoS One, 2013, vol. 8, e63754.
  45. PROFcon: novel prediction of long-range contacts. Bioinformatics, 2005, vol. 21, pp. 2960-2968.
  46. Loopy proteins appear conserved in evolution. J. Mol. Biol., 2002, vol. 322, pp. 53-64.
  47. NORSp: predictions of long regions without regular secondary structure. Nucleic Acids Res., 2003, vol. 31, pp. 3833-3835.
  48. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics, 2005, vol. 21, pp. 3433-3434.
  49. The DISOPRED server for the prediction of protein disorder. Bioinformatics, 2004, vol. 20, pp. 2138-2139.
  50. Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources. Bioinformatics, 2010, vol. 26, pp. i489-i496.
  51. et al. DisProt: the database of disordered proteins. Nucleic Acids Res., 2007, vol. 35, pp. D786-D793.
  52. ISIS: interaction sites identified from sequence. Bioinformatics, 2007, vol. 23, pp. e13-e16.
  53. Alternative protein-protein interfaces are frequent exceptions. PLoS Comput. Biol., 2012, vol. 8, e1002623.
  54. Diploma thesis, Technische Universität München, Munich, Germany, 2012.
  55. Automatic prediction of protein function. Cell. Mol. Life Sci., 2003, vol. 60, pp. 2637-2650.
  56. Comparison of site-specific rate-inference methods for protein sequences: empirical Bayesian methods are superior. Mol. Biol. Evol., 2004, vol. 21, pp. 1781-1791.
  57. Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics, 2002, vol. 18, Suppl. 1, pp. S71-S77.
  58. et al. BioJS: an open source JavaScript framework for biological data visualization. Bioinformatics, 2013, vol. 29, pp. 1103-1104.
  59. Community-driven computational biology with Debian Linux. BMC Bioinformatics, 2010, vol. 11, Suppl. 12, S5, doi:10.1186/1471-2105-11-S12-S5.
  60. et al. Cloud prediction of protein structure and function with PredictProtein for Debian. Biomed. Res. Int., 2013, vol. 2013, 398968, doi:10.1155/2013/398968.
  61. Defining human ERAD networks through an integrative mapping strategy

,

Nat. Cell Biol.

,

2012

, vol.

14

(pg.

93

-

105

)

Transmembrane protein 85 from both human (TMEM85) and yeast (YGL231c) inhibit hydrogen peroxide mediated cell death in yeast

,

FEBS Lett.

,

2008

, vol.

582

(pg.

2637

-

2642

)

et al.

Comprehensive characterization of genes required for protein folding in the endoplasmic reticulum

,

Science

,

2009

, vol.

323

(pg.

1693

-

1697

)

© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

