CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues

Journal Article

Joe Dundas, Zheng Ouyang, Jeffery Tseng, Andrew Binkowski, Yaron Turpaz, Jie Liang*

Program in Bioinformatics, Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60612, USA

* To whom correspondence should be addressed. Tel: +1 312 355 1789; Fax: +1 312 413 2 18; Email: jliang@uic.edu

Received: 09 February 2006

Revision received: 04 March 2006

Joe Dundas, Zheng Ouyang, Jeffery Tseng, Andrew Binkowski, Yaron Turpaz, Jie Liang, CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues, Nucleic Acids Research, Volume 34, Issue suppl_2, 1 July 2006, Pages W116–W118, https://doi.org/10.1093/nar/gkl282

Abstract

Cavities on a protein's surface, together with the specific positioning of amino acids within them, create the physicochemical properties needed for a protein to perform its function. CASTp (http://cast.engr.uic.edu) is an online tool that locates and measures pockets and voids on 3D protein structures. This new version of CASTp includes annotated functional information for specific residues on the protein structure. The annotations are derived from the Protein Data Bank (PDB), Swiss-Prot and Online Mendelian Inheritance in Man (OMIM); the latter contains information on variant single nucleotide polymorphisms (SNPs) that are known to cause disease. These annotated residues are mapped to surface pockets, interior voids or other regions of the PDB structures. We use a semi-global pair-wise sequence alignment method to obtain sequence mapping between entries in Swiss-Prot, OMIM and entries in PDB. The updated CASTp web server can be used to study surface features, functional regions and specific roles of key residues of proteins.

INTRODUCTION

Characterizing protein functions is an increasingly important and challenging problem that has been approached at both the sequence and structure levels. The fact that only 4922 of the 35 000 Protein Data Bank (PDB) (1) structures contain any type of functional annotation illustrates the widening gap between our ability to resolve a protein's structure and our ability to locate functionally important residues and to obtain a comprehensive understanding of the structural basis of protein function. The 3D structure of a protein and its surface topography can provide important information for understanding protein function, provided that a broad knowledge base of the functionally important residues and their locations on the protein structures is available. This update of the CASTp web server incorporates functional information about a large set of annotated residues on PDB structures obtained from annotations in PDB, Swiss-Prot and Online Mendelian Inheritance in Man (OMIM).

This paper is organized as follows. We first discuss our method for mapping annotated residues from Swiss-Prot and OMIM onto the PDB structure. We then describe updates to the CASTp (2,3) web server for visualization of the annotated functional residues, with emphasis on mapping to surface pockets and interior voids. We conclude with a description of additional updates to the CASTp web server.

MATERIALS AND METHODS

Swiss-Prot mapping method

The numbered positions of annotated residues in the Swiss-Prot sequence often do not correspond to the same numbered positions in the sequence of the PDB structure. Therefore, a mapping of positions between the Swiss-Prot sequence and the PDB sequence must be obtained. We use a variation of the Needleman and Wunsch algorithm to determine whether the sequence of a PDB structure matches the Swiss-Prot sequence containing the annotated residues.

Specifically, every Swiss-Prot sequence containing one or more annotated residues and a link to a PDB structure was aligned to the corresponding sequence of the PDB structure. The standard Swiss-Prot annotations used include post-translational modifications (MOD_RES), covalent binding of a lipid moiety (LIPID), glycosylation sites (CARBOHYD), post-translationally formed amino acid bonds (CROSSLNK), metal binding sites (METAL), chemical group binding sites (BINDING), calcium binding regions (CA_BIND), DNA binding regions (DNA_BIND), nucleotide phosphate binding regions (NP_BIND), zinc finger regions (ZN_FING), enzyme activity amino acids (ACT_SITE) and any interesting single amino acid site (SITE). To ensure that the mapping is accurate, only alignments with a sequence identity >95% were used. The annotated positions from Swiss-Prot are then transferred onto the PDB sequence, as long as the position is not aligned to a gap.
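The mapping procedure described above can be illustrated with a minimal sketch. It is not the CASTp implementation: the scoring values, the function names (semiglobal_align, transfer_annotations) and the definition of sequence identity (matches divided by aligned, non-gap columns) are assumptions made for illustration. Positions are 1-based indices into each sequence, not PDB residue numbers.

```python
from typing import Dict, Iterable, Optional, Tuple

# Illustrative scoring values (assumptions, not taken from the paper).
MATCH, MISMATCH, GAP = 1, -1, -2


def semiglobal_align(a: str, b: str) -> Tuple[str, str]:
    """Needleman-Wunsch variant in which leading and trailing gaps are free."""
    n, m = len(a), len(b)
    # Row 0 and column 0 stay at 0, so leading overhangs cost nothing.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + GAP,
                              score[i][j - 1] + GAP)
    # Trailing overhangs are free: start the traceback from the best cell
    # found on the last row or the last column.
    ends = [(score[i][m], i, m) for i in range(n + 1)]
    ends += [(score[n][j], n, j) for j in range(m + 1)]
    _, end_i, end_j = max(ends)

    out_a, out_b = [], []
    i, j = n, m
    while i > end_i:                      # free trailing overhang of a
        i -= 1
        out_a.append(a[i])
        out_b.append("-")
    while j > end_j:                      # free trailing overhang of b
        j -= 1
        out_a.append("-")
        out_b.append(b[j])
    while i > 0 and j > 0:                # standard traceback
        sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
        if score[i][j] == score[i - 1][j - 1] + sub:
            i -= 1
            j -= 1
            out_a.append(a[i])
            out_b.append(b[j])
        elif score[i][j] == score[i - 1][j] + GAP:
            i -= 1
            out_a.append(a[i])
            out_b.append("-")
        else:
            j -= 1
            out_a.append("-")
            out_b.append(b[j])
    while i > 0:                          # free leading overhang of a
        i -= 1
        out_a.append(a[i])
        out_b.append("-")
    while j > 0:                          # free leading overhang of b
        j -= 1
        out_a.append("-")
        out_b.append(b[j])
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def transfer_annotations(sp_seq: str, pdb_seq: str,
                         sp_positions: Iterable[int],
                         min_identity: float = 0.95
                         ) -> Optional[Dict[int, int]]:
    """Map annotated Swiss-Prot positions to PDB sequence positions.

    Returns None when identity is not greater than min_identity.
    Positions aligned to a gap are left out of the mapping.
    """
    aln_sp, aln_pdb = semiglobal_align(sp_seq, pdb_seq)
    paired = [(x, y) for x, y in zip(aln_sp, aln_pdb)
              if x != "-" and y != "-"]
    if not paired:
        return None
    identity = sum(x == y for x, y in paired) / len(paired)
    if identity <= min_identity:
        return None
    wanted = set(sp_positions)
    mapping: Dict[int, int] = {}
    i = j = 0
    for x, y in zip(aln_sp, aln_pdb):
        i += x != "-"
        j += y != "-"
        if x != "-" and y != "-" and i in wanted:
            mapping[i] = j
    return mapping
```

For example, if the PDB sequence is a fragment that starts at Swiss-Prot position 4, an annotation at Swiss-Prot position 5 is transferred to PDB sequence position 2, while an annotation at position 2 falls in the unaligned overhang and is dropped.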

OMIM mapping method

Variant alleles that are known to be disease causing and are SNPs were selected from OMIM (4). OMIM entries that contain links to the Swiss-Prot database were mapped onto the Swiss-Prot (5) sequence by measuring the relative distances in residue position between the OMIM alleles and then identifying the corresponding pairs of SNPs in the Swiss-Prot entry. If the Swiss-Prot entry identified the corresponding PDB entry, the sequence was extracted and aligned to the PDB structure using a semi-global pair-wise sequence alignment method. We follow Stitziel et al. (6,7) for the mapping between OMIM and PDB entries.
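One possible reading of the relative-distance step is sketched below. It is a hypothetical illustration, not the procedure of Stitziel et al.: it assumes that OMIM and Swiss-Prot number the same protein with a constant offset, lets every pair of variants with identical substitutions and identical spacing vote for an offset, and keeps the offset with the most votes. The names Variant, infer_offset and map_omim_to_swissprot are invented for this sketch.

```python
from collections import Counter
from itertools import combinations, permutations
from typing import Dict, List, Optional, Tuple

# (position, wild-type residue, mutant residue)
Variant = Tuple[int, str, str]


def infer_offset(omim: List[Variant],
                 swissprot: List[Variant]) -> Optional[int]:
    """Vote for the numbering offset supported by matching variant pairs."""
    votes: Counter = Counter()
    for a, b in combinations(omim, 2):
        for x, y in permutations(swissprot, 2):
            same_change = a[1:] == x[1:] and b[1:] == y[1:]
            same_spacing = b[0] - a[0] == y[0] - x[0]
            if same_change and same_spacing:
                votes[x[0] - a[0]] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def map_omim_to_swissprot(omim: List[Variant],
                          swissprot: List[Variant]) -> Dict[int, int]:
    """Return {OMIM position: Swiss-Prot position} for matched variants."""
    offset = infer_offset(omim, swissprot)
    if offset is None:
        return {}
    known = set(swissprot)
    return {pos: pos + offset
            for pos, wt, mut in omim
            if (pos + offset, wt, mut) in known}
```

With made-up OMIM alleles at positions 20 (R to H) and 45 (G to D) and Swiss-Prot variants at positions 38 (R to H) and 63 (G to D), the spacing of 25 residues agrees in both lists, so the sketch infers an offset of 18 and maps 20 to 38 and 45 to 63.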

RESULTS

Mapping results

There are 113 928 annotated residues in 4922 structures labeled in PDB records. The transfer of 241 913 Swiss-Prot annotations added 226 177 unique annotations to 15 913 PDB structures. Of those structures, 13 094 did not previously have any annotation in the PDB records. Table 1 lists each type of Swiss-Prot annotation, the number of PDB structures in which the annotation is found and the total number of annotated residues. Of the 13 987 BINDING residues, we were able to map 11 407 (81%) to a pocket or a void on the protein structure. We were also able to map 14 829 (74%) of the ACT_SITE residues of enzymes to an existing protein pocket. Additional computation can further raise these percentages (data not shown).

From the original set of 5467 nsSNPs in 1061 alleles, the mapping of OMIM disease mutations added 2128 annotated residues on 310 PDB structures. Of those 2128 variants, only 254 are mapped onto an annotation from either PDB or Swiss-Prot. This is reasonable, as these mutations may in some cases cause disease by disrupting the protein's structural stability rather than by interrupting its functional interactions with other molecules. The database of all annotated residues from PDB, Swiss-Prot and OMIM can be downloaded from the CASTp web server.

Visualizing annotated residues in CASTp

In addition to file downloads, CASTp allows interactive visualization of biologically important annotated residues. The CASTp server can be queried with a four-letter PDB protein name or with a Swiss-Prot or GenBank identifier. A new database of CASTp calculations on single chains of multiple-chain complexes can also be queried by adding the chain identifier to the PDB protein name. Figure 1 shows the atoms of the charge relay system that resides in a functional pocket of serine protease/inhibitor (PDB 1a2c). The atoms of annotated residues that lie in the pocket are highlighted in red, in contrast to the green pocket atoms. A table of all the annotated residues is also displayed on the right-hand side of the browser window. This table reports the following information: the database from which the annotation was derived, the annotation key word from the database, the position of the annotation on the sequence of the PDB structure, the three-letter amino acid code of the annotated residue, the identifiers of the pocket or pockets in which the annotated residue is located and a brief description of the annotation. If the user chooses to have the results emailed, a text file containing all the information listed in this table will be sent.
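The six columns of the annotation table described above can be represented as a simple record. The sketch below is an illustration only: the class name, field names, tab-delimited layout and example values are assumptions and do not reproduce the actual CASTp file format.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class AnnotatedResidue:
    """One row of the annotation table (field names are hypothetical)."""
    source_db: str          # PDB, Swiss-Prot or OMIM
    keyword: str            # annotation key word, e.g. ACT_SITE
    pdb_position: int       # position on the sequence of the PDB structure
    residue: str            # three-letter amino acid code
    pocket_ids: List[int] = field(default_factory=list)
    description: str = ""


HEADER = ["Database", "Key", "Position", "Residue", "Pockets", "Description"]


def to_text_report(rows: Iterable[AnnotatedResidue]) -> str:
    """Render the rows as tab-delimited text, one annotation per line."""
    lines = ["\t".join(HEADER)]
    for r in rows:
        # A residue outside every pocket or void is shown with "-".
        pockets = ",".join(str(p) for p in r.pocket_ids) or "-"
        lines.append("\t".join([r.source_db, r.keyword, str(r.pdb_position),
                                r.residue, pockets, r.description]))
    return "\n".join(lines)


# Made-up example values, not taken from CASTp output.
example = [
    AnnotatedResidue("Swiss-Prot", "ACT_SITE", 57, "HIS", [12],
                     "Charge relay system"),
    AnnotatedResidue("OMIM", "VARIANT", 120, "ARG"),
]
report = to_text_report(example)
```

Keeping the pocket identifiers as a list reflects the statement that an annotated residue may be located in more than one pocket.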

Calculation requests

In addition to querying a database of single chain calculations, the ‘Calculation Request’ page allows the user to run a calculation on any combination of chains from a multiple chain complex. If the protein contains HET groups, the user is also given the option to include any combination of the HET groups in the calculation.

Improved visualization

For visualizing annotated residues, the JMOL plug-in (http://www.jmol.org) has been added as a visualization option. JMOL runs on Windows, Mac OS X and Linux and requires only a Java-enabled browser. The result is added functionality and a friendlier user interface.

The user is now also presented with a corresponding sequence map, in which residues of a highlighted pocket are shown in the same color as in the structural visualization. The user also has finer control over the display and can change the pocket colorings and show the PDB structure as wireframe, cartoon, strands or ribbons. The user can also send customized RasMol scripts to the Chime visualization.
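As an illustration of what a customized script might contain, the sketch below assembles a short RasMol-style script that switches the backbone display style and colors a set of residues. The helper name build_script and the residue numbers are hypothetical, and the commands accepted by a given RasMol or Chime version should be checked against its documentation.

```python
from typing import Iterable

# Display styles named in the text; each is also a RasMol command word.
STYLES = ("wireframe", "cartoon", "strands", "ribbons")


def build_script(residues: Iterable[int], style: str = "cartoon",
                 color: str = "red") -> str:
    """Return a RasMol-style script as a newline-separated string."""
    if style not in STYLES:
        raise ValueError(f"unknown display style: {style}")
    lines = ["select all"]
    # Turn every style off, then switch on only the requested one.
    lines += [f"{s} off" for s in STYLES]
    lines.append(f"{style} on")
    # Residue numbers joined with commas select the union of residues.
    selection = ",".join(str(r) for r in residues)
    lines.append(f"select {selection}")
    lines.append(f"color {color}")
    return "\n".join(lines)


# Placeholder residue numbers chosen only for illustration.
script = build_script([10, 20, 30], style="ribbons", color="red")
```

Generating the script as text keeps the sketch independent of any particular viewer; the resulting string would be pasted into whatever script input the visualization page provides.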

DISCUSSION

This paper describes major updates to the CASTp web server. Biologically important functional residues annotated from three sources are now mapped to PDB structures, and visualization is provided. We believe these updates significantly increase the information content of CASTp and enhance the knowledge base needed for studying the structural basis of protein functions.

AVAILABILITY

The CASTp web server and the associated mapping database can be freely accessed on the World Wide Web at http://cast.engr.uic.edu.

Table 1

Statistics of the Swiss-Prot annotated residues

Swiss-Prot key    #PDB    #Residues
ACT_SITE          6871    20 121
METAL             5014    37 824
BINDING           3199    13 987
CARBOHYD          2620    10 266
MOD_RES           2606    6556
SITE              1993    8003
NP_BIND           1748    58 777
DNA_BIND          464     33 978
CA_BIND           358     16 413
ZN_FING           295     19 273
CROSSLNK          230     467
LIPID             187     312

Column 1 reports the Swiss-Prot site key, column 2 lists the number of PDB structures the site was mapped to and column 3 lists the number of unique residues that were mapped to PDB structures.

Figure 1

Chime visualization of serine protease/inhibitor (PDB 1a2c) showing atoms from residues in the functional pocket important for the charge relay system in red.

Present addresses: Andrew Binkowski, Argonne National Laboratories, Argonne, IL 60439 USA

Yaron Turpaz, Affymetrix, Inc., Santa Clara, CA 95051, USA

Funding to pay the Open Access publication charges for this article was provided by grants from the National Science Foundation (CAREER DBI0133856), the National Institutes of Health (GM68958) and the Office of Naval Research (N00014-06-1-0100).

Conflict of interest statement. None declared.

REFERENCES

1. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.

2. Binkowski, T.A., Naghibzadeh, S., Liang, J. (2003) CASTp: computed atlas of surface topography of proteins. Nucleic Acids Res., 31, 3352–3355.

3. Liang, J., Edelsbrunner, H., Woodward, C. (1998) Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand design. Protein Sci., 7, 1884–1897.

4. McKusick, V.A. (1998) Mendelian Inheritance in Man. A Catalog of Human Genes and Genetic Disorders, 12th edn. Baltimore: Johns Hopkins University Press.

5. Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R.D., Bairoch, A. (2003) ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res., 31, 3784–3788.

6. Stitziel, N., Tseng, Y.Y., Pervouchine, D., Goddeau, D., Kasif, S., Liang, J. (2003) Structural location of disease-associated single-nucleotide polymorphisms. J. Mol. Biol., 327, 1021–1030.

7. Stitziel, N., Binkowski, T.A., Tseng, Y.Y., Kasif, S., Liang, J. (2004) topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association. Nucleic Acids Res., 32, D520–D522.

© The Author 2006. Published by Oxford University Press. All rights reserved. The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org
