InterPro, progress and status in 2005

Nucleic Acids Res. 2005 Jan 1;33(Database issue):D201-5.

doi: 10.1093/nar/gki106.

Nicola J Mulder, Rolf Apweiler, Teresa K Attwood, Amos Bairoch, Alex Bateman, David Binns, Paul Bradley, Peer Bork, Phillip Bucher, Lorenzo Cerutti, Richard Copley, Emmanuel Courcelle, Ujjwal Das, Richard Durbin, Wolfgang Fleischmann, Julian Gough, Daniel Haft, Nicola Harte, Nicolas Hulo, Daniel Kahn, Alexander Kanapin, Maria Krestyaninova, David Lonsdale, Rodrigo Lopez, Ivica Letunic, Martin Madera, John Maslen, Jennifer McDowall, Alex Mitchell, Anastasia N Nikolskaya, Sandra Orchard, Marco Pagni, Chris P Ponting, Emmanuel Quevillon, Jeremy Selengut, Christian J A Sigrist, Ville Silventoinen, David J Studholme, Robert Vaughan, Cathy H Wu

Nicola J Mulder et al. Nucleic Acids Res. 2005.

Abstract

InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
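As a quick consistency check on the release 8.0 figures quoted above, the per-type counts can be summed and compared with the stated total of 11 007 entries. The sketch below is illustrative only: the dictionary keys are labels chosen here for readability, not identifiers taken from InterPro.

```python
# Entry counts by type for InterPro release 8.0, as quoted in the abstract.
# The key names are illustrative labels, not InterPro identifiers.
counts = {
    "domain": 2573,
    "family": 8166,
    "repeat": 201,
    "active site": 26,
    "binding site": 21,
    "post-translational modification site": 20,
}

# The per-type counts should add up to the stated number of entries.
total = sum(counts.values())
print(f"total entries: {total}")

# Share of each entry type as a percentage of all entries.
shares = {name: 100 * n / total for name, n in counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The sum matches the stated total, and families account for roughly three quarters of all entries.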

Figures

Figure 1

Detailed view for protein Q06124, the human protein-tyrosine phosphatase, non-receptor type 11. From an InterPro entry page, clicking on a protein accession number in the ‘Examples’ field opens this view for that protein. The ovals at the top of the figure show the InterPro Domain Architecture (IDA) view, which represents the protein's domain composition. Each oval contains the domain name and, if the domain occurs more than once, the number of occurrences. The detailed view represents the protein sequence as a series of lines, one for each protein signature hit, with bars colour coded according to the member database. A separate view below the signature matches displays the structural domains from SCOP and CATH as white-striped bars. Together, these views give a complete picture of the protein's domain composition and show where sequence-based domains correspond to known structures.
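The idea behind the IDA ovals (a domain name plus a count when the domain occurs more than once) can be sketched as a simple run-length collapse over an ordered list of domain hits. This is a minimal illustration, not InterPro's implementation: the function names, the domain labels and the choice to merge only consecutive occurrences are all assumptions made here.

```python
from itertools import groupby


def domain_architecture(domains):
    """Collapse an ordered list of domain names into (name, count) pairs.

    Assumption: only consecutive occurrences of the same domain are merged.
    """
    return [(name, len(list(run))) for name, run in groupby(domains)]


def render(architecture):
    """Format each domain as 'name' or 'name xN' when it occurs N > 1 times."""
    parts = [
        f"{name} x{count}" if count > 1 else name
        for name, count in architecture
    ]
    return " - ".join(parts)


# Hypothetical domain labels for a protein with two tandem SH2 domains
# followed by a tyrosine phosphatase domain.
example = ["SH2", "SH2", "Tyr_phosphatase"]
print(render(domain_architecture(example)))
```

Merging only adjacent runs preserves domain order along the sequence, which a plain per-domain tally would lose.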
