The Universal Protein Resource (UniProt): an expanding universe of protein information
Nucleic Acids Res. 2006 Jan 1;34(Database issue):D187-91.
doi: 10.1093/nar/gkj161.
Cathy H Wu, Rolf Apweiler, Amos Bairoch, Darren A Natale, Winona C Barker, Brigitte Boeckmann, Serenella Ferro, Elisabeth Gasteiger, Hongzhan Huang, Rodrigo Lopez, Michele Magrane, Maria J Martin, Raja Mazumder, Claire O'Donovan, Nicole Redaschi, Baris Suzek
- PMID: 16381842
- PMCID: PMC1347523
- DOI: 10.1093/nar/gkj161
Cathy H Wu et al. Nucleic Acids Res. 2006.
Abstract
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.
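The idea behind UniRef, merging sequences that meet an identity threshold so that a search runs against fewer representatives, can be illustrated with a minimal sketch. This is not UniProt's production procedure. The `identity` function is a deliberately crude, ungapped stand-in for a real alignment-based identity measure, the greedy longest-first strategy is an assumption chosen for brevity, and the sequences and the names `identity`, `cluster` and `toy` are hypothetical.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Toy identity: fraction of matching positions in an ungapped,
    left-anchored comparison, relative to the longer sequence.
    A real pipeline would use a proper sequence alignment instead."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def cluster(seqs: Dict[str, str], threshold: float) -> Dict[str, List[str]]:
    """Greedy clustering: each sequence joins the first representative it
    matches at or above the threshold, otherwise it starts a new cluster."""
    clusters: Dict[str, List[str]] = {}
    # Longest sequences first, ties broken by ID for deterministic output.
    for sid in sorted(seqs, key=lambda s: (-len(seqs[s]), s)):
        for rep in clusters:
            if identity(seqs[sid], seqs[rep]) >= threshold:
                clusters[rep].append(sid)
                break
        else:
            clusters[sid] = [sid]
    return clusters


# Hypothetical toy sequences, not real UniProt entries.
toy = {
    "A": "MKTAYIAKQR",
    "B": "MKTAYIAKQR",  # identical to A
    "C": "MKTAYIAKQK",  # 9 of 10 positions match A
    "D": "MKTAYWWWWW",  # 5 of 10 positions match A
    "E": "GGGGGGGGGG",  # unrelated
}

for level in (1.0, 0.9, 0.5):
    result = cluster(toy, level)
    print(f"threshold {level}: {len(result)} clusters -> {result}")
```

On this toy input the number of clusters falls as the threshold is relaxed, which is the compression effect the abstract describes: fewer representatives to search at the 90% and 50% levels than at 100%.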
Figures
Figure 1. Overview of the major data sources of the UniProt databases.
Similar articles
- UniProtKB/Swiss-Prot.
Boutet E, Lieberherr D, Tognolli M, Schneider M, Bairoch A. Methods Mol Biol. 2007;406:89-112. doi: 10.1007/978-1-59745-535-0_4. PMID: 18287689.
- UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View.
Boutet E, Lieberherr D, Tognolli M, Schneider M, Bansal P, Bridge AJ, Poux S, Bougueleret L, Xenarios I. Methods Mol Biol. 2016;1374:23-54. doi: 10.1007/978-1-4939-3167-5_2. PMID: 26519399.
- The Universal Protein Resource (UniProt).
Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D154-9. doi: 10.1093/nar/gki070. PMID: 15608167. Free PMC article.
- Update on genome completion and annotations: Protein Information Resource.
Wu C, Nebert DW. Hum Genomics. 2004 Mar;1(3):229-33. doi: 10.1186/1479-7364-1-3-229. PMID: 15588483. Free PMC article. Review.
- From protein sequences to 3D-structures and beyond: the example of the UniProt knowledgebase.
Hinz U; UniProt Consortium. Cell Mol Life Sci. 2010 Apr;67(7):1049-64. doi: 10.1007/s00018-009-0229-6. Epub 2009 Dec 31. PMID: 20043185. Free PMC article. Review.
Cited by
- A Phylogenetic analysis of Heparanase (HPSE) gene.
Shaik AP, Alsaeed AH, Sultana A. Bioinformation. 2012;8(9):415-9. doi: 10.6026/97320630008415. Epub 2012 May 15. PMID: 22715311. Free PMC article.
- In vivo modification of tyrosine residues in recombinant mussel adhesive protein by tyrosinase co-expression in Escherichia coli.
Choi YS, Yang YJ, Yang B, Cha HJ. Microb Cell Fact. 2012 Oct 24;11:139. doi: 10.1186/1475-2859-11-139. PMID: 23095646. Free PMC article.
- Protein domain recurrence and order can enhance prediction of protein functions.
Messih MA, Chitale M, Bajic VB, Kihara D, Gao X. Bioinformatics. 2012 Sep 15;28(18):i444-i450. doi: 10.1093/bioinformatics/bts398. PMID: 22962465. Free PMC article.
- Utilization of heme as an iron source by marine Alphaproteobacteria in the Roseobacter clade.
Roe KL, Hogle SL, Barbeau KA. Appl Environ Microbiol. 2013 Sep;79(18):5753-62. doi: 10.1128/AEM.01562-13. Epub 2013 Jul 19. PMID: 23872569. Free PMC article.
- Immunogenicity of recombinant Mycobacterium bovis bacille Calmette-Guèrin clones expressing T and B cell epitopes of Mycobacterium tuberculosis antigens.
Mohamud R, Azlan M, Yero D, Alvarez N, Sarmiento ME, Acosta A, Norazmi MN. BMC Immunol. 2013;14 Suppl 1(Suppl 1):S5. doi: 10.1186/1471-2172-14-S1-S5. Epub 2013 Feb 25. PMID: 23458635. Free PMC article.
Grants and funding
- 1 U01 HG02712-01/HG/NHGRI NIH HHS/United States
- HHSN266200400061C/AI/NIAID NIH HHS/United States
- U01 HG002712/HG/NHGRI NIH HHS/United States
- HHSN266200400061C/HS/AHRQ HHS/United States
- 1R01HGO2273-01/HG/NHGRI NIH HHS/United States