NOPdb: Nucleolar Proteome Database—2008 update
Journal Article
1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK
Received: 12 September 2008
Revision received: 10 October 2008
Accepted: 10 October 2008
Published: 04 November 2008
Yasmeen Ahmad, François-Michel Boisvert, Peter Gregor, Andy Cobley, Angus I. Lamond, NOPdb: Nucleolar Proteome Database—2008 update, Nucleic Acids Research, Volume 37, Issue suppl_1, 1 January 2009, Pages D181–D184, https://doi.org/10.1093/nar/gkn804
Abstract
An experimental data handling system (NOPdb3.0: http://www.lamondlab.com/NOPdb3.0/) has been created as an update to the previous Nucleolar Proteome Database. This updated system is able to manage large data sets generated by multiple mass spectrometry analyses and has been used to analyse highly purified preparations of human nucleoli from different cell lines. The newly created application includes a dynamic relational database, which is kept up to date by laboratory staff. The data are further annotated with information from specific external sources on the web, including the IPI and Gene Ontology databases. In addition, an Application Programming Interface provides external users with a portal to link into the nucleolar proteome database and hence gain access to continually updated results. From the initial ∼700 human proteins identified in the previous iteration of the NOPdb, we have now identified over 50 000 peptides contained in over 4500 human proteins from purified nucleoli, providing enhanced coverage of the nucleolar proteome.
INTRODUCTION
The nucleolus is a highly conserved nuclear organelle whose main function is to coordinate the synthesis and assembly of ribosome subunits (1). Previously we described a Nucleolar Proteome Database (NOPdb2.0: http://www.lamondlab.com/NOPdb) that archived data on >700 proteins that were identified by multiple mass spectrometry (MS) analyses from highly purified preparations of human nucleoli (2). Each protein entry was annotated with information about its corresponding gene, its domain structures and relevant protein homologues across species, and documented its MS identification history, including all the peptides sequenced by tandem MS/MS. Moreover, the quantitative changes in the relative levels of approximately 500 nucleolar proteins were compared at different time points upon transcriptional inhibition (3).
The data presented by the previous NOPdb, version 2.0, were held in a flat file database. The peptide data for a single protein were merged within this database rather than stored separately, and because of this aggregation, results from individual experiments could not be extracted. The client interface to this database consisted of Perl CGI scripts. These scripts extracted the relevant data from the flat file database to create static html pages, one page on the server for each protein. The html pages were then made available to the global community via the internet. Each time data were updated in the flat files, the Perl scripts had to be run again to regenerate the static html pages, a process that was highly inefficient and time consuming. A more efficient approach is to produce dynamic html pages upon user request. Furthermore, the capabilities of the version 2.0 NOPdb database were limited with respect to security, ease of use, accessibility, maintainability and expandability. For example, a number of security concerns arose regarding the Perl scripts which, given their limited documentation, proved very difficult to resolve.
The new version of the NOPdb, NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/), consists of a unique, secure, extendable content management system, holding advanced nucleolar proteomics data. The created application includes a dynamic relational database, which is kept up to date by members of the Lamond group. It also allows external users to query the protein data hosted within the database, either using the custom built interface provided by the Lamond group, or by building custom web tools that access data via the Application Programming Interface (API). In addition to the dynamic interfaces provided by the new content management system, the data included in the nucleolar proteome are also dynamically updated with proteins identified by members of the laboratory from several different cell lines, using various instruments. From the initial ∼700 proteins identified in the previous iteration of the NOPdb, we have now identified over 50 000 peptides contained in over 4500 human proteins from purified nucleoli, providing significantly enhanced coverage of the nucleolar proteome.
DATABASE ACCESS
We have established the new version of the Nucleolar Proteome Database (NOPdb3.0), which archives all the human nucleolar proteins identified to date by the Lamond group and their collaborators using MS analyses (1–3). This current version 3.0 of the database is available at http://www.lamondlab.com/NOPdb3.0/ and is searchable either by protein name, protein sequence, motif (4–6), Gene Ontology (GO) (7) terms or by setting the range of the predicted isoelectric point and/or molecular weight (Figure 1). To date, NOPdb3.0 archives over 4500 human nucleolar proteins verified by multiple MS analyses in different cell lines. The NOPdb3.0 provides information on multiple parameters, including protein name, accession number, gene symbol, gene name, sequence, molecular weight, isoelectric point (pI), peptides identified, experiments in which the protein was identified, motifs and GO annotation. The previous version of the database (2) will still be available through our website at http://www.lamondlab.com/NOPdb/.
Figure 1.
Snapshots of the NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/). For illustration, the database was searched to identify a Protein Phosphatase 1 (PP1) isoform and here we show an overview page for this protein documenting its sequence, peptides identified, etc.
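The range-based search described above (predicted isoelectric point and/or molecular weight) can be illustrated with a minimal Python sketch. This is not the NOPdb3.0 implementation: the `Protein` record, the `search` function and the example values are all hypothetical, and either range may be omitted to mirror the 'and/or' behaviour.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Protein:
    """Hypothetical record holding only the two searchable numeric fields."""
    name: str
    pi: float       # predicted isoelectric point
    mw_kda: float   # predicted molecular weight in kDa


Range = Optional[Tuple[float, float]]


def in_range(value: float, bounds: Range) -> bool:
    """Return True if no bounds were given or value lies within them (inclusive)."""
    if bounds is None:
        return True
    low, high = bounds
    return low <= value <= high


def search(proteins: Iterable[Protein],
           pi_range: Range = None,
           mw_range: Range = None) -> List[Protein]:
    """Keep proteins that satisfy every range that was supplied."""
    return [p for p in proteins
            if in_range(p.pi, pi_range) and in_range(p.mw_kda, mw_range)]


# Made-up example records, not taken from the database.
EXAMPLE = [
    Protein("example_A", pi=5.9, mw_kda=37.0),
    Protein("example_B", pi=9.8, mw_kda=12.5),
    Protein("example_C", pi=6.4, mw_kda=110.0),
]

if __name__ == "__main__":
    for hit in search(EXAMPLE, pi_range=(5.0, 7.0), mw_range=(30.0, 50.0)):
        print(hit.name)
```

Supplying only `pi_range` restricts the results by isoelectric point alone; supplying both ranges narrows them further.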
DATABASE IMPLEMENTATION
The new NOPdb3.0 application consists of a multi-tier architecture, where the data storage, business logic and client interface are separate components. The data storage is implemented via a relational MySQL database. The database is structured (Figure 2) to allow easy extensibility and maintenance in the future. In order to extract useful data, the business logic employs complex SQL queries. The purpose of the business logic layer is to act as an interface between the client-side application and the database. The business logic, implemented as PHP classes, and the client interface, built in Adobe Flex, can both reside on any Apache web server capable of serving them. Adobe Flex was chosen as it allows Rich Internet Applications (RIAs) to be prototyped and developed rapidly, with the end product running across a wide range of client browsers.
Figure 2.
Entity Relationship (ER) diagram depicting the relationships between the tables present in the MySQL database implemented for the NOPdb3.0 Content Management System.
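To make the idea of a relational store queried by a business logic layer more concrete, the following Python sketch uses the standard-library `sqlite3` module as a stand-in for MySQL. The table and column names are assumptions made for illustration only and do not reproduce the schema shown in Figure 2. The sketch stores one row per identified peptide and lets an SQL query aggregate those rows into per-protein summaries.

```python
import sqlite3

# Hypothetical three-table schema; the real NOPdb3.0 tables are those of Figure 2.
SCHEMA = """
CREATE TABLE protein (
    protein_id INTEGER PRIMARY KEY,
    accession  TEXT NOT NULL UNIQUE
);
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    cell_line     TEXT NOT NULL
);
CREATE TABLE peptide (
    peptide_id    INTEGER PRIMARY KEY,
    protein_id    INTEGER NOT NULL REFERENCES protein(protein_id),
    experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
    sequence      TEXT NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO protein VALUES (?, ?)",
                     [(1, "ACC_A"), (2, "ACC_B")])
    conn.executemany("INSERT INTO experiment VALUES (?, ?)",
                     [(1, "cell_line_1"), (2, "cell_line_2")])
    conn.executemany(
        "INSERT INTO peptide (protein_id, experiment_id, sequence) VALUES (?, ?, ?)",
        [(1, 1, "AAK"), (1, 1, "LLR"), (1, 2, "AAK"), (2, 2, "GGK")],
    )
    conn.commit()
    return conn


def protein_summary(conn: sqlite3.Connection):
    """Aggregate peptide-level rows into one summary row per protein."""
    query = """
        SELECT p.accession,
               COUNT(DISTINCT pep.sequence)      AS unique_peptides,
               COUNT(DISTINCT pep.experiment_id) AS experiments
        FROM protein AS p
        JOIN peptide AS pep ON pep.protein_id = p.protein_id
        GROUP BY p.protein_id
        ORDER BY p.accession
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in protein_summary(build_demo_db()):
        print(row)
```

Because each peptide row keeps its experiment identifier, results from an individual experiment remain recoverable, which a merged per-protein record would not allow.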
Version 3.0 of the NOPdb is an entirely new implementation using a fully relational design with major improvements over previous versions and additional functionality. The newly created database holds data of higher granularity, storing data at the peptide level as opposed to collated data on proteins. This higher granularity also means that results from new experiments can be directly uploaded to the database without prior processing, as the direct output from MS-based proteomics analyses is peptide data. The application interprets the peptide data and aggregates it to provide metadata for proteins in a usable, graphical interface. The structure of the application has been designed using the model view controller design pattern (8), so that the functionality is separated from the overall look and feel of the application to ensure a more customisable solution. All communication between the database and application passes through the custom-made API (9). Furthermore, in this new version 3.0 application, the graphical user interface to the database creates data pages ‘on the fly’ using the custom API rather than serving static data pages, as in previous versions. This API not only acts as a security blanket around the database, it also allows users to create their own websites and/or applications that present the data stored in the proteomics database. External users can make use of the API through the REST (Representational State Transfer) (10) approach. Hence, external programmers can retrieve content from the database in XML (Extensible Markup Language) format by accessing well-documented Uniform Resource Locators (URLs).
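As a rough illustration of how an external programmer might consume such a REST interface, the Python sketch below builds a resource URL and parses an XML document with the standard library. The base URL, the path layout and the XML element names are all invented for this example; the actual NOPdb3.0 URLs and response format are those given in the API documentation. No network request is made, and a hard-coded sample stands in for a server response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urljoin

# Hypothetical base URL and path layout, not the real NOPdb3.0 endpoints.
BASE_URL = "http://example.org/api/"


def protein_url(accession: str) -> str:
    """Build the (assumed) resource URL for one protein."""
    return urljoin(BASE_URL, "protein/" + quote(accession, safe=""))


# Invented response body standing in for what a server might return.
SAMPLE_XML = """<protein accession="ACC_A">
  <name>example protein</name>
  <peptides>
    <peptide experiment="1">AAK</peptide>
    <peptide experiment="2">LLR</peptide>
  </peptides>
</protein>"""


def parse_protein(xml_text: str) -> dict:
    """Convert the assumed XML layout into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "accession": root.get("accession"),
        "name": root.findtext("name"),
        "peptides": [(pep.get("experiment"), pep.text)
                     for pep in root.iter("peptide")],
    }


if __name__ == "__main__":
    print(protein_url("ACC_A"))
    print(parse_protein(SAMPLE_XML))
```

In a real client, the text passed to `parse_protein` would come from an HTTP GET on the documented URL rather than from a constant.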
The application also facilitates mining of stored data, with data being stored in a relational structure that is well documented. Thus tools can be built to search, analyse, read and understand the data. This mining capability is evident within the application, with the database being searchable by multiple parameters, including gene names, amino acid or nucleotide sequences, sequence motifs, or by limiting the range for isoelectric points and/or molecular weights. The database is also searchable by Interpro motif numbers (Interpro being a database of protein families, domains and functional sites) (4–6) and by GO terms (which describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner) (7). Furthermore, the NOPdb3.0 application uses the API to create dynamically generated graphs, allowing users to visualise the data produced from experiments and enabling cross analysis between experiments.
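A sequence motif search of the kind mentioned above can be sketched in a few lines of Python. The motif syntax used here is an assumption chosen for simplicity (uppercase letters match themselves, a lowercase `x` matches any residue) and is not necessarily the syntax accepted by NOPdb3.0; the sequences are made up.

```python
import re
from typing import Dict, List


def motif_to_regex(motif: str) -> str:
    """Translate the simplified motif syntax into a regular expression string."""
    parts = []
    for ch in motif:
        if ch == "x":
            parts.append("[A-Z]")          # any single residue
        elif ch.isalpha() and ch.isupper():
            parts.append(re.escape(ch))    # a specific residue
        else:
            raise ValueError(f"unsupported motif character: {ch!r}")
    return "".join(parts)


def find_motif(sequences: Dict[str, str], motif: str) -> Dict[str, List[int]]:
    """Return 1-based start positions of every (possibly overlapping) match."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    hits: Dict[str, List[int]] = {}
    for name, seq in sequences.items():
        positions = [m.start() + 1 for m in pattern.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits


# Made-up sequences for illustration only.
SEQUENCES = {"seq_A": "MKAARKGGRLL", "seq_B": "MTTTPLL"}

if __name__ == "__main__":
    print(find_motif(SEQUENCES, "KxxR"))
```

Sequences without any match are omitted from the result, so an empty dictionary means the motif was not found.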
Increased security was a core focus of this development. The application itself is designed with three levels of access, to facilitate management and to prevent unauthorised use of the system. Users are provided with different levels of access according to their needs, which are seamlessly enforced by the application. This security ensures that the data remain accurate and the quality of the data is not compromised. Furthermore, this application creates a platform for the Lamond group to share their data with the wider cell biology community.
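One simple way to model tiered access of this kind is an ordered enumeration checked before each action. The sketch below is only an illustration: the text states that there are three levels but does not name them, so the level names and the actions are hypothetical.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Three ordered levels; the names are assumptions, not those used by NOPdb3.0."""
    VIEWER = 1
    CONTRIBUTOR = 2
    ADMINISTRATOR = 3


# Hypothetical actions and the minimum level each one requires.
REQUIRED_LEVEL = {
    "view_protein": AccessLevel.VIEWER,
    "upload_experiment": AccessLevel.CONTRIBUTOR,
    "manage_users": AccessLevel.ADMINISTRATOR,
}


def is_allowed(user_level: AccessLevel, action: str) -> bool:
    """Permit an action only if the user's level meets its requirement."""
    required = REQUIRED_LEVEL.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    return user_level >= required


if __name__ == "__main__":
    print(is_allowed(AccessLevel.VIEWER, "upload_experiment"))
    print(is_allowed(AccessLevel.ADMINISTRATOR, "upload_experiment"))
```

Denying unknown actions by default keeps a newly added action restricted until it is explicitly assigned a level.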
DATABASE CONTENT
The database has been populated with different sets of experiments, performed in the Lamond laboratory, that identify proteins in purified preparations of human nucleoli. This new NOPdb3.0 now contains over 4500 proteins identified in different human cell lines. The increased coverage of the human nucleolus proteome is illustrated by the fact that NOPdb3.0 now includes over 80% of ribosomal proteins, as opposed to the ∼28% described in NOPdb version 2.0. We estimate that NOPdb3.0 contains over 80% of the main human nucleolus proteins. The proteins in the database will be regularly updated as more experiments are performed in the Lamond laboratory.
FUNDING
This work was supported by a Wellcome Trust Programme Grant (073980/Z/03/Z) and by an interdisciplinary RASOR (Radical Solutions for Researching the Proteome) initiative, which is supported by the Biotechnology and Biological Sciences Research Council, Engineering and Physical Sciences Research Council, Scottish Higher Education Funding Council and Medical Research Council. A.I.L. is a Wellcome Trust Principal Research Fellow. Caledonian Research Foundation Fellowship (to F.M.B.). BBSRC PhD studentship (to Y.A.). Funding for open access charge: Wellcome Trust.
Conflict of interest statement. None declared.
ACKNOWLEDGEMENTS
We would like to thank Drs Douglas Lamont and Kenneth Beattie of the Fingerprints Proteomics Facility at the University of Dundee (http://proteomics.lifesci.dundee.ac.uk/) for technical assistance.
REFERENCES
1. The multifunctional nucleolus, Nat. Rev. Mol. Cell. Biol., 2007, vol. 8 (pg. 574-585)
2. NOPdb: Nucleolar Proteome Database, Nucleic Acids Res., 2006, vol. 34 (pg. D218-D220)
3. Nucleolar proteome dynamics, Nature, 2005, vol. 433 (pg. 77-83)
4. et al. The InterPro Database, 2003 brings increased coverage and new features, Nucleic Acids Res., 2003, vol. 31 (pg. 315-318)
5. et al. The Pfam protein families database, Nucleic Acids Res., 2004, vol. 32 (pg. D138-D141)
6. SMART 4.0: towards genomic data integration, Nucleic Acids Res., 2004, vol. 32 (pg. D142-D144)
7. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium, Nat. Genet., 2000, vol. 25 (pg. 25-29)
8. Architectural design of modern web applications, Found. Comput. Decision Sci., 2005, vol. 30 (pg. 49-60)
9. API: design matters, ACM Queue, 2007, vol. 5 (pg. 4-14)
10. Principled design of the modern web architecture, ACM Trans. Internet Technol., 2002, vol. 2 (pg. 115-150)
© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.