NOPdb: Nucleolar Proteome Database—2008 update

1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK

Received: 12 September 2008; Revision received: 10 October 2008; Accepted: 10 October 2008; Published: 04 November 2008

Yasmeen Ahmad, François-Michel Boisvert, Peter Gregor, Andy Cobley, Angus I. Lamond, NOPdb: Nucleolar Proteome Database—2008 update, Nucleic Acids Research, Volume 37, Issue suppl_1, 1 January 2009, Pages D181–D184, https://doi.org/10.1093/nar/gkn804

Abstract

An experimental data handling system, NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/), has been created as an update to the previous Nucleolar Proteome Database. This updated system is able to manage large data sets generated by multiple mass spectrometry analyses and has been used to analyse highly purified preparations of human nucleoli from different cell lines. The newly created application includes a dynamic relational database, which is kept up to date by laboratory staff. The data are further annotated with information from specific external sources on the web, including the IPI and Gene Ontology databases. In addition, an Application Programming Interface provides external users with a portal to link into the nucleolar proteome database and hence gain access to continually updated results. From the initial ∼700 human proteins identified in the previous iteration of the NOPdb, we have now identified over 50 000 peptides contained in over 4500 human proteins from purified nucleoli, providing enhanced coverage of the nucleolar proteome.

INTRODUCTION

The nucleolus is a highly conserved nuclear organelle whose main function is to coordinate the synthesis and assembly of ribosome subunits (1). Previously we described a Nucleolar Proteome Database (NOPdb2.0: http://www.lamondlab.com/NOPdb) that archived data on >700 proteins that were identified by multiple mass spectrometry (MS) analyses from highly purified preparations of human nucleoli (2). Each protein entry was annotated with information about its corresponding gene, its domain structures and relevant protein homologues across species, and also documented its MS identification history, including all the peptides sequenced by tandem MS/MS. Moreover, the database included data showing the quantitative changes in the relative levels of approximately 500 nucleolar proteins at different time points following transcriptional inhibition (3).

The data presented by the previous NOPdb, version 2.0, were held in a flat file database. Because the peptide data for each protein were merged within this database rather than stored separately, results from individual experiments could not be extracted. The client interface to this database consisted of Perl CGI scripts, which extracted the relevant data from the flat files to create one static HTML page on the server for each protein. These pages were then made available to the global community via the internet. Each time data were updated in the flat files, the Perl scripts had to be run again to regenerate the static HTML pages, a process that was inefficient and time consuming. A more efficient approach is to produce dynamic HTML pages upon user request. Furthermore, the capabilities of the version 2.0 NOPdb database were limited with respect to security, ease of use, accessibility, maintainability and expandability. For example, a number of security concerns arose regarding the Perl scripts, which, given their limited documentation, proved very difficult to resolve.
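The difference between regenerating static pages and building a page on request can be illustrated with a minimal Python sketch. This is not the NOPdb code (which used Perl CGI scripts in version 2.0): the in-memory `PROTEINS` store, the accession `ACC0001`, the peptide strings and the function name `render_protein_page` are all hypothetical and chosen only for illustration.

```python
from html import escape

# Hypothetical in-memory store standing in for the database; the
# accession, name and peptide strings are invented for this sketch.
PROTEINS = {
    "ACC0001": {"name": "Example protein", "peptides": ["AAAK", "LLLR"]},
}


def render_protein_page(accession: str) -> str:
    """Build the HTML for one protein at request time."""
    record = PROTEINS.get(accession)
    if record is None:
        return "<html><body><h1>Protein not found</h1></body></html>"
    # Escape every value so stored text cannot inject markup.
    items = "".join(f"<li>{escape(p)}</li>" for p in record["peptides"])
    return (
        "<html><body>"
        f"<h1>{escape(record['name'])} ({escape(accession)})</h1>"
        f"<ul>{items}</ul>"
        "</body></html>"
    )


# A data update is visible on the next request; no regeneration step runs.
before = render_protein_page("ACC0001")
PROTEINS["ACC0001"]["peptides"].append("GGGK")
after = render_protein_page("ACC0001")
```

In this sketch the page is a pure function of the current data, so there is no stored copy that can fall out of date.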

The new version, NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/), consists of a unique, secure and extendable content management system holding advanced nucleolar proteomics data. The application includes a dynamic relational database, which is kept up to date by members of the Lamond group. External users can query the protein data hosted within the database either through the custom-built interface provided by the Lamond group or by building their own web tools that access data via the Application Programming Interface (API). In addition to the dynamic interfaces provided by the new content management system, the data included in the nucleolar proteome are themselves continually updated with proteins identified by members of the laboratory from several different cell lines, using various instruments. From the initial ∼700 proteins identified in the previous iteration of the NOPdb, we have now identified over 50 000 peptides contained in over 4500 human proteins from purified nucleoli, providing significantly enhanced coverage of the nucleolar proteome.

DATABASE ACCESS

We have established the new version of the Nucleolar Proteome Database (NOPdb3.0), which archives all the human nucleolar proteins identified to date by the Lamond group and their collaborators using MS analyses (1–3). This current version 3.0 of the database is available at http://www.lamondlab.com/NOPdb3.0/ and is searchable by protein name, protein sequence, motif (4–6) or Gene Ontology (GO) (7) terms, or by setting a range for the predicted isoelectric point and/or molecular weight (Figure 1). To date, NOPdb3.0 archives over 4500 human nucleolar proteins verified by multiple MS analyses in different cell lines. NOPdb3.0 provides information on multiple parameters, including protein name, accession number, gene symbol, gene name, sequence, molecular weight, isoelectric point (pI), peptides identified, experiments in which the protein was identified, motifs and GO annotation. The previous version of the database (2) will remain available through our website at http://www.lamondlab.com/NOPdb/.

Figure 1. Snapshots of the NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/). For illustration, the database was searched to identify a Protein Phosphatase 1 (PP1) isoform and here we show an overview page for this protein documenting its sequence, peptides identified, etc.

DATABASE IMPLEMENTATION

The new NOPdb3.0 application has a multi-tier architecture in which the data storage, business logic and client interface are separate components. The data storage is implemented as a relational MySQL database, structured (Figure 2) to allow easy extension and maintenance in the future. The business logic layer acts as an interface between the client-side application and the database, and employs complex SQL queries to extract useful data. The business logic, in the form of PHP classes, and the client interface, built in Adobe Flex, can both reside on any Apache web server capable of serving them. Adobe Flex was chosen as it allows Rich Internet Applications (RIAs) to be prototyped and developed rapidly, with the end product running across a wide range of client browsers.
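The separation of the three tiers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the NOPdb3.0 implementation: the standard-library `sqlite3` module stands in for MySQL, plain functions stand in for the PHP classes and the Flex client, and the two-table schema (`protein`, `peptide`), the column names and the sample rows are hypothetical rather than taken from Figure 2.

```python
import sqlite3


# --- Data storage tier (sqlite3 stands in for MySQL; schema is hypothetical) ---
def create_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE protein (
            id INTEGER PRIMARY KEY,
            accession TEXT UNIQUE,
            name TEXT
        );
        CREATE TABLE peptide (
            id INTEGER PRIMARY KEY,
            protein_id INTEGER REFERENCES protein(id),
            sequence TEXT,
            experiment TEXT
        );
        """
    )
    # Invented sample rows: one protein seen in two experiments.
    conn.execute("INSERT INTO protein VALUES (1, 'ACC0001', 'Example protein')")
    conn.executemany(
        "INSERT INTO peptide (protein_id, sequence, experiment) VALUES (?, ?, ?)",
        [(1, "AAAK", "exp1"), (1, "LLLR", "exp1"), (1, "AAAK", "exp2")],
    )
    return conn


# --- Business logic tier: the only code that issues SQL ---
def protein_summary(conn: sqlite3.Connection, accession: str):
    row = conn.execute(
        """
        SELECT p.accession, p.name,
               COUNT(DISTINCT pep.sequence),
               COUNT(DISTINCT pep.experiment)
        FROM protein AS p
        LEFT JOIN peptide AS pep ON pep.protein_id = p.id
        WHERE p.accession = ?
        GROUP BY p.id
        """,
        (accession,),
    ).fetchone()
    if row is None:
        return None
    return {
        "accession": row[0],
        "name": row[1],
        "unique_peptides": row[2],
        "experiments": row[3],
    }


# --- Client tier: formats what the business logic returns, never touches SQL ---
def show(summary: dict) -> str:
    return (
        f"{summary['name']} [{summary['accession']}]: "
        f"{summary['unique_peptides']} unique peptides "
        f"in {summary['experiments']} experiments"
    )
```

Because the client function only sees the dictionary returned by the business logic, the storage engine or the schema could change without the client code being touched, which is the property the multi-tier design is meant to provide.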

Figure 2. Entity Relationship (ER) diagram depicting the relationships between the tables present in the MySQL database implemented for the NOPdb3.0 Content Management System.

Version 3.0 of the NOPdb is an entirely new implementation using a fully relational design, with major improvements over previous versions and additional functionality. The newly created database holds data of higher granularity, storing data at the peptide level rather than as collated data on proteins. Because the direct output from MS-based proteomics analyses is peptide data, results from new experiments can be uploaded directly to the database without prior processing. The application interprets and aggregates these data to provide protein-level metadata in a usable, graphical interface. The structure of the application follows the model view controller design pattern (8), which separates the functionality from the overall look and feel of the application and so makes the solution more customisable.

All communication between the database and the application passes through the custom-made API (9). In this new version 3.0 application, the graphical user interface to the database creates data pages ‘on the fly’ using the custom API rather than serving static data pages, as in previous versions. The API not only acts as a security layer around the database, it also allows users to create their own websites and/or applications that present the data stored in the proteomics database. External users can make use of the API through the REST (Representational State Transfer) (10) approach: external programmers retrieve content from the database in XML (Extensible Markup Language) format by accessing well-documented Uniform Resource Locators (URLs).
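As a rough illustration of how an external program might consume XML returned by a REST-style API, the following Python sketch parses a small response body. The element and attribute names (`protein`, `name`, `peptide`, `accession`, `experiment`) and the sample values are assumptions made for this example; the actual NOPdb3.0 URLs and XML layout are defined by the API documentation and are not reproduced here.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical response body: the element and attribute names are
# invented for illustration and are not the documented NOPdb3.0 format.
SAMPLE_XML = """<protein accession="ACC0001">
  <name>Example protein</name>
  <peptides>
    <peptide experiment="exp1">AAAK</peptide>
    <peptide experiment="exp2">LLLR</peptide>
  </peptides>
</protein>
"""


def fetch_xml(url: str, timeout: float = 10.0) -> str:
    """Issue an HTTP GET and return the body as text.

    The caller supplies the URL; no real endpoint is assumed here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_protein(xml_text: str) -> dict:
    """Turn one (hypothetical) protein document into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "accession": root.get("accession"),
        "name": root.findtext("name"),
        "peptides": [
            (pep.get("experiment"), (pep.text or "").strip())
            for pep in root.iter("peptide")
        ],
    }


# Parse the canned sample instead of calling a live server.
record = parse_protein(SAMPLE_XML)
```

Keeping the network call (`fetch_xml`) separate from the parsing step means the parser can be exercised against saved responses without contacting the server.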

The application also facilitates mining of the stored data, which are held in a well-documented relational structure, so that tools can be built to search, analyse, read and understand the data. This mining capability is evident within the application itself: the database is searchable by multiple parameters, including gene names, amino acid or nucleotide sequences and sequence motifs, or by limiting the range of isoelectric points and/or molecular weights. The database is also searchable by InterPro motif numbers (InterPro is a database of protein families, domains and functional sites) (4–6) and by GO terms (which describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner) (7). Furthermore, the NOPdb3.0 application uses the API to create dynamically generated graphs, allowing users to visualise the data produced from experiments and enabling cross analysis between experiments.
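A multi-parameter search of this kind can be sketched as a filter in which every supplied criterion must hold. The sketch is illustrative only: the `Protein` record, its field names and the sample values are invented, and the sequence motif is expressed as a regular expression, which is a simplification made for this example and not the InterPro-based motif search that the database offers.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Protein:
    accession: str
    gene: str
    sequence: str
    pi: float      # predicted isoelectric point
    mw_kda: float  # predicted molecular weight in kDa


def search(
    proteins: Iterable[Protein],
    gene: Optional[str] = None,
    motif: Optional[str] = None,
    pi_range: Optional[Tuple[float, float]] = None,
    mw_range: Optional[Tuple[float, float]] = None,
) -> List[Protein]:
    """Return proteins matching every criterion that was supplied."""
    pattern = re.compile(motif) if motif else None
    hits = []
    for p in proteins:
        if gene is not None and p.gene.lower() != gene.lower():
            continue
        if pattern is not None and not pattern.search(p.sequence):
            continue
        if pi_range is not None and not (pi_range[0] <= p.pi <= pi_range[1]):
            continue
        if mw_range is not None and not (mw_range[0] <= p.mw_kda <= mw_range[1]):
            continue
        hits.append(p)
    return hits


# Invented records; none of these values come from the database.
SAMPLE = [
    Protein("ACC0001", "GENE1", "MKRKRSAAAK", 9.8, 12.5),
    Protein("ACC0002", "GENE2", "MDEEDLLLR", 4.2, 48.0),
    Protein("ACC0003", "GENE3", "MSKRKRGGGK", 10.1, 75.3),
]

basic = search(SAMPLE, motif="KRKR", pi_range=(9.0, 11.0))
small_basic = search(SAMPLE, motif="KRKR", mw_range=(10.0, 50.0))
```

Omitted criteria are ignored, so the same function serves both single-parameter and combined queries.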

Increased security was a core focus of this development. The application itself is designed with three levels of access, to facilitate management and to prevent unauthorised use of the system. Users are provided with different levels of access according to their needs, which are seamlessly enforced by the application. This security ensures that the data remain accurate and the quality of the data is not compromised. Furthermore, this application creates a platform for the Lamond group to share their data with the wider cell biology community.
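One common way to enforce tiered access is to rank the levels and compare a user's level with the minimum required for each action. The text states only that there are three levels, so the level names (`PUBLIC`, `MEMBER`, `ADMIN`), the action names and the mapping between them in this Python sketch are hypothetical.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    # Hypothetical names; the source says only that three levels exist.
    PUBLIC = 1
    MEMBER = 2
    ADMIN = 3


# Hypothetical mapping from action to the minimum level it requires.
REQUIRED_LEVEL = {
    "search": AccessLevel.PUBLIC,
    "upload_experiment": AccessLevel.MEMBER,
    "manage_users": AccessLevel.ADMIN,
}


def is_allowed(level: AccessLevel, action: str) -> bool:
    """Permit an action only if the user's level meets its minimum."""
    required = REQUIRED_LEVEL.get(action)
    if required is None:
        # Unknown actions are denied rather than silently permitted.
        return False
    return level >= required
```

Denying unknown actions by default means a newly added operation stays closed until someone assigns it a level explicitly.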

DATABASE CONTENT

The database has been populated with different sets of experiments, performed in the Lamond laboratory, that identify proteins in purified preparations of human nucleoli. NOPdb3.0 now contains over 4500 proteins identified in different human cell lines. The increased coverage of the human nucleolar proteome is illustrated by the fact that NOPdb3.0 now includes over 80% of ribosomal proteins, as opposed to the ∼28% described in NOPdb version 2.0. We estimate that NOPdb3.0 contains over 80% of the main human nucleolar proteins. The proteins in the database will be regularly updated as more experiments are performed in the Lamond laboratory.

FUNDING

This work was supported by a Wellcome Trust Programme Grant (073980/Z/03/Z) and by an interdisciplinary RASOR (Radical Solutions for Researching the Proteome) initiative, which is supported by the Biotechnology and Biological Sciences Research Council, Engineering and Physical Sciences Research Council, Scottish Higher Education Funding Council and Medical Research Council. A.I.L. is a Wellcome Trust Principal Research Fellow. F.M.B. was supported by a Caledonian Research Foundation Fellowship and Y.A. by a BBSRC PhD studentship. Funding for open access charge: Wellcome Trust.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

We would like to thank Drs Douglas Lamont and Kenneth Beattie of the Fingerprints Proteomics Facility at the University of Dundee (http://proteomics.lifesci.dundee.ac.uk/) for technical assistance.

REFERENCES

1. The multifunctional nucleolus. Nat. Rev. Mol. Cell. Biol. 2007, pp. 574–585.

2. NOPdb: Nucleolar Proteome Database. Nucleic Acids Res. 2006, pp. D218–D220.

3. Nucleolar proteome dynamics. Nature 2005, vol. 433.

4. et al. The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res. 2003, pp. 315–318.

5. et al. The Pfam protein families database. Nucleic Acids Res. 2004, pp. D138–D141.

6. SMART 4.0: towards genomic data integration. Nucleic Acids Res. 2004, pp. D142–D144.

7. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000.

8. Architectural design of modern web applications. Found. Comput. Decision Sci. 2005.

9. API: design matters. ACM Queue 2007.

10. Principled design of the modern web architecture. ACM Trans. Internet Technol. 2002, pp. 115–150.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
