NOPdb: Nucleolar Proteome Database—2008 update (original) (raw)

Journal Article

,

1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK

Search for other works by this author on:

,

1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK

Search for other works by this author on:

,

1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK

Search for other works by this author on:

,

1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK

Search for other works by this author on:

1Wellcome Trust Centre for Gene Regulation & Expression and 2School of Computing, University of Dundee, Dundee DD1 5EH, UK

Search for other works by this author on:

Received:

12 September 2008

Revision received:

10 October 2008

Accepted:

10 October 2008

Published:

04 November 2008

Cite

Yasmeen Ahmad, François-Michel Boisvert, Peter Gregor, Andy Cobley, Angus I. Lamond, NOPdb: Nucleolar Proteome Database—2008 update, Nucleic Acids Research, Volume 37, Issue suppl_1, 1 January 2009, Pages D181–D184, https://doi.org/10.1093/nar/gkn804
Close

Navbar Search Filter Mobile Enter search term Search

Abstract

An experimental data handling system has been created as an update to the previous Nucleolar Proteome Database (NOPdb3.0: http://www.lamondlab.com/NOPdb3.0/). This updated system is able to manage large data sets identified by multiple mass spectrometry and has been used to analyse highly purified preparations of human nucleoli from different cell lines. The newly created application includes a dynamic relational database, which is kept up to date by laboratory staff. The data are further annotated with information from specific external sources on the web, including the IPI and Gene Ontology databases. In addition, an Application Programming Interface provides external users with a portal to link into the nucleolar proteome database and hence, gain access to continually updated results. From the initial ∼700 human proteins identified in the previous iteration of the NOPdb, we have now identified over 50 000 peptides contained in over 4500 human proteins from purified nucleoli, providing enhanced coverage of the nucleolar proteome.

INTRODUCTION

The nucleolus is a highly conserved nuclear organelle whose main function is to coordinate the synthesis and assembly of ribosome subunits (1). Previously we described a Nucleolar Proteome Database (NOPdb2.0: http://www.lamondlab.com/NOPdb) that archived data on >700 proteins that were identified by multiple mass spectrometry (MS) analyses from highly purified preparations of human nucleoli (2). Each protein entry was annotated with information about its corresponding gene, its domain structures and relevant protein homologues across species, as well as documenting its MS identification history, including all the peptides sequenced by tandem MS/MS. Moreover, data showing the quantitative changes in the relative levels of approximately 500 nucleolar proteins were compared at different time points upon transcriptional inhibition (3).

The data presented by the previous NOPdb, version 2.0, was held in a flat file database. Due to the aggregated nature of the data, results from individual experiments could not be extracted. The peptide data for a single protein were merged within this database rather than stored separately. The client interface to this database consisted of Perl CGI scripts. These scripts were able to extract the relevant data from the flat file database to create static html pages. After running the scripts, a page was created on the server for each protein. The html pages were then made available to the global community via the internet. Each time data were updated in the flat files, the Perl scripts had to be run again in order to reproduce the static html pages. This process of having to reproduce the static html protein pages after each database update was highly inefficient and time consuming. A more efficient approach is to produce dynamic html pages upon user request. Furthermore, the capabilities of the version 2.0 NOPdb database were limited with respect to security, ease of use, accessibility, maintainability and expandability. For example, a number of security concerns arose regarding the Perl scripts, which with limited documentation, proved very difficult to resolve.

The new version of the NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/) consists of a unique, secure, extendable content management system, holding advanced nucleolar proteomics data. The created application includes a dynamic relational database, which is kept up to date by members of the Lamond group. It also allows the query of protein data hosted within the database by external users, either using the custom built interface provided by the Lamond group, or by building custom web tools that access data via the Application Programming Interface (API). In addition to the dynamic interfaces provided by the new content management system, the data included in the nucleolar proteome are also dynamically updated with proteins identified from several different cell lines, using various instruments by members of the laboratory. From the initial ∼700 proteins identified in the previous iteration of the NOPdb, we have now identified over 50 000 peptides contained in over 4500 human proteins from purified nucleoli, providing significantly enhanced coverage of the nucleolar proteome.

DATABASE ACCESS

We have established the new version of the Nucleolar Proteome Database (NOPdb3.0), which archives all the human nucleolar proteins identified to date by the Lamond group and their collaborators using MS analyses (1–3). This current version 3.0 of the database is available at http://www.lamondlab.com/NOPdb3.0/ and is searchable either by protein name, protein sequence, motif (4–6), Gene Ontology (GO) (7) terms or by setting the range of the predicted isoelectric point and/or molecular weight (Figure 1). To date, NOPdb3.0 archives over 4500 human nucleolar proteins verified by multiple MS analyses in different cell lines. The NOPdb3.0 provides information on multiple parameters, including protein name, accession number, gene symbol, gene name, sequence, molecular weight, isoelectric point (pI), peptides identified, experiments in which the protein was identified, motifs and GO annotation. The previous version of the database (2) will still be available through our website at http://www.lamondlab.com/NOPdb/.

Snapshots of the NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/). For illustration, the database was searched to identify a Protein Phosphatase 1 (PP1) isoform and here we show an overview page for this protein documenting its sequence, peptides identified, etc.

Figure 1.

Snapshots of the NOPdb3.0 (http://www.lamondlab.com/NOPdb3.0/). For illustration, the database was searched to identify a Protein Phosphatase 1 (PP1) isoform and here we show an overview page for this protein documenting its sequence, peptides identified, etc.

DATABASE IMPLEMENTATION

The new NOPdb3.0 application consists of a multi-tier architecture, where the data storage, business logic and client interface are separate components. The data storage is implemented via a relational mySQL database. The database is structured (Figure 2) to allow easy extendibility and maintenance in the future. In order to extract useful data, the business logic employs complex SQL queries. The purpose of the business logic layer is to act as an interface between the client-side application and database. The business logic and client interface can both reside on any Apache web server capable of serving PHP classes and the client interface, which is built in Adobe Flex. Adobe Flex was chosen as it allows Rich Internet Applications (RIAs) to be prototyped and developed rapidly, with the end product running across a wide range of client browsers.

Entity Relationship (ER) diagram depicting the relationships between the tables present in the mySQL database implemented for the NOPdb3.0 Content Management System.

Figure 2.

Entity Relationship (ER) diagram depicting the relationships between the tables present in the mySQL database implemented for the NOPdb3.0 Content Management System.

Version 3.0 of the NOPdb is an entirely new implementation using a fully relational design with major improvements over previous versions and additional functionality. The newly created database holds data of higher granularity, storing data at the peptide level as opposed to collated data on proteins. This higher granularity also means that results from new experiments can be directly uploaded to the database without prior processing, as the direct output from MS-based proteomics analyses is peptide data. The application has the ability to interpret data and therefore aggregate it to provide metadata for proteins on a usable, graphical interface. The structure of the application has been designed using the model view controller design pattern (8), thus meaning that the functionality is separated from the overall look and feel of the application to ensure a more customisable solution. All communication between the database and application has been implemented to pass through the custom made API (9). Furthermore, in this new version 3.0 application, the graphical user interface to the database is able to create data pages ‘on the fly’ using the custom API rather than serving static data pages, as in previous versions. This API not only acts as a security blanket around the database, it also provides the ability for users to create their own websites and/or applications that represent the data being stored in the proteomics database. External users can make use of the API through the REST (Representational State Transfer) (10) approach. Hence, external programmers can retrieve content in XML (Extensible Markup Language) format, from the database, by accessing well-documented Uniform Resource Locators (URLs).

The application also facilitates mining of stored data, with data being stored in a relational structure that is well documented. Thus tools can be built to search, analyse, read and understand the data. This mining capability is evident within the application, with the database being searchable by multiple parameters, including gene names, amino acid or nucleotide sequences, sequence motifs, or by limiting the range for isoelectric points and/or molecular weights. The database is also searchable by Interpro motif numbers (database of protein families, domains and functional sites) (4–6) and by GO terms (describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner) (7). Furthermore, the NOPdb3.0 application uses the API to create dynamically generated graphs, allowing the users to visualise the data produced from experiments and enabling cross analysis between experiments.

Increased security was a core focus of this development. The application itself is designed with three levels of access, to facilitate management and to prevent unauthorised use of the system. Users are provided with different levels of access according to their needs, which are seamlessly enforced by the application. This security ensures that the data remain accurate and the quality of the data is not compromised. Furthermore, this application creates a platform for the Lamond group to share their data with the wider cell biology community.

DATABASE CONTENT

The database has been populated with different sets of experiments, performed in the Lamond laboratory, that identify proteins in purified preparations of human nucleoli. This new NOPdb3.0 now contains over 4500 proteins identified in different human cells lines. The increased coverage of the human nucleolus proteome is illustrated by the fact that NOPdb3.0 now includes over 80% of ribosomal proteins, as opposed to the ∼28% described in NOPdb version 2.0. We estimate that NOPdb3.0 contains over 80% of the main human nucleolus proteins. The proteins in the database will be regularly updated as more experiments are performed in the Lamond laboratory.

FUNDING

This work was supported by a Wellcome Trust Programme Grant (073980/Z/03/Z) and by an interdisciplinary RASOR (Radical Solutions for Researching the Proteome) initiative, which is supported by the Biotechnology and Biological Sciences Research Council, Engineering and Physical Sciences Research Council, Scottish Higher Education Funding Council and Medical Research Council. A.I.L. is a Wellcome Trust Principal Research Fellow. Caledonian Research Foundation Fellowship (to F.M.B.). BBSRC PhD studentship (to Y.A.). Funding for open access charge: Wellcome Trust.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

We would like to thank Drs Douglas Lamont and Kenneth Beattie of the Fingerprints Proteomics Facility at the University of Dundee (http://proteomics.lifesci.dundee.ac.uk/) for technical assistance.

REFERENCES

1

The multifunctional nucleolus

,

Nat. Rev. Mol. Cell. Biol.

,

2007

, vol.

8

(pg.

574

-

585

)

2

NOPdb: Nucleolar Proteome Database

,

Nucleic Acids Res.

,

2006

, vol.

34

(pg.

D218

-

D220

)

3

Nucleolar proteome dynamics

,

Nature

,

2005

, vol.

433

(pg.

77

-

83

)

4

et al.

The InterPro Database, 2003 brings increased coverage and new features

,

Nucleic Acids Res.

,

2003

, vol.

31

(pg.

315

-

318

)

5

et al.

The Pfam protein families database

,

Nucleic Acids Res.

,

2004

, vol.

32

(pg.

D138

-

D141

)

6

SMART 4.0: towards genomic data integration

,

Nucleic Acids Res.

,

2004

, vol.

32

(pg.

D142

-

D144

)

7

et al.

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium

,

Nat. Genet.

,

2000

, vol.

25

(pg.

25

-

29

)

8

Architectural design of modern web applications

,

Found. Comput. Decision Sci.

,

2005

, vol.

30

(pg.

49

-

60

)

9

API: design matters

,

ACM Queue

,

2007

, vol.

5

(pg.

4

-

14

)

10

Principled design of the modern web architecture

,

ACM Trans. Internet Technol.

,

2002

, vol.

2

(pg.

115

-

150

)

© 2008 The Author(s)

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

I agree to the terms and conditions. You must accept the terms and conditions.

Submit a comment

Name

Affiliations

Comment title

Comment

You have entered an invalid code

Thank you for submitting a comment on this article. Your comment will be reviewed and published at the journal's discretion. Please check for further notifications by email.

Citations

Views

Altmetric

Metrics

Total Views 2,899

2,200 Pageviews

699 PDF Downloads

Since 11/1/2016

Month: Total Views:
November 2016 1
December 2016 4
January 2017 12
February 2017 23
March 2017 23
April 2017 27
May 2017 17
June 2017 20
July 2017 21
August 2017 15
September 2017 35
October 2017 27
November 2017 5
December 2017 17
January 2018 39
February 2018 31
March 2018 38
April 2018 35
May 2018 54
June 2018 36
July 2018 43
August 2018 30
September 2018 12
October 2018 26
November 2018 65
December 2018 35
January 2019 38
February 2019 47
March 2019 49
April 2019 55
May 2019 40
June 2019 40
July 2019 54
August 2019 49
September 2019 34
October 2019 58
November 2019 21
December 2019 29
January 2020 29
February 2020 21
March 2020 20
April 2020 26
May 2020 19
June 2020 29
July 2020 29
August 2020 37
September 2020 11
October 2020 33
November 2020 23
December 2020 23
January 2021 31
February 2021 22
March 2021 45
April 2021 25
May 2021 39
June 2021 15
July 2021 19
August 2021 14
September 2021 36
October 2021 26
November 2021 22
December 2021 23
January 2022 15
February 2022 31
March 2022 45
April 2022 21
May 2022 32
June 2022 25
July 2022 13
August 2022 44
September 2022 23
October 2022 32
November 2022 24
December 2022 47
January 2023 25
February 2023 33
March 2023 30
April 2023 28
May 2023 23
June 2023 15
July 2023 22
August 2023 25
September 2023 26
October 2023 35
November 2023 21
December 2023 50
January 2024 25
February 2024 42
March 2024 43
April 2024 34
May 2024 40
June 2024 30
July 2024 40
August 2024 40
September 2024 57
October 2024 41

Citations

224 Web of Science

×

Email alerts

Citing articles via

More from Oxford Academic