The ProteoRed MIAPE web toolkit: A User-friendly Framework to Connect and Share Proteomics Standards

Semi-automatic tool to describe, store and compare proteomics experiments based on MIAPE compliant reports

PROTEOMICS, 2010

The Human Proteome Organization's Proteomics Standards Initiative aims to develop new standards for data representation and exchange. The Proteomics Standards Initiative has defined the Minimum Information About a Proteomics Experiment (MIAPE) guidelines that specify the information that should be reported with a published experiment. With the aim of promoting the implementation of standard reporting guidelines, we have developed a web tool that helps to generate and store MIAPE compliant reports describing gel electrophoresis and MS-based experiments. The tool can be used in the reviewing phase of the proteomics publication process and can facilitate data interpretation through the comparison of related studies.

Further Steps in Standardisation Report of the Second Annual Proteomics Standards Initiative Spring Workshop (Siena, Italy 17–20th April 2005)

…, 2005

The spring workshop of the HUPO-PSI convened in Siena to further progress the data standards which are already making an impact on data exchange and deposition in the field of proteomics. Separate work groups pushed forward existing XML standards for the exchange of Molecular Interaction data (PSI-MI, MIF) and Mass Spectrometry data (PSI-MS, mzData), whilst significant progress was made on PSI-MS' mzIdent, which will allow the capture of data from analytical tools such as peak list search engines. A new focus for PSI (GPS, gel electrophoresis) was explored, as was the need for a common representation of protein modifications by all workers in the field of proteomics and beyond. All these efforts are contextualised by the work of the General Proteomics Standards workgroup, which, in addition to the MIAPE reporting guidelines, is continually evolving an object model (PSI-OM) from which will be derived the general standard XML format for exchanging data between researchers, and for submission to repositories or journals.

The Proteomics Standards Initiative: Fifteen Years of Progress and Future Work

Journal of proteome research, 2017

The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, co-chairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing, ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Further, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results, as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PR...

ProteomeXchange provides globally coordinated proteomics data submission and dissemination

Nature Biotechnology, 2014

There is a growing trend towards public dissemination of proteomics data, which is facilitating the assessment, reuse, comparative analyses and extraction of new findings from published data [1, 2]. This process has been mainly driven by journal publication guidelines and funding agencies. However, there is a need for better integration of public repositories and coordinated sharing of all the pieces of information needed to represent a full mass spectrometry (MS)-based proteomics experiment. Your July 2009 editorial "Credit where credit is overdue" [3] exposed the situation in the proteomics field, where full data disclosure is still not common practice. Olsen and Mann [4] identified different levels of information in the typical experiment, starting from raw data and going through peptide identification and quantification, protein identifications and ratios and the resulting biological conclusions. All of these levels should be captured and properly annotated in public databases, using the existing MS proteomics repositories for the MS data (raw data, identification and quantification results) and metadata, whereas the resulting biological information should be integrated in protein knowledgebases, such as UniProt [5]. A recent editorial in Nature Methods [6] again highlighted the need for a stable repository for raw MS proteomics data. In this Correspondence, we report on the first implementation of the ProteomeXchange consortium, an integrated framework for submission and dissemination of MS-based proteomics data.

A guide for integration of proteomic data standards into laboratory workflows

PROTEOMICS, 2013

The development of the HUPO-Proteomics Standards Initiative standard data formats and Minimum Information About a Proteomics Experiment guidelines facilitates coordination within the scientific community. The data standards provide a framework to exchange and share data regardless of the source instrument or software. Nevertheless, there remains a view that Proteomics Standards Initiative standards are challenging to use and integrate into routine laboratory pipelines. In this article, we review the tools available for integrating the different data standards and building compliant software. These tools are focused on a range of different data types and support different scenarios, intended for software developers or end users, allowing the standards to be used in a straightforward manner.

ms-data-core-api: An open-source, metadata-oriented library for computational proteomics

Bioinformatics (Oxford, England), 2015

The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Program Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the si...
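To make the idea of a format reader feeding a common data model concrete, here is a minimal sketch in Python of a reader for the text-based mgf format mentioned above. It is not the ms-data-core-api itself (which is a Java library), and the names `Spectrum`, `read_mgf` and `EXAMPLE` are hypothetical, chosen only for this illustration. It assumes the usual mgf layout: `BEGIN IONS` / `END IONS` blocks containing `KEY=value` header lines followed by whitespace-separated m/z and intensity pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Spectrum:
    """One spectrum: header parameters plus (m/z, intensity) peaks."""
    params: Dict[str, str] = field(default_factory=dict)
    peaks: List[Tuple[float, float]] = field(default_factory=list)


def read_mgf(text: str) -> Iterator[Spectrum]:
    """Yield one Spectrum per BEGIN IONS / END IONS block (illustrative only)."""
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line == "BEGIN IONS":
            current = Spectrum()
        elif line == "END IONS":
            if current is not None:
                yield current
            current = None
        elif current is not None:
            if "=" in line and not line[0].isdigit():
                # Header line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                current.params[key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z followed by an optional intensity
                parts = line.split()
                intensity = float(parts[1]) if len(parts) > 1 else 0.0
                current.peaks.append((float(parts[0]), intensity))


# Made-up example data, not taken from any real experiment.
EXAMPLE = """BEGIN IONS
TITLE=scan 1
PEPMASS=500.25 1000.0
CHARGE=2+
100.1 10.0
200.2 20.5
END IONS
"""

spectra = list(read_mgf(EXAMPLE))
```

Each parsed `Spectrum` keeps the header parameters as strings and the peaks as numeric pairs, so downstream code could treat spectra from different source formats uniformly, which is the general idea behind a shared data model.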

Making proteomics data accessible and reusable: Current state of proteomics databases and repositories

Proteomics, 2014

Compared to other data-intensive disciplines such as genomics, public deposition and storage of mass spectrometry (MS)-based proteomics data is still less developed due to, among other reasons, the inherent complexity of the data and the variety of data types and experimental workflows. In order to address this need, several public repositories for MS proteomics experiments have been developed, each with different purposes in mind. The most established resources are the Global Proteome Machine Database (GPMDB), PeptideAtlas and the PRoteomics IDEntifications (PRIDE) database. Additionally, there are other useful (in many cases recently developed) resources such as ProteomicsDB, MassIVE, Chorus, MaxQB, PASSEL, MOPED and the Human Proteinpedia. In addition, the ProteomeXchange consortium has been recently developed for enabling a better integration of public repositories and the coordinated sharing of proteomics information, maximizing its benefit to the scientific community. Here, we ...

A systematic approach to modeling, capturing, and disseminating proteomics experimental data

Nature Biotechnology, 2003

Both the generation and the analysis of proteome data are becoming increasingly widespread, and the field of proteomics is moving incrementally toward high-throughput approaches. Techniques are also increasing in complexity as the relevant technologies evolve. A standard representation of both the methods used and the data generated in proteomics experiments, analogous to that of the MIAME (minimum information about a microarray experiment) guidelines for transcriptomics, and the associated MAGE (microarray gene expression) object model and XML (extensible markup language) implementation, has yet to emerge. This hinders the handling, exchange, and dissemination of proteomics data. Here, we present a UML (unified modeling language) approach to proteomics experimental data, describe XML and SQL (structured query language) implementations of that model, and discuss capture, storage, and dissemination strategies. These make explicit what data might be most usefully captured about proteomics experiments and provide complementary routes toward the implementation of a proteome repository.