How to increase the credibility and ease the investigation of biosignatures in TOF-SIMS mass spectra?

Validated MALDI-TOF/TOF mass spectra for protein standards

Journal of the American Society for Mass Spectrometry, 2007

A current focus of proteomics research is the establishment of acceptable confidence measures in the assignment of protein identifications in an unknown sample. Development of new algorithmic approaches would greatly benefit from a standard reference set of spectra for known proteins for the purpose of testing and training. Here we describe an openly available library of mass spectra generated on an ABI 4700 MALDI TOF/TOF from 246 known, individually purified and trypsin-digested protein samples. The initial full release of the Aurum Dataset includes gel images, peak lists, spectra, search result files, decoy database analysis files, FASTA file of protein sequences, manual curation, and summary pages describing protein coverage and peptides matched by MS/MS followed by decoy database analysis using Mascot, Sequest, and X!Tandem. The data are publicly available for use at
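As an illustration of the decoy database analysis mentioned above, here is a minimal sketch of a target-decoy false discovery rate estimate. The abstract does not say which estimator the Aurum Dataset uses. The ratio of decoy hits to target hits, the function name `decoy_fdr`, and the example scores are all assumptions made for illustration.

```python
def decoy_fdr(scores, is_decoy, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a score threshold.

    Assumption: this simple ratio is one common estimator; it is not
    taken from the Aurum Dataset description.
    """
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and d)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to report
    return decoys / targets


# Hypothetical search scores; True marks a hit against the decoy database.
scores = [92.0, 85.5, 77.1, 60.3, 58.9, 41.2]
is_decoy = [False, False, False, True, False, True]

print(decoy_fdr(scores, is_decoy, 50.0))
```

Raising the threshold trades sensitivity for a lower estimated error rate, which is the kind of confidence measure the abstract refers to.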

SELDI-TOF mass spectra: A view on sources of variation

Journal of Chromatography B-analytical Technologies in The Biomedical and Life Sciences, 2007

Adequate interpretation of mass spectrometry data can yield valuable biomarkers. However, spectrum interpretation is a complicated task. This paper reviews the various factors that determine a sample's spectrum and demonstrates the role of these factors in the interpretation process. We derive a simulation model that adequately predicts the expected spectrum based on known sample content and, in the reverse mode, obtain an analysis model that adequately fits an observed spectrum based on the hypothesized sources of variation.

MSLIB — a versatile tool for handling and interpreting mass spectral data

TrAC Trends in Analytical Chemistry, 1994

An MS-DOS-based system for the handling and processing of mass spectral data is introduced. MSLIB provides a convenient graphical user interface and allows the administration of mass spectral data as well as related substance specific information, including chemical structures. MSLIB provides tools both for importing and editing the data, and for searching in the databases, and includes a spectral search and a structure similarity search.

Methods to extract molecular and bulk chemical information from series of complex mass spectra with limited mass resolution

The resolution of mass spectrometers is often insufficient to conclusively identify all peaks that may be present in recorded spectra. Here, we present new methods to extract consistent molecular and bulk level chemical information by constrained fitting of series of complex organic mass spectra with multiple overlapping peaks. Possible individual peaks in a group of overlapping peaks are identified both by defining a chemical space and by free peak fitting. If all possible formulas from the chemical space were simply used to fit each peak, the result would not be well constrained. The free peak fitting algorithm provides information about likely peak locations. A new algorithm then reconciles the results of both methods and produces a final peak list for use in subsequent fitting, while using all available experimental constraints. Comparison to ultra-high resolution data suggests that the real peak density is substantially higher than can be resolved with the instrument resolution. Bulk chemical properties such as carbon number (nC) and carbon oxidation state (OSC) can be calculated from the fit results. For mixtures of compounds dominated by C, H, O and N, bulk properties can be reliably extracted, even though some formula assignments may remain uncertain. This ability to retrieve correct bulk parameters even if not all assigned formulas are correct originates from the relationship between mass defects of individual peaks and the chemical parameters under our CHON composition assumptions. Retrieving consistent bulk parameters across series of many mass spectra is essential for extracting time trends, e.g. for field measurements taking place over several weeks. We illustrate the fitting method using a sample data set from a chemical ionization mass spectrometer with a resolution of approximately 4000 (M/dM), operated using acetate reagent ions. Spectral simulation experiments validate the analysis method by showing good agreement of intensities for many specific ions, as well as for bulk chemical parameters. An alternative method to directly extract bulk chemical information from the raw spectra without the need for any peak assignment or peak fitting is also introduced, which shows good agreement with the peak fitting results. The latter method can be applied very rapidly without the need for complex analysis procedures, e.g. as a quick online diagnostic during data acquisition.
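To make the idea of bulk parameters concrete, the sketch below computes an intensity-weighted mean carbon number and a bulk carbon oxidation state from a list of fitted peaks. It is not the authors' fitting method. It assumes the widely used approximation that carbon oxidation state is about 2·O/C − H/C, which ignores nitrogen, and the function name `bulk_properties` and the example peaks are hypothetical.

```python
def bulk_properties(peaks):
    """Return (mean carbon number, bulk carbon oxidation state).

    peaks: list of (intensity, n_C, n_H, n_O) tuples for assigned formulas.
    Assumption: oxidation state ~ 2*O/C - H/C, with nitrogen ignored.
    """
    total_intensity = sum(i for i, _, _, _ in peaks)
    carbon = sum(i * n_c for i, n_c, _, _ in peaks)
    hydrogen = sum(i * n_h for i, _, n_h, _ in peaks)
    oxygen = sum(i * n_o for i, _, _, n_o in peaks)
    mean_carbon_number = carbon / total_intensity
    oxidation_state = 2 * oxygen / carbon - hydrogen / carbon
    return mean_carbon_number, oxidation_state


# Hypothetical fit result: equal intensities of C2H4O2 and C4H6O4.
peaks = [(1.0, 2, 4, 2), (1.0, 4, 6, 4)]
mean_nc, os_c = bulk_properties(peaks)
print(mean_nc, os_c)
```

Because both quantities are intensity-weighted sums, a few wrong formula assignments with similar elemental ratios shift them only slightly, which is consistent with the robustness the abstract describes.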

Evaluation of the sensitivity of the ‘Wiley registry of tandem mass spectral data, MSforID’ with MS/MS data of the ‘NIST/NIH/EPA mass spectral library’

Journal of Mass Spectrometry, 2013

Tandem mass spectral libraries are versatile tools for small-molecule identification, finding application in forensic science, doping control, drug monitoring, food and environmental analysis, as well as metabolomics. Two important libraries are the 'Wiley Registry of Tandem Mass Spectral Data, MSforID' (Wiley Registry MSMS) and the collection of MS/MS spectra that is part of the 2011 edition of the 'NIST/NIH/EPA Mass Spectral Library' (NIST 11 MSMS). Herein, the sensitivity and robustness of the Wiley Registry MSMS were evaluated using spectra extracted from the NIST 11 MSMS library. The sample set was found to be heterogeneous in terms of mass spectral resolution, type of CID, as well as applied collision energies. Nevertheless, sensitive compound identification with a true positive identification rate ≥95% was possible using either the MSforID Search program or the NIST MS Search program 2.0g for matching. To rate the performance of the Wiley Registry MSMS, cross-validation experiments were repeated using subcollections of NIST 11 MSMS as reference library and spectra extracted from the Wiley Registry MSMS as positive controls. Unexpectedly, with both search algorithms tested, correct results were obtained in less than 88% of cases. We examined possible causes for the results of the cross-validation study. The large number of precursor ions represented by only a single tandem mass spectrum was identified as the basic cause for the comparably lower sensitivity of the NIST library.

Interpretation of Mass Spectra

Mass Spectrometry, 2017

The chapter includes an introduction to the main ionisation techniques in mass spectrometry and the way the resulting fragments can be analysed. First, the fundamental notions of mass spectrometry are explained, so that the reader can easily follow the chapter (graphs, base peak, molecular ion, illogical peak, nitrogen rule, etc.). Isotopic percentage and nominal mass calculation are also explained, along with fragmentation mechanisms. A paragraph emphasises the ionisation energy issues, the basics of ionisation voltage, the developing potential and the energy balance. A timeline of the main milestones in both theoretical and experimental mass spectrometry is also given. In the second part of the chapter, the molecular fragmentation of alkanes, iso-alkanes, cycloalkanes, halogenated compounds, alcohols, phenols, ethers, carbonyl compounds, carboxylic acids and functional derivatives, nitrogen compounds (amines, nitro compounds), sulphur compounds, heterocycles and biomolecules (amino acids, steroids, triglycerides) is explained. Fragmentation schemes are followed by simplified spectra, which help the understanding of such complex phenomena. At the end of the chapter, acquisition of the mass spectrum is discussed. The chapter presented here is an introduction to mass spectrometry, which, we think, helps the reader understand fragmentation mechanisms by correlating spectral data with molecular structures.
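The nominal mass calculation and the nitrogen rule mentioned in the abstract can be illustrated with a short sketch. It is not taken from the chapter. It assumes the usual statement of the rule for molecules built from common elements: the nominal molecular mass is odd exactly when the number of nitrogen atoms is odd. The helper names and the small element table are hypothetical.

```python
# Integer masses of the most abundant isotopes (illustrative subset).
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16, "S": 32, "Cl": 35, "Br": 79}


def nominal_mass(formula):
    """Nominal mass of a formula given as {element: count}."""
    return sum(NOMINAL[element] * count for element, count in formula.items())


def obeys_nitrogen_rule(formula):
    """True if mass parity matches nitrogen-count parity (odd N <-> odd mass)."""
    return nominal_mass(formula) % 2 == formula.get("N", 0) % 2


aniline = {"C": 6, "H": 7, "N": 1}
benzene = {"C": 6, "H": 6}
print(nominal_mass(aniline), obeys_nitrogen_rule(aniline))
print(nominal_mass(benzene), obeys_nitrogen_rule(benzene))
```

A candidate formula that fails the check, for example an odd-mass assignment without nitrogen, is not a plausible neutral molecule under this rule and usually points to a fragment ion or a wrong assignment.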

Processing and classification of protein mass spectra

Mass Spectrometry Reviews, 2006

Among the many applications of mass spectrometry, biomarker pattern discovery from protein mass spectra has aroused considerable interest in the past few years. While research efforts have raised hopes of early and less invasive diagnosis, they have also brought to light the many issues to be tackled before mass-spectra-based proteomic patterns become routine clinical tools. Known issues cover the entire pipeline leading from sample collection through mass spectrometry analytics to biomarker pattern extraction, validation, and interpretation. This study focuses on the data-analytical phase, which takes as input mass spectra of biological specimens and discovers patterns of peak masses and intensities that discriminate between different pathological states. We survey current work and investigate computational issues concerning the different stages of the knowledge discovery process: exploratory analysis, quality control, and diverse transforms of mass spectra, followed by further dimensionality reduction, classification, and model evaluation. We conclude after a brief discussion of the critical biomedical task of analyzing discovered discriminatory patterns to identify their component proteins as well as interpret and validate their biological implications. © 2006 Wiley Periodicals, Inc., Mass Spec Rev 25:409–449, 2006

Interpreting Mass Spectra Differing from Their Peptide Models by Several Modifications

Background: In proteomics, mass spectra representing peptides carrying multiple unknown modifications are particularly difficult to interpret. This issue results in a large number of unidentified spectra. Methods: We developed SpecGlob, a dynamic programming algorithm that aligns pairs of spectra – each pair given by a Peptide-Spectrum Match (PSM) – provided by any Open Modification Search (OMS) method. For each PSM, SpecGlob computes the best alignment according to a given score system, interpreting the mass delta within the PSM as one or several unspecified modification(s). All the alignments are provided in a file, using a specific syntax. These alignments are then post-processed by an additional algorithm, which aims at interpreting the detected modifications. Results: Using a large collection of theoretical spectra generated from the human proteome, we demonstrate that running SpecGlob as a post-analysis of an OMS method can significantly increase the number of correctly interpre...

Similarity among Tandem Mass Spectra from Proteomic Experiments: Detection, Significance, and Utility

Analytical Chemistry, 2003

Liquid chromatography paired with tandem mass spectrometry is a standard technique for identifying peptides from complex protein mixtures. Most fragment ion spectra acquired by this technique are unique, but some are repeated. Similarities among the spectra from 1D and 2D liquid chromatography experiments were calculated by the dot product algorithm. Similar spectra were grouped, and the degree of duplication was calculated for each sample. In 1D liquid chromatography data from 1D gel bands, 18% of the fragment ion spectra were duplicates. A six-cycle 2D liquid chromatographic separation of more than 200 proteins produced 28% duplicate spectra. A rat hippocampal homogenate analyzed by a 12-cycle 2D liquid chromatographic separation contained 25% duplicate spectra. Removal of these duplicate spectra, however, resulted in fewer peptides being successfully identified by SEQUEST. We propose a modification for peptide identification algorithms that would improve their performance and accuracy by explicitly recognizing and making use of spectral similarity.
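The dot product comparison named in the abstract can be sketched as a normalized dot product between binned spectra. The abstract gives no implementation details, so the 1 m/z bin width, the normalization to the range 0 to 1, and the function names below are assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def dot_product_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Normalized dot product of two binned spectra (1.0 means same pattern)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    numerator = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum is similar to nothing
    return numerator / (norm_a * norm_b)


# Hypothetical fragment spectra: b is a scaled, slightly shifted copy of a.
spec_a = [(175.1, 10.0), (262.2, 5.0)]
spec_b = [(175.3, 20.0), (262.4, 10.0)]
spec_c = [(300.2, 7.0)]
print(dot_product_similarity(spec_a, spec_b))
print(dot_product_similarity(spec_a, spec_c))
```

Spectra whose similarity exceeds a chosen cutoff could then be grouped as duplicates. The cutoff is a further free parameter that the abstract does not state.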