Proposed minimum reporting standards for chemical analysis

Abstract

There is a general consensus that supports the need for standardized reporting of metadata, or information describing large-scale metabolomics and other functional genomics data sets. Reporting of standard metadata provides a biological and empirical context for the data, facilitates experimental replication, and enables the re-interrogation and comparison of data by others. Accordingly, the Metabolomics Standards Initiative is building a general consensus concerning the minimum reporting standards for metabolomics experiments, and the Chemical Analysis Working Group (CAWG) is a member of this community effort. This article proposes the minimum reporting standards related to the chemical analysis aspects of metabolomics experiments including: sample preparation, experimental analysis, quality control, metabolite identification, and data pre-processing. These minimum standards currently focus mostly upon mass spectrometry and nuclear magnetic resonance spectroscopy due to the popularity of these techniques in metabolomics. However, additional input concerning other techniques is welcomed and can be provided via the CAWG on-line discussion forum at http://msi-workgroups.sourceforge.net/ or http://Msi-workgroups-feedback@lists.sourceforge.net. Further, community input related to this document can also be provided via this electronic forum.

1 Introduction

The aim of the Chemical Analysis Working Group (CAWG) as part of the Metabolomics Standards Initiative (MSI) is to identify, develop and disseminate a consensus description for the best chemical analysis practices related to all aspects of metabolomics. Ideally, the proposed standards will consist of good analytical chemistry practices while providing specific provisions for metabolomic data (the main distinction being large numbers of data-sets each containing large numbers of measurements, and the need to compare them electronically and across different instrumental platforms). These practices will be aligned with those typically mandated by top quality analytical journals. The goal is not to prescribe how metabolomics experiments should be performed, but to formulate a minimum set of reporting standards that describe the experimental methods (i.e. the metadata or information describing the nature of the experiments and how they were actually executed) to maximize the utility of the data to other researchers. Consequently, there will be no attempt to restrict or dictate specific practices, but to develop consistent and appropriate descriptors to support the dissemination and re-use of metabolomic data. Such reporting standards will specify the metadata identified as necessary for complete and comprehensive reporting in a range of contexts, such as submission to academic journals and public databases. Data exchange standards will be developed to provide a transparent technical vehicle which meets or exceeds the requirements of reporting standards.

The scope of the CAWG includes sample preparation, experimental analysis, instrumental performance, method validation, metabolite identification, and data pre-processing. There is slight overlap in sample preparation with the Biological Context Working Group and slight overlap in data pre-processing with the Data Processing Working Group. However, the scope and focus of the CAWG is upon the experimental aspects of sample processing, instrumental analysis, and commonly used data pre-processing methods which convert raw instrumental files into organized, tabulated file formats. The organized data are then used for further statistical and chemometric analyses, which are the focus of the Data Processing Working Group.

The operational plan of the CAWG is to cooperatively draft a consensus document that describes a minimum core set of necessary metadata related to the chemical analyses associated with metabolomics experiments. This will be based upon community input from generalists and specialists relating to the most common technologies utilized in metabolomics. The CAWG will evaluate previous and relevant work in other specialist areas including similar work in transcriptomics and proteomics studies, and recent metabolomics standardization efforts. The group will pay careful attention to the distinction of best practice (which will evolve as the science and technology of metabolomics advances), reporting standards (which should have longer validity) and data exchange standards (which support reporting). It will work with relevant journals and editorial staff to review and advise on the practicality, acceptability, and support of standards.

The proposed CAWG standards were originally described during the NIH Metabolomics Workshop convened in August, 2005 (http://www.niddk.nih.gov/fund/other/metabolomics2005/) and are based upon significant literature (Bino 2004; Jenkins et al. 2004; Quackenbush 2004; Jenkins et al. 2005; Lindon et al. 2005; Fiehn et al. 2006; Rubtsov et al. 2007). Significant input has been provided related to mass spectrometry (MS) and nuclear magnetic resonance (NMR) based metabolomics, but the ultimate schema is aimed at all analytical approaches used in metabolomics. Input to date has been provided by a diversity of academic and commercial entities through personal communications and through the on-line discussion forum (http://msi-workgroups.sourceforge.net/).

2 Proposed minimum information for reporting chemical analysis

The following sections describe the proposed minimum information for reporting chemical analyses metadata that have been discussed to date. The proposed minimum reporting standard information is presented below as bulleted text which is augmented with numerous examples. The examples should not be viewed as required and are not meant to include an exhaustive list of all possibilities. However, the examples should help the reader better visualize the requested context of the proposed minimum information.

2.1 Proposed minimum metadata for sample preparation

Sample preparation is a vast topic which can vary dramatically for different species, tissues, cell cultures, and biofluids. However, it is fundamentally essential that sufficient information is provided about sample preparation to enable experimental reproduction as well as to provide convincing evidence of sample integrity. The initial stages of sample preparation are often generic, whereas the final stages are almost always technique-specific. Therefore, proposed minimum standards for generic sample preparation are provided here, whereas instrument specific sample preparation details are provided within the respective instrumental sections. Further, the issue of sample collection and processing is being addressed by multiple MSI working groups and thus, there is some overlap on this theme (Fiehn et al. 2007; Griffin et al. 2007; van der Werf et al. 2007). However, greater emphasis is provided here concerning the experimental aspects of the sample processing.

Generic extraction and subsequent sample handling that are typically employed for most samples (instrument specific sample processing methods are provided in the respective sections, below).

2.2 Proposed minimum metadata relative to chromatography

The majority of mass spectrometry based metabolomics methods include sample introduction via hyphenated chromatography. This is also a feature of some NMR experiments (i.e. LC/NMR) as well as other analytical devices, e.g. photodiode arrays, Coulombic arrays, etc. Thus, it is critical to define the chromatographic parameters and the following metadata are suggested.

2.3 Proposed minimum metadata relative to mass spectrometry

Mass spectrometry is a popular but complex technique used in metabolomics. Thus, it is necessary to provide sufficient details to enable experimental replication, and the following minimum reporting standards are proposed for mass spectrometry.

2.4 Proposed minimum metadata relative to nuclear magnetic resonance

NMR is a popular but complex technique used in metabolomics. Thus, it is necessary to provide sufficient details to enable experimental replication, and the following minimum reporting standards are proposed for nuclear magnetic resonance.

2.5 Proposed minimum metadata relative to stable isotopes & flux analysis

Many researchers utilize stable isotopes and flux analysis in metabolomics research to better understand mass flow through pathways. Therefore, the following minimum reporting standards are proposed for stable isotopes and flux analysis.

2.6 Proposed minimum metadata relative to Fourier transform infrared (FT-IR) spectroscopy

FT-IR spectroscopy has been used for metabolic fingerprinting and footprinting (Ellis and Goodacre 2006). In this approach the classification of samples is based on provenance of either their biological relevance or origin and does not usually give specific metabolite information. The following minimum reporting standards are proposed for FT-IR spectroscopy.

2.7 Proposed minimum metadata relative to instrumental performance and method validation

Instrumental performance validation/qualification and method validation help ensure reliable data production and demonstrate that a particular method used for quantitative measurement of an analyte(s) in a given biological matrix, such as plants, blood, plasma, serum, or urine, is reliable and reproducible for the intended use (Thompson et al. 2002; FDA 2001, Guidance for industry: Bioanalytical method validation, http://www.fda.gov/cder/guidance/4252fnl.pdf). These quality control procedures are fundamental components of Good Laboratory Practices (GLP), Good Analytical Practices (GAP), and Good Manufacturing Practices (GMP). Although instrumental performance and method validation are not mandated, they are recommended and the following descriptions are suggested.

2.8 Proposed minimum metadata relative to data pre-processing

The scope of the CAWG data pre-processing standards focuses upon the conversion of raw instrumental files into organized/tabulated file formats. The organized data are then used for further statistical and chemometric analyses which are the focus of the Data Processing Working Group (Goodacre et al. 2007). The following minimum reporting standards are proposed for data pre-processing.

2.9 Proposed minimum metadata relative to metabolite identification

Metabolite identification is a fundamental function that converts raw data into biological context. Thus, metabolite identifications are critical to the large-scale analysis of metabolites, i.e. metabolomics, and metabolite identifications should be of significant rigor to validate the identification. While it is difficult to prescribe a minimum reporting requirement for identification, the rigor of the metabolite identifications should be aligned with acceptable practices for chemical journals (see

However, the exact basis for what constitutes a valid metabolite identification is still currently debated in the community and a consensus is still evolving.

Currently, four levels of metabolite identifications can be found in the published metabolomics literature. They include:

    1. Identified compounds (see below).
    2. Putatively annotated compounds (e.g. without chemical reference standards, based upon physicochemical properties and/or spectral similarity with public/commercial spectral libraries).
    3. Putatively characterized compound classes (e.g. based upon characteristic physicochemical properties of a chemical class of compounds, or by spectral similarity to known compounds of a chemical class).
    4. Unknown compounds: although unidentified or unclassified, these metabolites can still be differentiated and quantified based upon spectral data.

Authors should clearly differentiate and report the level of identification rigor for all metabolites reported.
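One way to make the level of identification explicit in a tabulated data set is to attach it to every reported metabolite as a mandatory field. The following is a minimal sketch only; the class names, field names, and the example entries are illustrative assumptions and are not part of the proposed standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class IdentificationLevel(IntEnum):
    """The four identification levels listed above (member names are illustrative)."""
    IDENTIFIED = 1
    PUTATIVELY_ANNOTATED = 2
    PUTATIVELY_CHARACTERIZED_CLASS = 3
    UNKNOWN = 4


@dataclass(frozen=True)
class ReportedMetabolite:
    label: str                  # name, compound class, or placeholder label for an unknown
    level: IdentificationLevel  # mandatory: every reported metabolite carries a level


def summarize_levels(metabolites):
    """Count the reported metabolites at each identification level."""
    counts = {level: 0 for level in IdentificationLevel}
    for m in metabolites:
        counts[IdentificationLevel(m.level)] += 1
    return counts


# Hypothetical report entries, for illustration only.
report = [
    ReportedMetabolite("rutin", IdentificationLevel.IDENTIFIED),
    ReportedMetabolite("flavonoid glycoside (class only)",
                       IdentificationLevel.PUTATIVELY_CHARACTERIZED_CLASS),
    ReportedMetabolite("unknown feature A", IdentificationLevel.UNKNOWN),
]

for level, n in summarize_levels(report).items():
    print(f"level {level.value} ({level.name}): {n}")
```

Because the level is a required field, a record cannot be constructed without it, which mirrors the request that the level of rigor be reported for all metabolites.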

The majority of metabolite identifications reported are non-novel, as the compounds have been previously characterized, identified, and reported at a rigorous level in the literature. Thus, non-novel metabolites are often identified based upon co-characterization with authentic samples. However, it is generally believed that a single chemical shift, m/z value, or other singular chemical parameter is insufficient for non-novel metabolite identification. Thus, the following minimum standards for level 1, non-novel metabolite identification are proposed.

2.9.1 Nomenclature for non-novel metabolites

The standard for compound nomenclature is provided by the International Union of Pure and Applied Chemistry (IUPAC, http://www.chem.qmul.ac.uk/iupac/). However, these rules typically result in very complex and lengthy names. As a result, IUPAC names are traditionally replaced with shorter more common names, e.g. rutin as compared to 2-(3,4-dihydroxyphenyl)-5,7-dihydroxy-3-[(2S,3R,4S,5S,6R)-3,4,5-trihydroxy-6-[[(2R,3R,4R,5R,6S)-3,4,5-trihydroxy-6-methyl-oxan-2-yl]oxymethyl]oxan-2-yl]oxy-chromen-4-one. Compounds can also be referenced by numerical identifiers such as:

Generally, CAS numbers are less favored due to the proprietary nature of these numbers, whereas CID, SMILES, and INCHI codes are preferred. It is the current opinion of the CAWG that INCHI codes offer a favorable format for data exchange and database communication. Thus, it is suggested that authors report a minimum of one chemical name (IUPAC or common) and one structural code for all identified metabolites for publication.
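The suggestion of at least one chemical name and one structural code per identified metabolite lends itself to a simple automated check before submission. The sketch below is an illustration under stated assumptions: the record layout and function name are hypothetical, and treating a PubChem CID as an acceptable structural code alongside INCHI and SMILES is an assumption made here, not a statement of the standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MetaboliteIdentity:
    """Hypothetical record layout for one identified metabolite."""
    common_name: Optional[str] = None
    iupac_name: Optional[str] = None
    inchi: Optional[str] = None
    smiles: Optional[str] = None
    pubchem_cid: Optional[int] = None  # assumption: a CID counts as a structural code


def missing_reporting_items(m: MetaboliteIdentity) -> List[str]:
    """Return the suggested minimum items that are absent from a record."""
    problems = []
    if not (m.common_name or m.iupac_name):
        problems.append("no chemical name (IUPAC or common)")
    if not (m.inchi or m.smiles or m.pubchem_cid is not None):
        problems.append("no structural code (e.g. INCHI, SMILES, or CID)")
    return problems


# Illustrative records: ethanol with a SMILES string, and a name-only entry.
complete = MetaboliteIdentity(common_name="ethanol", smiles="CCO")
name_only = MetaboliteIdentity(common_name="ethanol")

print(missing_reporting_items(complete))
print(missing_reporting_items(name_only))
```

An empty list means the record satisfies the suggested minimum; any returned message names the item that still needs to be supplied.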

2.9.2 Novel metabolite identifications

Metabolites identified for the first time, which represent novel identifications, should include sufficient evidence for full stereochemical structural identification. Acceptable criteria are clearly defined by most journals (e.g. http://pubs.acs.org/journals/jacst/, http://www.rsc.org/Publishing/ReSourCe/AuthorGuidelines/ArticleLayout/sect3.asp, and https://paragon.acs.org/paragon/ShowDocServlet?contentId=paragon/menu_content/authorchecklist/CCCmk1.xls). This traditionally involves extraction, isolation, and purification followed by elemental analysis, accurate mass measurement, ion mass fragmentation patterns, NMR (1H, 13C, 2D), and other spectral data such as IR, UV, or chemical derivatization. The CAWG fully supports these traditional criteria for novel metabolite identifications.

2.9.3 Nomenclature for novel metabolites

For novel metabolites identified for the first time and/or compounds that are not yet included in PubChem (http://pubchem.ncbi.nlm.nih.gov/), formal naming should be consistent with IUPAC nomenclature and common naming is left to the author’s discretion. However, author(s) are encouraged to (a) submit novel structures to PubChem and/or (b) release an electronic code for the structure, i.e. the INCHI code that is recommended by IUPAC and NIST. The INCHI code and the software to generate this code from chemical drawings are freely available (http://inchi.info/software_en.html).

2.10 Proposed minimum metadata relative to reporting of unknown metabolites

Within most metabolomics datasets, there are typically many unknown analytes, i.e. level 3 and 4 compounds. Obviously, those deemed highly important to the study should be rigorously identified according to the metabolite identification discussions above. This is not possible in all cases due to time restrictions or the lack of authentic material for unambiguous assignment. However, these unknown metabolites can often still be differentiated based upon unique experimental data, i.e. spectral or chromatographic features, and it is valuable to systematically report such “unique unknowns” in a meaningful manner to other researchers. The following minimum reporting standards are suggested for systematically naming unidentified metabolites.

2.10.1 Nomenclature for unknown metabolites

3 Discussions and conclusions

The Chemical Analysis Working Group will continue to work cooperatively on a consensus document that describes a minimum core set of necessary data related to the chemical analyses associated with metabolomics experiments. Further, the CAWG will work cooperatively with other MSI groups to build an integrated consensus document. The primary motivation is to establish acceptable practices that will maximize the utility, validity, and understanding of metabolomics data. It is envisioned that the proposed MSI minimum reporting standards will eventually lead to the generation of a schematic representation and model of the reporting standards to assist potential users and developers to better understand, evaluate, and utilize the proposed metadata. However, it is the general consensus of the MSI working groups that it is still a little early for this effort and additional input is needed prior to this next step. During the interim, the MSI Exchange format working group has initiated efforts to define data exchange formats and to produce a schema for such operations that cover all aspects of the metadata, the analytical data (both spectroscopic and chromatographic) and the data analysis.

The above proposed standards do not cover all aspects of chemical analysis. Significant input is still needed within the specific areas of capillary electrophoresis, electrochemical detection, and numerous other techniques. There are also specialist areas of the mass spectrometry and NMR spectroscopy sections which may need revision or expansion to cover future consideration (e.g. in vivo NMR spectroscopy). However, we believe that the above texts provide general guidelines for improving the quality and utility of published metabolomics datasets. To achieve this objective, the CAWG invites feedback and input from the greater scientific community on the technologies and standards, and an internet discussion site has been established at http://msi-workgroups.sourceforge.net/ or http://Msi-workgroups-feedback@lists.sourceforge.net to facilitate such feedback. Only through active community involvement will a functional solution be achieved.

Author information

Authors and Affiliations

  1. The Samuel Roberts Noble Foundation, Ardmore, OK, USA
    Lloyd W. Sumner
  2. Sanofi-Aventis Deutschland GmbH, Frankfurt, Germany
    Alexander Amberg
  3. Centre for Analytical Bioscience, School of Pharmacy, University of Nottingham, Nottingham, UK
    Dave Barrett
  4. National Centre for Plant and Microbial Metabolomics, Rothamsted Research, West Common, Harpenden, Herts, UK
    Michael H. Beale
  5. National Center for Toxicological Research, Jefferson, AR, USA
    Richard Beger
  6. Division of Molecular and Cellular Science, School of Pharmacy, University of Nottingham, Nottingham, UK
    Clare A. Daykin
  7. Department of Chemistry, University of Louisville, Louisville, KY, USA
    Teresa W.-M. Fan & Richard Higashi
  8. UC Davis Genome Center, University of California, Davis, CA, USA
    Oliver Fiehn
  9. School of Chemistry and Manchester Interdisciplinary Biocentre, The University of Manchester, Manchester, UK
    Royston Goodacre
  10. The Department of Biochemistry, University of Cambridge, Cambridge, UK
    Julian L. Griffin
  11. Division Analytical Biosciences, Leiden University, Leiden, The Netherlands
    Thomas Hankemeier
  12. Department of Computer Science, University of Wales Aberystwyth, Aberystwyth, UK
    Nigel Hardy
  13. Food Composition and Methods Laboratory, Beltsville Human Nutrition Research Center, Agricultural Research Service, U.S. Department of Agriculture, Beltsville, MD, USA
    James Harnly
  14. Max Planck Institute of Molecular Plant Physiology, Golm, Germany
    Joachim Kopka
  15. James Graham Brown Cancer Center, University of Louisville, Louisville, KY, USA
    Andrew N. Lane
  16. Department of Biomolecular Medicine, Imperial College London, London, UK
    John C. Lindon
  17. School of Applied Sciences, RMIT University, Melbourne, Australia
    Philip Marriott
  18. Investigative Preclinical Toxicology, GlaxoSmithKline, Ware, UK
    Andrew W. Nicholls
  19. Discovery Biomarkers, Pfizer Global R&D, Ann Arbor, MI, USA
    Michael D. Reily
  20. College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
    John J. Thaden
  21. School of Biosciences, The University of Birmingham, Birmingham, UK
    Mark R. Viant

Corresponding author

Correspondence to Lloyd W. Sumner.

Additional information

The contents of this paper do not necessarily reflect any position of the Government or the opinion of the Food and Drug Administration

Sponsor: Metabolomics Society http://www.metabolomicssociety.org/

Reference: http://msi-workgroups.sourceforge.net/bio-metadata/reporting/pbc/

http://msi-workgroups.sourceforge.net/chemical-analysis/

Version: Revision: 5.1

Date: 09 January, 2007

Cite this article

Sumner, L.W., Amberg, A., Barrett, D. et al. Proposed minimum reporting standards for chemical analysis. Metabolomics 3, 211–221 (2007). https://doi.org/10.1007/s11306-007-0082-2
