Discovery-based analysis and quantification for comprehensive three-dimensional gas chromatography flame ionization detection data
Related papers
Journal of Chromatography A, 2020
Tile-based Fisher ratio (F-ratio) analysis has recently been developed and validated for discovery-based studies of highly complex data collected using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC-TOFMS). In previous studies, interpretation and utilization of F-ratio hit lists have relied upon manual decomposition and quantification performed by chemometric methods such as parallel factor analysis (PARAFAC), or upon manual translation of the F-ratio hit list information to peak table quantitative information provided by the instrument software (ChromaTOF). Both of these quantification approaches are bottlenecks in the overall workflow. To address this issue, a more automatable approach to provide accurate relative quantification for F-ratio analyses was investigated, based upon the mass spectral selectivity provided via the F-ratio spectral output. Diesel fuel spiked with 15 analytes at four concentration levels (80, 40, 20, and 10 ppm) produced three sets of two-class comparisons that were submitted to tile-based F-ratio analysis to obtain three hit lists, with an F-ratio spectrum for each hit. A novel algorithm that calculates the signal ratio (S-ratio) between two classes (e.g., 80 ppm versus 40 ppm) was applied to all mass channels (m/z) in the F-ratio spectrum for each hit. A lack of fit (LOF) metric was utilized as a measure of peak purity and combined with F-ratio and p-values to study the relationship of each of these metrics with m/z purity. Application of a LOF threshold coupled with a p-value threshold yielded a subset of the purest m/z for each of the 15 spiked analytes, as evidenced by the low deviations (< 5%) in S-ratio relative to the true concentration ratio. A key outcome of this study was to demonstrate the isolation of pure m/z without the need for higher-level signal decomposition algorithms.
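The S-ratio idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: the S-ratio is taken as the ratio of mean replicate signals per m/z, the LOF and p-values are treated as precomputed inputs, and the thresholds, function names, and data are hypothetical.

```python
from statistics import mean


def s_ratio(class_a, class_b):
    """Assumed S-ratio: mean signal of class A over mean signal of class B."""
    return mean(class_a) / mean(class_b)


def percent_deviation(ratio, true_ratio):
    """Absolute deviation of a ratio from the known concentration ratio, in %."""
    return 100.0 * abs(ratio - true_ratio) / true_ratio


def select_pure_channels(channels, lof_max, p_max):
    """Keep m/z channels whose precomputed LOF and p-value both pass."""
    return [mz for mz, c in channels.items()
            if c["lof"] <= lof_max and c["p"] <= p_max]


# Hypothetical replicate signals for one hit (80 ppm class "a", 40 ppm class "b").
channels = {
    91: {"a": [800.0, 810.0, 790.0], "b": [400.0, 405.0, 395.0],
         "lof": 0.03, "p": 0.001},
    57: {"a": [1500.0, 1520.0, 1480.0], "b": [1100.0, 1090.0, 1110.0],
         "lof": 0.40, "p": 0.020},  # channel with matrix interference
}
true_ratio = 80 / 40

# Hypothetical thresholds; the study's actual values are not stated here.
pure = select_pure_channels(channels, lof_max=0.10, p_max=0.01)
for mz in pure:
    r = s_ratio(channels[mz]["a"], channels[mz]["b"])
    print(mz, round(r, 3), round(percent_deviation(r, true_ratio), 2))
```

In this toy data set the interfered channel (m/z 57) fails both thresholds, and its S-ratio would deviate strongly from the true 2:1 ratio, which is the behaviour the thresholds are meant to screen out.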
Analytical Chemistry, 2019
Organic compound characterization of highly complex matrices involves scientific challenges such as the diversity of "true" unknowns, the concentration ranges of various compound classes, and limited available amounts of sample. Therefore, discovery-based multidimensional gas chromatography coupled to high-resolution time-of-flight mass spectrometry (GC×GC-HRToFMS) is increasingly applied. Nevertheless, most studies focus on target analysis and tend to disregard important details of the sample composition. The increased peak, or separation, capacity of GC×GC-ToFMS allows for in-depth chemical analysis of the molecular composition. However, large amounts of data, containing several thousands of compounds per experiment, are generally acquired during such analyses. Coupling GC×GC to high-resolution mass spectrometry further increases the amount of data and therefore requires advanced data reduction and mining techniques. Commonly, the evaluation of GC×GC-HRToFMS data sets either focuses on the chromatographic separation (e.g. group type analysis) or utilizes exact mass data by applying Kendrick Mass Defect analysis or van Krevelen plots. The presented approach integrates the accurate mass data and the chromatographic information by combining Kendrick Mass Defect information and knowledge-based rules. This combination allows for fast, visual data screening as well as quantitative estimation of the sample's composition. Moreover, the resulting sample classification significantly reduces the number of variables, allowing distinct chemometric analysis in non-targeted studies such as detailed hydrocarbon analysis and environmental and forensic investigations.
Associated content. Supporting Information: composition of the performance evaluation mixture, initial data processing parameters, criteria for the developed and evaluated classes, suggestion for additional classes, mass spectral similarity between naphthenes and alkenes, TPR vs FPR plots, ROC discrimination of diesel and kerosene, ROC for weathered diesel.
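As a brief illustration of the Kendrick Mass Defect (KMD) analysis mentioned in the abstract above, the following Python sketch rescales exact masses so that CH2 weighs exactly 14 and groups compounds by the resulting defect. It is not the paper's classification scheme: the knowledge-based rules are not reproduced, the sign convention (nominal Kendrick mass minus Kendrick mass) is one of the conventions in use, and the compound list and rounding tolerance are assumptions chosen for the example.

```python
from collections import defaultdict

# Monoisotopic atomic masses (u). On the Kendrick scale, CH2 is exactly 14.
ATOM_MASS = {"C": 12.0, "H": 1.00782503}
CH2_EXACT = ATOM_MASS["C"] + 2 * ATOM_MASS["H"]


def exact_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(ATOM_MASS[element] * count for element, count in formula.items())


def kendrick_mass_defect(mass):
    """KMD = nominal Kendrick mass - Kendrick mass (CH2 base)."""
    kendrick_mass = mass * 14.0 / CH2_EXACT
    return round(kendrick_mass) - kendrick_mass


# Hypothetical example compounds: two alkylbenzenes and two n-alkanes.
compounds = {
    "toluene": {"C": 7, "H": 8},
    "ethylbenzene": {"C": 8, "H": 10},
    "hexane": {"C": 6, "H": 14},
    "heptane": {"C": 7, "H": 16},
}

# Members of a homologous series differ only by CH2 units, so they share a KMD.
groups = defaultdict(list)
for name, formula in compounds.items():
    kmd = kendrick_mass_defect(exact_mass(formula))
    groups[round(kmd, 4)].append(name)

for kmd, names in sorted(groups.items()):
    print(kmd, names)
```

Grouping on a rounded KMD is a simplification; with measured masses, a tolerance matched to the instrument's mass accuracy would be needed instead.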
Analytica Chimica Acta, 2020
Gas chromatography (GC) is undoubtedly the analytical technique of choice for compositional analysis of petroleum-based fuels. Over the past twenty years, as comprehensive two-dimensional gas chromatography (GC×GC) has evolved, fuel analysis has often been highlighted in scientific reports, since the complexity of fuel analysis allows for illustration of the impressive peak capacity gains afforded by GC×GC. Indeed, several research groups in recent years have applied GC×GC and chemometric data analysis to demonstrate the potential of these analytical tools to address important compliance (tax evasion, tax credits, physical quality standards) and forensic (arson investigations, oil spills) applications involving fuels. Nonetheless, routine use of GC×GC in forensic laboratories has been limited largely by (1) legal and regulatory guidelines, (2) lack of chemometrics training, and (3) concerns about the reproducibility of GC×GC. The goal of this review is to highlight recent advances in one-dimensional GC (1D-GC) and GC×GC analyses of fuels for compliance and forensic applications, to assist scientists in overcoming the aforementioned hindrances. An introduction to 1D-GC principles, GC×GC technology (column stationary phases and modulators) and several chemometric methods is provided. More specifically, chemometric methods are broken down into (1) signal pre-processing, (2) peak decomposition, identification and quantification, and (3) classification and pattern recognition. Examples of compliance and forensic applications are discussed with particular emphasis on the demonstrated success of the employed chemometric methods. This review aims to make 1D-GC and GC×GC coupled with chemometric data analysis tools more accessible to the larger scientific community, and to aid in eventual widespread standardization.
Chromatographic preprocessing of GC–MS data for analysis of complex chemical mixtures
Journal of Chromatography A, 2005
Hyphenated analytical techniques such as gas chromatography-mass spectrometry (GC-MS) can provide extensive amounts of analytical data when applied to environmental samples. Quantitative analyses of complex contaminant mixtures by commercial preprocessing software are time-consuming, and baseline distortion and incomplete peak resolution increase the uncertainty and subjectivity of peak quantification. Here, we present a semi-automatic method developed specifically for processing complex first-order chromatographic data (e.g. selected ion monitoring in GC-MS) prior to chemometric data analysis. Chromatograms are converted into semi-quantitative variables (e.g. diagnostic ratios (DRs)) that can be exported directly to appropriate software. The method is based on automatic peak matching, initial parameterization, and alternating background noise reduction and peak estimation using mathematical functions (Gaussian and exponential-Gaussian hybrid) with few (i.e. three to four) parameters. It is capable of resolving convoluted peaks, and the exponential-Gaussian hybrid improves the description of asymmetric peaks (i.e. fronting and tailing). The optimal data preprocessing suggested in this article consists of estimation of Gaussian peak parameters and subsequent calculation of diagnostic ratios from peak heights. We tested the method on chromatographic data from 20 replicate oil samples and found it to be less time-consuming and subjective than commercial software, and with comparable data quality.
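The last step described above (Gaussian peak parameters, then diagnostic ratios from peak heights) can be sketched in Python as follows. This is not the paper's fitting procedure, which alternates background reduction and peak estimation. It swaps in a simple three-point log-parabola estimate that is exact only for noise-free, baseline-corrected Gaussian peaks. The function names, the synthetic peaks, and the definition of the diagnostic ratio as a plain height ratio are assumptions made for illustration.

```python
import math


def gaussian(x, height, center, sigma):
    """Three-parameter Gaussian peak model."""
    return height * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def three_point_gaussian(x_mid, step, y_left, y_mid, y_right):
    """Estimate (height, center, sigma) from three equally spaced,
    baseline-corrected points around the apex. The logarithm of a
    Gaussian is a parabola, so three points determine it exactly."""
    l1, l2, l3 = math.log(y_left), math.log(y_mid), math.log(y_right)
    curvature = l1 - 2.0 * l2 + l3           # negative for a peak
    a = curvature / (2.0 * step ** 2)        # a = -1 / (2 * sigma^2)
    offset = step * (l1 - l3) / (2.0 * curvature)
    sigma = math.sqrt(-1.0 / (2.0 * a))
    height = math.exp(l2 - a * offset ** 2)
    return height, x_mid + offset, sigma


def diagnostic_ratio(height_a, height_b):
    """Assumed DR definition: ratio of two estimated peak heights."""
    return height_a / height_b


# Synthetic, noise-free peaks sampled on an integer time grid (hypothetical).
peak_a = [gaussian(x, 100.0, 10.3, 1.2) for x in (9.0, 10.0, 11.0)]
peak_b = [gaussian(x, 250.0, 20.1, 1.5) for x in (19.0, 20.0, 21.0)]

height_a, center_a, sigma_a = three_point_gaussian(10.0, 1.0, *peak_a)
height_b, center_b, sigma_b = three_point_gaussian(20.0, 1.0, *peak_b)
dr = diagnostic_ratio(height_a, height_b)
print(round(height_a, 3), round(center_a, 3), round(sigma_a, 3), round(dr, 3))
```

With real chromatograms a least-squares fit over the whole peak would be more robust to noise; the three-point form is used here only because it keeps the example dependency-free.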
Journal of Chromatography A, 2013
The present manuscript is focused on the evaluation of a novel high-speed triple quadrupole mass spectrometer (QqQ MS), carried out under extreme GC conditions, namely those generated by a flow-modulated (FM) comprehensive two-dimensional GC (GC × GC) system. The novel QqQ MS system is capable of operation under high-speed conditions, in both full-scan (maximum scan speed: 20,000 amu/s) and multiple reaction monitoring (MRM) modes. Moreover, the QqQ MS instrument can generate simultaneous full-scan/MRM data, also in a very rapid manner. An FM GC × GC-MSMS method was developed for the simultaneous full-scan qualitative analysis of untargeted essential oil compounds, and MRM quali/quantitative analysis of targeted ones, namely three preservatives [o-phenylphenol (OPP), butylated hydroxytoluene (BHT), butylated hydroxyanisole (BHA)]. The QqQ MS system generated a sufficient number of data points per peak for both qualitative and quantitative purposes. The sensitivity reached through the MRM analysis far exceeded that required by current regulations. Method validation, related to the MRM analysis, was performed considering retention time, peak area and ion ratio repeatability, limits of detection and quantification, and accuracy. Additionally, a spearmint essential oil was spiked with 5 phytosanitary compounds at the 1 ppb level, and analysed through an MRM-only GC × GC-MSMS application. Emphasis was placed not only on sensitivity (satisfactory for all the contaminants), but also on the importance of precursor ion selection and of the GC × GC separation process. Finally, sensitivity was compared between the MRM and SIM modes, in scan/MRM, MRM, scan/SIM and SIM analyses, performed on a mixture of 22 phytosanitary products at concentrations in the 50-150 ppb range.
Analytica Chimica Acta, 2003
The two-dimensional (2D) data structure generated by a high-resolution GC × GC system with a small number of samplings taken across the first dimension is evaluated for the purpose of applying chemometric deconvolution methods. Chemometric techniques such as the generalized rank annihilation method (GRAM) place high demands on the reproducibility of chromatographic experiments. For GRAM to be employed for GC × GC data interpretation, it is critical that the separation method provides data with a bilinear structure; the peak shape and retention times on both columns must be reproducible. With a limited number of samplings across a ¹D (first-dimension) peak (e.g. four to six samplings), repeatability of the pattern of the modulated peaks (controlled by the modulation phase) becomes important in producing a bilinear data structure. Reproducibility of the modulation phase can be affected by both the reliability of the modulation period and the reproducibility of the retention time of the peak on the first column (which arises from oven temperature and carrier flow rate stability). Evaluation of within-run and run-to-run retention time reproducibility (retention time uncertainty) on both columns, and of modulation phase reproducibility, was undertaken using a modulated cryogenic system for a pair of overlapping components (fatty acid methyl esters). An investigation of whether the data quality permits quantification of each component by GRAM deconvolution was also conducted. Less than 4% run-to-run retention time uncertainty was obtained on column 1 and less than 9% run-to-run and within-run retention time uncertainty was obtained on column 2, where these R.S.D. measures are reported normalised to peak widths on each respective dimension. The R.S.D. of duplicate quantification results by GRAM ranged from 2 to 26%, although the average quantification error using GRAM was less than 5%.
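The width-normalised retention time uncertainty reported above could be computed along the following lines. This is only one possible reading of the abstract: the metric is assumed to be the sample standard deviation of replicate retention times expressed as a percentage of the peak width on that dimension, and the function name, retention times, and peak width are hypothetical.

```python
from statistics import stdev


def width_normalised_rsd(retention_times, peak_width):
    """Standard deviation of replicate retention times as a percentage
    of the peak width on the same dimension (assumed definition)."""
    return 100.0 * stdev(retention_times) / peak_width


# Hypothetical first-column retention times (s) from five runs, and an
# assumed first-dimension peak width of 24 s.
first_dim_times = [600.0, 600.4, 599.8, 600.2, 599.6]
uncertainty = width_normalised_rsd(first_dim_times, peak_width=24.0)
print(round(uncertainty, 2))
```

Normalising to peak width rather than to the retention time itself expresses the shift relative to the feature that matters for a bilinear data structure, namely how far a peak moves compared with its own extent.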