Frank Westad - Profile on Academia.edu

Papers by Frank Westad

Data Pre-processing and Sensor-Fusion for Multivariate Statistical Process Control of an Extrusion Process

A retrospective view on non-linear methods in chemometrics, and future directions

Frontiers in analytical science, May 24, 2024

This perspective article reviews how the chemometrics community approached non-linear methods in the early years. In addition to the basic chemometric methods, some methods that fall under the term "machine learning" are also mentioned. Thereafter, types of non-linearity are briefly presented, followed by discussions on important aspects of modeling related to non-linear data. Lastly, a simulated data set with non-linear properties is analyzed for quantitative prediction and batch monitoring. The conclusion is that the latent variable methods to a large extent handle non-linearities by adding more linear combinations of the original variables. Nevertheless, with strong non-linearities between the X and Y space, non-linear methods such as Support Vector Machines might improve prediction performance at the cost of interpretability into both the sample and variable space. Applying multiple local models can improve performance compared to a single global model, of both linear and non-linear nature. When non-linear methods are applied, the need for conservative model validation is even more important. Another approach is pre-processing of the data, which can make the data more linear before the actual modeling and prediction phase.

Independent Component Analysis

Elsevier eBooks, 2009

This chapter presents the concept and theory of independent component analysis (ICA). The method originated from signal processing research, where unknown signal sources are mixed into a new set of signals. The general objective of separating signals into pure sources is called blind source separation (BSS). ICA has been shown to be useful in solving the BSS problem, and if the pure sources are found, the mixing system may also be estimated. Similar situations occur in chemometrics where the pure spectra of chemical compounds and their concentrations are observed with different types of instrumentation such as spectroscopy. The necessity of proper validation in ICA is emphasized and put into a chemometric framework. Examples on simulated as well as spectroscopic data are shown to illustrate the potential of ICA in chemometric applications.

Multivariate Statistical Process Control and Classification Applied on Prostate Cancer Screening

Journal of biomedical research & environmental sciences, Jun 1, 2023

Variable Selection and Redundancy in Multivariate Regression Models

Frontiers in analytical science, Jun 9, 2022

Variable selection is a topic of interest in many scientific communities. Within chemometrics, where the number of variables for multi-channel instruments like NIR spectroscopy and metabolomics in many situations is larger than the number of samples, the strategy has been to use latent variable regression methods to overcome the challenges with multiple linear regression. Thereby, there is no need to remove variables as such, as the low-rank models handle collinearity and redundancy. In most studies on variable selection, the main objective was to compare the prediction performance (RMSE or accuracy in classification) between various methods. Nevertheless, different methods with the same objective will, in most cases, give results that are not significantly different. In this study, we present three other main objectives: i) to eliminate variables that are not relevant; ii) to return a small subset of variables that has the same or better prediction performance as a model with all original variables; and iii) to investigate the consistency of these small subsets.

An Overview of Chemometrics for the Engineering and Measurement Sciences

John Wiley & Sons, Inc. eBooks, Jul 1, 2016

Probeless non-invasive near-infrared spectroscopic bioprocess monitoring using microspectrometer technology

Analytical and Bioanalytical Chemistry, Dec 4, 2019

Real-time measurements and adjustments of critical process parameters are essential for the precise control of fermentation processes and thus for increasing both quality and yield of the desired product. However, the measurement of some crucial process parameters such as biomass, product, and product precursor concentrations usually requires time-consuming offline laboratory analysis. In this work, we demonstrate the in-line monitoring of biomass, penicillin (PEN), and phenoxyacetic acid (POX) in a Penicillium chrysogenum fed-batch fermentation process using low-cost microspectrometer technology operating in the near-infrared (NIR). In particular, NIR reflection spectra were taken directly through the glass wall of the bioreactor, which eliminates the need for an expensive NIR immersion probe. Furthermore, the risk of contamination in the reactor is significantly reduced, as no direct contact with the investigated medium is required. NIR spectra were acquired using two sensor modules covering the spectral ranges 1350-1650 nm and 1550-1950 nm. Based on offline reference analytics, partial least squares (PLS) regression models were established for biomass, PEN, and POX using data from the two sensors either separately or jointly. The established PLS models were tested on an independent validation fed-batch experiment. Root mean squared errors of prediction (RMSEP) were 1.61 g/L, 1.66 g/L, and 0.67 g/L for biomass, PEN, and POX, respectively, which can be considered an acceptable accuracy comparable with previously published results using standard process spectrometers with immersion probes. Altogether, the presented results underpin the potential of low-cost microspectrometer technology in real-time bioprocess monitoring applications.
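
For readers unfamiliar with the RMSEP figure quoted in this abstract, the following is a minimal sketch of how such a value is commonly computed from paired reference and predicted concentrations. The function name `rmsep` and the numbers are hypothetical and invented for illustration; they are not data from the paper, and the paper's own computation may differ in detail.

```python
import math

def rmsep(reference, predicted):
    """Root mean squared error of prediction over paired values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical offline reference values and model predictions in g/L,
# invented for illustration only.
reference_g_per_l = [10.0, 12.0, 15.0, 20.0]
predicted_g_per_l = [11.0, 11.0, 17.0, 18.0]

print(rmsep(reference_g_per_l, predicted_g_per_l))
```

The result is in the same unit as the inputs, which is why the abstract can report it directly in g/L.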

Effect of acetic acid, caproic acid and tryptamine on voluntary intake of grass silage by growing cattle

Grass and Forage Science, Feb 14, 2012

The objective of this study was to identify and quantify fermentation end-products, detected with chromatographic techniques, that were negatively related to intake of grass silage by cattle. Further, the aim was to verify the intake-depressing effect of these compounds in a feeding trial. A set of twenty-four silages that had been used in a previous study to model variations in intake owing to fermentation quality was reanalysed with liquid and gas chromatography. Known and unknown chromatogram peaks were subjected to a regression analysis to determine which were negatively related to intake. Compounds were identified and quantified using a liquid chromatography-mass spectrometry system; acetic acid (AcA), caproic acid and tryptamine were chosen for verification. Growing steers were offered wilted silage with these three compounds added, separately or as a mixture, in proportions similar to the maximum values detected in the silages of the previous study. Dietary addition of AcA, either separately or mixed with the other two compounds, reduced silage dry matter (DM) intake. However, the reduction in silage DM intake equalled the amount provided by the added substances, so that no differences in total DM intake were observed for any of the dietary treatments.

Relevance and Parsimony in Multivariate Modelling

O2-PLS regression and jack-knife based variable selection

Online spectroscopy in microemulsions – A process analytical approach for hydroformylation miniplant II - Calibration and prediction by Raman spectra

The Collaborative Research Center InPROMPT aims to establish a novel process concept for the hydroformylation of long-chained olefins, using a rhodium complex as catalyst in the presence of syngas. Recently, the hydroformylation in micro-emulsions, which allows for the efficient recycling of the expensive rhodium catalyst, was found to be feasible. However, the temperature- and concentration-sensitive multi-phase system demands continuous observation of the reaction to achieve an operational and economically feasible plant operation. For that purpose, we tested the potential of both NMR and Raman spectroscopy for process control assistance. The lab-scale experiments were supported by sampling for off-line GC analysis as reference analytics. The results of the NMR experiments will be part of another contribution.

Online spectroscopy in microemulsions – A process analytical approach for a hydroformylation mini-plant

Within the Collaborative Research Center InPROMPT, a novel process concept for the hydroformylation of long-chained olefins is studied in a mini-plant, using a rhodium complex as catalyst in the presence of syngas. Recently, the hydroformylation in microemulsions, which allows for the efficient recycling of the expensive rhodium catalyst, was found to be feasible. However, the high sensitivity of this multi-phase system with regard to changes in temperature and composition demands continuous observation of the reaction to achieve a reliable and economic plant operation. For that purpose, we tested the potential of both online NMR and Raman spectroscopy for process control. The lab-scale experiments were supported by off-line GC analysis as a reference method. A fiber-optic coupled probe of a process Raman spectrometer was directly integrated into the reactor. 25 mixtures with varying concentrations of olefin (1-dodecene), product (n-tridecanal), water, n-dodecane, and technical surfactant (Marlipal 24/70) were prepared according to a D-optimal design. Online NMR spectroscopy was implemented by using a flow probe equipped with 1/16” PFA tubing serving as a flow cell. This was hyphenated to the reactor within a thermostated bypass to maintain process conditions in the transfer lines. Partial least squares regression (PLSR) models were established based on the initial spectra after activation of the reaction with syngas for the prediction of unknown concentrations of 1-dodecene and n-tridecanal over the course of the reaction in the lab-scale system. The obtained Raman spectra not only contain information on the chemical composition but are also affected by the emulsion properties of the mixtures, which depend on the phase state and the type of micelles. Based on the spectral signature of both Raman and NMR spectra, it could be deduced that, especially in reaction mixtures with high 1-dodecene content, the formation of isomers as a competitive reaction dominated. Similar trends were also observed during some of the process runs in the mini-plant. The multivariate calibration allowed for the estimation of reactants and products of the hydroformylation reaction in both laboratory setup and mini-plant.

CasTGAN: Cascaded Generative Adversarial Network for Realistic Tabular Data Synthesis

arXiv (Cornell University), Jul 1, 2023

Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data which can be utilized for multiple purposes. While GANs have demonstrated tremendous successes in producing synthetic data samples that replicate the dynamics of the original datasets, the validity of the synthetic data and the underlying privacy concerns represent major challenges which are not sufficiently addressed. In this work, we design a cascaded tabular GAN framework (CasTGAN) for generating realistic tabular data with a specific focus on the validity of the output. In this context, validity refers to the dependency between features that can be found in the real data, but is typically misrepresented by traditional generative models. Our key idea is that, by employing a cascaded architecture in which a dedicated generator samples each feature, the synthetic output becomes more representative of the real data. Our experimental results demonstrate that our model captures the constraints and the correlations between the features of the real data well, especially for the high dimensional datasets. Furthermore, we evaluate the risk of white-box privacy attacks on our model and subsequently show that applying some perturbations to the auxiliary learners in CasTGAN increases the overall robustness of our model against targeted attacks.

In Vivo Analysis of a Biodegradable Magnesium Alloy Implant in an Animal Model Using Near-Infrared Spectroscopy

Sensors

Biodegradable magnesium-based implants offer mechanical properties similar to natural bone, making them advantageous over nonbiodegradable metallic implants. However, monitoring the interaction between magnesium and tissue over time without interference is difficult. A noninvasive method, optical near-infrared spectroscopy, can be used to monitor tissue’s functional and structural properties. In this paper, we collected optical data from an in vitro cell culture medium and in vivo studies using a specialized optical probe. Spectroscopic data were acquired over two weeks to study the combined effect of biodegradable Mg-based implant disks on the cell culture medium in vivo. Principal component analysis (PCA) was used for data analysis. In the in vivo study, we evaluated the feasibility of using the near-infrared (NIR) spectra to understand physiological events in response to magnesium alloy implantation at specific time points (Day 0, 3, 7, and 14) after surgery. Our results show tha...

Automated Well-Log Depth Matching – 1D Convolutional Neural Networks Vs. Classic Cross Correlation

Petrophysics – The SPWLA Journal of Formation Evaluation and Reservoir Description, 2022

During drilling and logging, depth alignment of well logs acquired in the same borehole section at different times is a vital preprocessing step before any petrophysical analysis. Depth alignment requires high precision as depth misalignment between different log curve measurements can substantially suppress possible correlations between formation properties, leading to imprecise interpretation or even misinterpretation. Standard depth alignment involves cross correlation, which typically requires user intervention for reliability. To improve the depth alignment process, we apply deep-learning techniques and propose a simple and practical implementation of a one-dimensional (1D) supervised convolutional neural network (1D CNN). We train seven CNN models using different log measurements, such as gamma ray, resistivity, P- and S-wave sonic, density, neutron, and photoelectric factor (PEF), to estimate depth mismatches between the corresponding raw logging-while-drilling (LWD) and elec...

Automated Log Data Analytics Workflow – The Value of Data Access and Management to Reduced Turnaround Time for Log Analysis

Petrophysics – The SPWLA Journal of Formation Evaluation and Reservoir Description, 2022

The oil and gas industry of today is undergoing rapid digitalization. This implies a massive effort to transform standard work procedures and workflows into more efficient practices and implementations using machine learning (ML) and automation. This will enable geoscientists to explore and exploit vast amounts of data quickly and efficiently. To address these current industry challenges, we propose a pilot well-log database in HDF5 (Hierarchical Data Format version 5) format that can be continuously extended if new data become available. It also provides versatility for data preparation for further analysis. We show an alternative way to store and use log files in a hierarchical structure that is easy to understand and handle by research institutes, companies, and academia. We also touch upon well-log depth matching, a long-standing industry challenge, to synchronize data from different logging passes to a single depth reference. Having a robust automated solution for depth matchin...

Automation of depth matching using a structured well-log database: Prototype well example in the North Sea

SEG Technical Program Expanded Abstracts 2020, 2020

A retrospective look at cross model validation and its applicability in vibrational spectroscopy

Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, Jul 1, 2021

This paper presents how Cross Model Validation (CMV), also known as double cross validation, can be applied efficiently for variable selection in spectroscopic applications. The chosen applications are FT-IR spectroscopic measurements of mixtures of marzipan and NIR spectra of diesel fuels. Standard Normal Variate (SNV) is applied as a spectral pre-treatment to reduce baseline effects in the spectra for the FT-IR data, whereas the 2nd derivative was applied for the diesel fuels. Variable selection based on jack-knifing and frequency of significance from Cross Model Validation is employed for identifying non-relevant spectral regions as well as providing a relevant subset for model optimization. The results show a high degree of correspondence between the objectively found wavelength bands and the reported chemical interpretation found in the literature. In addition, the stability of the models due to conservative validation with respect to predictive performance is exemplified. Finally, an example of how downweighting variables ensures optimal prediction ability and detailed model interpretation is shown.
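
As a rough illustration of the SNV pre-treatment named in this abstract, the sketch below centres each spectrum on its own mean and scales it by its own standard deviation, which is the usual definition of the transform. The function name `snv` and the toy spectra are hypothetical and not taken from the paper; the sample standard deviation (n - 1 denominator) is an assumption, since implementations differ on this point.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Centre one spectrum on its own mean and scale by its own standard deviation."""
    centre = mean(spectrum)
    spread = stdev(spectrum)  # sample standard deviation (n - 1 denominator), an assumption
    if spread == 0:
        raise ValueError("a constant spectrum cannot be scaled")
    return [(value - centre) / spread for value in spectrum]

# Two hypothetical spectra with the same shape; the second has an additive
# offset and a multiplicative gain applied, invented for illustration.
spectrum_a = [1.0, 2.0, 4.0, 3.0]
spectrum_b = [2.0 * value + 5.0 for value in spectrum_a]

print(snv(spectrum_a))
print(snv(spectrum_b))
```

Because the transform works row by row, the two toy spectra become numerically identical after SNV, which is the sense in which additive and multiplicative baseline effects are reduced.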

Research paper thumbnail of Independent Component Analysis

Independent Component Analysis

Research paper thumbnail of Data Pre-processing and Sensor-Fusion for Multivariate Statistical Process Control of an Extrusion Process

Data Pre-processing and Sensor-Fusion for Multivariate Statistical Process Control of an Extrusion Process

Research paper thumbnail of A retrospective view on non-linear methods in chemometrics, and future directions

Frontiers in analytical science, May 24, 2024

This perspective article reviews how the chemometrics community approached non-linear methods in ... more This perspective article reviews how the chemometrics community approached non-linear methods in the early years. In addition to the basic chemometric methods, some methods that fall under the term "machine learning" are also mentioned. Thereafter, types of non-linearity are briefly presented, followed by discussions on important aspects of modeling related to non-linear data. Lastly, a simulated data set with non-linear properties is analyzed for quantitative prediction and batch monitoring. The conclusion is that the latent variable methods to a large extent handle non-linearities by adding more linear combinations of the original variables. Nevertheless, with strong nonlinearities between the X and Y space, non-linear methods such as Support Vector Machines might improve prediction performance at the cost of interpretability into both the sample and variable space. Applying multiple local models can improve performance compared to a single global model, of both linear and non-linear nature. When non-linear methods are applied, the need for conservative model validation is even more important. Another approach is pre-processing of the data which can make the data more linear before the actual modeling and prediction phase.

Research paper thumbnail of Independent Component Analysis

Independent Component Analysis

Elsevier eBooks, 2009

ABSTRACT This chapter presents the concept and theory of independent component analysis (ICA). Th... more ABSTRACT This chapter presents the concept and theory of independent component analysis (ICA). The method originated from signal processing research, where unknown signal sources are mixed to a new set of signals. This general objective of separating signals into pure sources is called blind source separation (BSS). ICA has been shown to be useful in solving the BSS problem, and if the pure sources are found, then also the mixing system may be estimated. Similar situations occur in chemometrics where the pure spectra of chemical compounds and their concentrations are observed with different type of instrumentation such as spectroscopy. The necessity of proper validation in ICA is emphasized and put into a chemometric framework. Examples on simulated as well as spectroscopic data are shown to illustrate the potential of ICA in chemometric applications.

Research paper thumbnail of Multivariate Statistical Process Control and Classification Applied on Prostate Cancer Screening

Journal of biomedical research & environmental sciences, Jun 1, 2023

Journal of Biomedical Research & Environmental Sciences main aim is to enhance the importance of ... more Journal of Biomedical Research & Environmental Sciences main aim is to enhance the importance of science and technology to the scientifi c community and also to provide an equal opportunity to seek and share ideas to all our researchers and scientists without any barriers to develop their career and helping in their development of discovering the world.

Research paper thumbnail of Variable Selection and Redundancy in Multivariate Regression Models

Frontiers in analytical science, Jun 9, 2022

Variable selection is a topic of interest in many scientific communities. Within chemometrics, wh... more Variable selection is a topic of interest in many scientific communities. Within chemometrics, where the number of variables for multi-channel instruments like NIR spectroscopy and metabolomics in many situations is larger than the number of samples, the strategy has been to use latent variable regression methods to overcome the challenges with multiple linear regression. Thereby, there is no need to remove variables as such, as the low-rank models handle collinearity and redundancy. In most studies on variable selection, the main objective was to compare the prediction performance (RMSE or accuracy in classification) between various methods. Nevertheless, different methods with the same objective will, in most cases, give results that are not significantly different. In this study, we present three other main objectives: i) to eliminate variables that are not relevant; ii) to return a small subset of variables that has the same or better prediction performance as a model with all original variables; and iii) to investigate the consistency of these small subsets.

Research paper thumbnail of An Overview of Chemometrics for the Engineering and Measurement Sciences

An Overview of Chemometrics for the Engineering and Measurement Sciences

John Wiley & Sons, Inc. eBooks, Jul 1, 2016

Research paper thumbnail of Probeless non-invasive near-infrared spectroscopic bioprocess monitoring using microspectrometer technology

Analytical and Bioanalytical Chemistry, Dec 4, 2019

Real-time measurements and adjustments of critical process parameters are essential for the preci... more Real-time measurements and adjustments of critical process parameters are essential for the precise control of fermentation processes and thus for increasing both quality and yield of the desired product. However, the measurement of some crucial process parameters such as biomass, product, and product precursor concentrations usually requires time-consuming offline laboratory analysis. In this work, we demonstrate the in-line monitoring of biomass, penicillin (PEN), and phenoxyacetic acid (POX) in a Penicillium chrysogenum fed-batch fermentation process using low-cost microspectrometer technology operating in the near-infrared (NIR). In particular, NIR reflection spectra were taken directly through the glass wall of the bioreactor, which eliminates the need for an expensive NIR immersion probe. Furthermore, the risk of contaminations in the reactor is significantly reduced, as no direct contact with the investigated medium is required. NIR spectra were acquired using two sensor modules covering the spectral ranges 1350-1650 nm and 1550-1950 nm. Based on offline reference analytics, partial least squares (PLS) regression models were established for biomass, PEN, and POX either using data from both sensors separately or jointly. The established PLS models were tested on an independent validation fed-batch experiment. Root mean squared errors of prediction (RMSEP) were 1.61 g/L, 1.66 g/L, and 0.67 g/L for biomass, PEN, and POX, respectively, which can be considered an acceptable accuracy comparable with previously published results using standard process spectrometers with immersion probes. Altogether, the presented results underpin the potential of low-cost microspectrometer technology in real-time bioprocess monitoring applications.

Research paper thumbnail of Effect of acetic acid, caproic acid and tryptamine on voluntary intake of grass silage by growing cattle

Grass and Forage Science, Feb 14, 2012

The objective of this study was to identify and quantify fermentation end-products, detected with... more The objective of this study was to identify and quantify fermentation end-products, detected with chromatographic techniques, that were negatively related to intake of grass silage by cattle. Further, the aim was to verify the intake-depressing effect of these compounds in a feeding trial. A set of twenty-four silages that had been used in a previous study to model variations in intake owing to fermentation quality was reanalysed with liquid and gas chromatography. Known and unknown chromatogram peaks were subjected to a regression analysis to determine which were negatively related to intake. Compounds were identified and quantified using a liquid chromatography-mass spectrometry system; acetic acid (AcA), caproic acid and tryptamine were chosen for verification. Growing steers were offered wilted silage with these three compounds added, separately or as a mixture, in proportions similar to the maximum values detected in the silages of the previous study. Dietary addition of AcA, either separately or mixed with the other two compounds, reduced silage dry matter (DM) intake. However, the reduction in silage DM intake equalled the amount provided by the added substances, so that no differences in total DM intake were observed for any of the dietary treatments.

Research paper thumbnail of Independent Component Analysis

Independent Component Analysis

Elsevier eBooks, 2009

Research paper thumbnail of Relevance and Parsimony in Multivariate Modelling

Relevance and Parsimony in Multivariate Modelling

Research paper thumbnail of O2-PLS regression and jack-knife based variable selection

O2-PLS regression and jack-knife based variable selection

Research paper thumbnail of Online spectroscopy in microemulsions – A process analytical approach for hydroformylation miniplant II - Calibration and prediction by Raman spectra

Online spectroscopy in microemulsions – A process analytical approach for hydroformylation miniplant II - Calibration and prediction by Raman spectra

The Collaborative Research Center InPROMPT aims to establish a novel process concept for the hydr... more The Collaborative Research Center InPROMPT aims to establish a novel process concept for the hydroformylation of long-chained olefins, using a rhodium complex as catalyst in the presence of syngas. Recently, the hydroformylation in micro-emulsions, which allows for the efficient recycling of the expensive rhodium catalyst, was found to be feasible. However, the temperature and concentration sensitive multi-phase system demands a continuous observation of the reaction to achieve an operational and economically feasible plant operation. For that purpose, we tested the potential of both NMR and Raman spectroscopy for process control assistance. The lab-scale experiments were supported by sampling for off-line GC-analysis as reference analytics. The results of the NMR experiments will be part of another contribution.

Research paper thumbnail of Online spectroscopy in microemulsions – A process analytical approach for a hydroformylation mini-plant

Online spectroscopy in microemulsions – A process analytical approach for a hydroformylation mini-plant

Within the Collaborative Research Center InPROMPT a novel process concept for the hydroformylatio... more Within the Collaborative Research Center InPROMPT a novel process concept for the hydroformylation of long-chained olefins is studied in a mini-plant, using a rhodium complex as catalyst in the presence of syngas. Recently, the hydroformylation in micro¬emulsions, which allows for the efficient recycling of the expensive rhodium catalyst, was found to be feasible. However, the high sensitivity of this multi-phase system with regard to changes in temperature and composition demands a continuous observation of the reaction to achieve a reliable and economic plant operation. For that purpose, we tested the potential of both online NMR and Raman spectroscopy for process control. The lab-scale experiments were supported by off-line GC-analysis as a reference method. A fiber optic coupled probe of a process Raman spectrometer was directly integrated into the reactor. 25 mixtures with varying concentrations of olefin (1-dodecene), product (n-tridecanal), water, n-dodecane, and technical surfactant (Marlipal 24/70) were prepared according to a D-optimal design. Online NMR spectroscopy was implemented by using a flow probe equipped with 1/16” PFA tubing serving as a flow cell. This was hyphenated to the reactor within a thermostated bypass to maintain process conditions in the transfer lines. Partial least squares regression (PLSR) models were established based on the initial spectra after activation of the reaction with syngas for the prediction of unknown concentrations of 1-dodecene and n-tridecanal over the course of the reaction in the lab-scale system. The obtained Raman spectra do not only contain information on the chemical composition but are further affected by the emulsion properties of the mixtures, which depend on the phase state and the type of micelles. 
Based on the spectral signature of both Raman and NMR spectra, it could be deduced that especially in reaction mixtures with high 1-dodecene content the formation of isomers as a competitive reaction was dominating. Similar trends were also observed during some of the process runs in the mini-plant. The multivariate calibration allowed for the estimation of reactants and products of the hydroformylation reaction in both laboratory setup and mini-plant.

CasTGAN: Cascaded Generative Adversarial Network for Realistic Tabular Data Synthesis

arXiv (Cornell University), Jul 1, 2023

Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data which can be utilized for multiple purposes. While GANs have demonstrated tremendous successes in producing synthetic data samples that replicate the dynamics of the original datasets, the validity of the synthetic data and the underlying privacy concerns represent major challenges which are not sufficiently addressed. In this work, we design a cascaded tabular GAN framework (CasTGAN) for generating realistic tabular data with a specific focus on the validity of the output. In this context, validity refers to the dependency between features that can be found in the real data, but is typically misrepresented by traditional generative models. Our key idea is that, by employing a cascaded architecture in which a dedicated generator samples each feature, the synthetic output becomes more representative of the real data. Our experimental results demonstrate that our model captures the constraints and the correlations between the features of the real data well, especially for high-dimensional datasets. Furthermore, we evaluate the risk of white-box privacy attacks on our model and subsequently show that applying some perturbations to the auxiliary learners in CasTGAN increases the overall robustness of our model against targeted attacks.

In Vivo Analysis of a Biodegradable Magnesium Alloy Implant in an Animal Model Using Near-Infrared Spectroscopy

Sensors

Biodegradable magnesium-based implants offer mechanical properties similar to natural bone, making them advantageous over nonbiodegradable metallic implants. However, monitoring the interaction between magnesium and tissue over time without interference is difficult. A noninvasive method, optical near-infrared spectroscopy, can be used to monitor tissue’s functional and structural properties. In this paper, we collected optical data from an in vitro cell culture medium and in vivo studies using a specialized optical probe. Spectroscopic data were acquired over two weeks to study the combined effect of biodegradable Mg-based implant disks on the cell culture medium in vivo. Principal component analysis (PCA) was used for data analysis. In the in vivo study, we evaluated the feasibility of using the near-infrared (NIR) spectra to understand physiological events in response to magnesium alloy implantation at specific time points (Day 0, 3, 7, and 14) after surgery. Our results show tha...

Automated Well-Log Depth Matching – 1D Convolutional Neural Networks Vs. Classic Cross Correlation

Petrophysics – The SPWLA Journal of Formation Evaluation and Reservoir Description, 2022

During drilling and logging, depth alignment of well logs acquired in the same borehole section at different times is a vital preprocessing step before any petrophysical analysis. Depth alignment requires high precision as depth misalignment between different log curve measurements can substantially suppress possible correlations between formation properties, leading to imprecise interpretation or even misinterpretation. Standard depth alignment involves cross correlation, which typically requires user intervention for reliability. To improve the depth alignment process, we apply deep-learning techniques and propose a simple and practical implementation of a one-dimensional (1D) supervised convolutional neural network (1D CNN). We train seven CNN models using different log measurements, such as gamma ray, resistivity, P- and S-wave sonic, density, neutron, and photoelectric factor (PEF), to estimate depth mismatches between the corresponding raw logging-while-drilling (LWD) and elec...
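
The classic cross-correlation baseline named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pearson` and `best_lag`, the assumption of two equally sampled curves, and the search over a fixed window of integer lags are all illustrative choices made here.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # a constant segment has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def best_lag(reference, shifted, max_lag):
    """Return the integer lag (in samples) that maximizes the correlation.

    A lag of k means reference[i] lines up with shifted[i + k].
    Assumption: both curves share the same constant sampling interval.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Keep only the samples where both curves overlap at this lag.
        pairs = [(reference[i], shifted[i + lag])
                 for i in range(len(reference))
                 if 0 <= i + lag < len(shifted)]
        if len(pairs) < 2:
            continue
        score = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if score is not None and score > best_score:
            best, best_score = lag, score
    return best


def synthetic_log(i):
    """Hypothetical smooth test curve standing in for a gamma-ray log."""
    return math.sin(0.3 * i) + 0.5 * math.sin(0.11 * i)


reference = [synthetic_log(i) for i in range(200)]
shifted = [synthetic_log(i - 3) for i in range(200)]  # same curve, 3 samples deeper
estimated_lag = best_lag(reference, shifted, max_lag=10)
```

Multiplying the estimated lag by the sampling interval gives the depth shift. A single global lag like this cannot represent stretch or squeeze along the borehole, which is one reason the classic approach tends to need user supervision.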

Automated Log Data Analytics Workflow – The Value of Data Access and Management to Reduced Turnaround Time for Log Analysis

Petrophysics – The SPWLA Journal of Formation Evaluation and Reservoir Description, 2022

The oil and gas industry of today is undergoing rapid digitalization. This implies a massive effort to transform standard work procedures and workflows into more efficient practices and implementations using machine learning (ML) and automation. This will enable geoscientists to explore and exploit vast amounts of data quickly and efficiently. To address these current industry challenges, we propose a pilot well-log database in HDF5 (Hierarchical Data Format version 5) format that can be continuously extended if new data become available. It also provides versatility for data preparation for further analysis. We show an alternative way to store and use log files in a hierarchical structure that is easy to understand and handle by research institutes, companies, and academia. We also touch upon well-log depth matching, a long-standing industry challenge, to synchronize data from different logging passes to a single depth reference. Having a robust automated solution for depth matchin...

Automation of depth matching using a structured well-log database: Prototype well example in the North Sea

SEG Technical Program Expanded Abstracts 2020, 2020

A retrospective look at cross model validation and its applicability in vibrational spectroscopy

Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, Jul 1, 2021

This paper presents how Cross Model Validation (CMV), also known as double cross-validation, can be applied efficiently for variable selection in spectroscopic applications. The chosen applications are FT-IR spectroscopic measurements of mixtures of marzipan and NIR spectra of diesel fuels. Standard Normal Variate (SNV) is applied as a spectral pre-treatment to reduce baseline effects in the FT-IR data, whereas a 2nd derivative is applied to the diesel fuel spectra. Variable selection based on jack-knifing and frequency of significance from Cross Model Validation is employed to identify non-relevant spectral regions and to provide a relevant subset for model optimization. The results show a high degree of correspondence between the objectively found wavelength bands and the chemical interpretation reported in the literature. In addition, the stability of the models with respect to predictive performance, resulting from conservative validation, is exemplified. Finally, an example shows how downweighting variables ensures optimal prediction ability and detailed model interpretation.
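
The SNV pre-treatment mentioned in the abstract above centers each spectrum by its own mean and scales it by its own standard deviation. A minimal sketch follows; the function name `snv`, the use of the sample standard deviation (n - 1), and the example values are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: row-wise centering and scaling of one spectrum."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1); an assumed convention
    if s == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(value - m) / s for value in spectrum]


# Hypothetical absorbance values; the second spectrum is the first one with an
# additive offset and a multiplicative scaling applied.
raw = [0.10, 0.35, 0.80, 0.55, 0.20]
distorted = [2.0 * value + 5.0 for value in raw]

snv_raw = snv(raw)
snv_distorted = snv(distorted)
```

Because SNV works on each spectrum independently, the two transformed spectra coincide up to rounding. This is why it suppresses additive and multiplicative effects of the kind the abstract describes.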

Independent Component Analysis