Mohamed Aziz Bhouri | Massachusetts Institute of Technology (MIT)

Papers by Mohamed Aziz Bhouri

Gaussian processes meet NeuralODEs: a Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data

Philosophical Transactions of the Royal Society A, Jun 20, 2022

This paper presents a machine learning framework (GP-NODE) for Bayesian systems identification from partial, noisy and irregular observations of nonlinear dynamical systems. The proposed method takes advantage of recent developments in differentiable programming to propagate gradient information through ordinary differential equation solvers and perform Bayesian inference with respect to unknown model parameters using Hamiltonian Monte Carlo sampling and Gaussian Process priors over the observed system states. This allows us to exploit temporal correlations in the observed data, and efficiently infer posterior distributions over plausible models with quantified uncertainty. Moreover, the use of sparsity-promoting priors such as the Finnish Horseshoe for free model parameters enables the discovery of interpretable and parsimonious representations for the underlying latent dynamics. A series of numerical studies is presented to demonstrate the effectiveness of the proposed GP-NODE method, including predator-prey systems, systems biology, and a 50-dimensional human motion dynamical system. Taken together, our findings put forth a novel, flexible and robust workflow for data-driven model discovery under uncertainty.

Curved Beam Based Model for Piston-Ring Designs in Internal Combustion Engines: Closed Shape Within a Flexible Band, Free-Shape and Force in Circular Bore Study

SAE technical paper series, Apr 3, 2018

The behavior of a piston ring is closely tied to oil consumption, friction, wear and blow-by in internal combustion engines. This behavior varies along the ring's circumference, and determining these variations is essential for developing ring-packs that achieve the desired sealing and conformability. A study based on a straight beam model had already been developed, but it does not consider the lubrication sub-models, the tip gap effects, or the characterization of the ring free shape from an arbitrary final closed shape. In this work, three numerical curved-beam-based models were developed to study the performance of the piston ring-pack. The conformability model characterizes the behavior of the ring within the engine. It adopts the curved beam model and accounts for ring-bore and ring-groove interactions, which include asperity and lubrication forces. Gas forces are also included, along with inertia and the initial ring tangential load. The model allows for thermal distortion of the bore and of the upper and lower groove flanks, and it accounts for the thermal expansion of the ring and the temperature gradient from inner diameter (ID) to outer diameter (OD). The piston secondary motion, the variation of oil viscosity on the liner with temperature, the presence of fuel, and the different hydrodynamic cases (partially and fully flooded) are considered as well. This model reveals the ring position relative to the groove as a function of friction, inertia and gas pressures. It also characterizes the effect of non-uniform oil distribution on the liner and groove flanks. Finally, the ring gap position within a distorted bore reveals the sealing performance of the ring.
Using the curved beam model, we also developed a module that computes the static twist under a fixed ID or OD constraint. The static twist is an experimental characterization of the ring in which the user taps on the ring until there is a minimum clearance between the ring's lowest point and the lower plate all along the ring's circumference, without any force contact. Our last model includes four sub-models that relate the ring free shape, its final shape when subjected to a constant radial pressure (called the ovality), and the force distribution in a circular bore. Given any one of these distributions, the model determines the other two. This tool is useful because rings are characterized experimentally by measuring their ovality, which is more accurate than measuring the free shape or the force distribution in a circular bore. A model that takes the ovality as an input is therefore better matched to the experiments used to characterize the ring.

A two-level parameterized Model-Order Reduction approach for time-domain elastodynamics

Computer Methods in Applied Mechanics and Engineering, Nov 1, 2021

We present a two-level parameterized Model Order Reduction (pMOR) technique for the linear hyperbolic Partial Differential Equation (PDE) of time-domain elastodynamics. In order to approximate the frequency-domain PDE, we take advantage of the Port-Reduced Reduced-Basis Component (PR-RBC) method to develop (in the offline stage) reduced bases for subdomains; the latter are then assembled (in the online stage) to form the global domains of interest. The PR-RBC approach reduces the effective dimensionality of the parameter space and also provides flexibility in topology and geometry. In the online stage, for each query, we consider a given parameter value and associated global domain. In the first level of reduction, the PR-RBC reduced bases are used to approximate the frequency-domain solution at selected frequencies. In the second level of reduction, these instantiated PR-RBC approximations are used as surrogate truth solutions in a Strong Greedy approach to identify a reduced basis space; the PDE of time-domain elastodynamics is then projected on this reduced space. We provide a numerical example to demonstrate the computational capability and assess the performance of the proposed two-level approach.
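
The second level of reduction described above relies on a strong greedy selection. As a rough illustration only, the sketch below applies a generic strong greedy loop to a small set of plain vectors with the Euclidean inner product. The snapshot values, the function names and the tolerance are assumptions made for this example; in the paper the snapshots would be PR-RBC frequency-domain approximations, which are not reproduced here.

```python
import math

def dot(u, v):
    """Euclidean inner product (an assumption; a PDE setting would use an energy inner product)."""
    return sum(a * b for a, b in zip(u, v))

def residual(v, basis):
    """Component of v orthogonal to the span of an orthonormal basis (modified Gram-Schmidt)."""
    r = list(v)
    for q in basis:
        c = dot(r, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return r

def strong_greedy(snapshots, tol=1e-10, max_size=None):
    """Repeatedly add the snapshot that is worst approximated by the current basis."""
    basis, errors = [], []
    max_size = len(snapshots) if max_size is None else max_size
    while len(basis) < max_size:
        residuals = [residual(s, basis) for s in snapshots]
        norms = [math.sqrt(dot(r, r)) for r in residuals]
        worst = max(range(len(snapshots)), key=norms.__getitem__)
        errors.append(norms[worst])  # largest projection error before enrichment
        if norms[worst] <= tol:
            break
        basis.append([x / norms[worst] for x in residuals[worst]])
    return basis, errors

# Hypothetical snapshot set of rank 2 embedded in a 3-dimensional space.
snapshots = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 2.0, 0.0], [3.0, -1.0, 0.0]]
basis, errors = strong_greedy(snapshots)
print(len(basis), errors)
```

The loop stops once the worst projection error falls below the tolerance, so the basis size reflects the effective rank of the snapshot set rather than the number of snapshots.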

Curved Beam Based Model for Piston-Ring Designs in Internal Combustion Engines: Working Engine Conditions Study

SAE technical paper series, Apr 3, 2018

Bayesian differential programming for robust systems identification under uncertainty

Proceedings of The Royal Society A: Mathematical, Physical and Engineering Sciences, Nov 1, 2020

This paper presents a machine learning framework for Bayesian systems identification from noisy, sparse and irregular observations of nonlinear dynamical systems. The proposed method takes advantage of recent developments in differentiable programming to propagate gradient information through ordinary differential equation solvers and perform Bayesian inference with respect to unknown model parameters using Hamiltonian Monte Carlo. This allows us to efficiently infer posterior distributions over plausible models with quantified uncertainty, while the use of sparsity-promoting priors enables the discovery of interpretable and parsimonious representations for the underlying latent dynamics. A series of numerical studies is presented to demonstrate the effectiveness of the proposed methods, including nonlinear oscillators, predator-prey systems, chaotic dynamics and systems biology. Taken together, our findings put forth a novel, flexible and robust workflow for data-driven model discovery under uncertainty. All code and data accompanying this manuscript are available at https://github.com/PredictiveIntelligenceLab/BayesianDifferentiableProgramming.
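
For readers unfamiliar with Hamiltonian Monte Carlo, the sketch below shows one textbook form of the sampler: a leapfrog integrator followed by a Metropolis accept/reject step. It is not the authors' implementation. The target is a standard Gaussian potential chosen for illustration, the mass matrix is assumed to be the identity, and the names `leapfrog`, `hmc_step`, `U` and `grad_U` are hypothetical. In the paper's setting the gradient would come from differentiating through an ODE solver, which is not shown.

```python
import math
import random

def leapfrog(q, p, grad_U, step, n_steps):
    """Integrate Hamilton's equations for H(q, p) = U(q) + |p|^2 / 2."""
    q, p = list(q), list(p)
    g = grad_U(q)
    p = [pi - 0.5 * step * gi for pi, gi in zip(p, g)]  # initial half step in momentum
    for i in range(n_steps):
        q = [qi + step * pi for qi, pi in zip(q, p)]    # full step in position
        g = grad_U(q)
        scale = step if i < n_steps - 1 else 0.5 * step  # final momentum update is a half step
        p = [pi - scale * gi for pi, gi in zip(p, g)]
    return q, p

def hmc_step(q, U, grad_U, step, n_steps, rng):
    """One HMC transition: resample momentum, simulate, then accept or reject."""
    p0 = [rng.gauss(0.0, 1.0) for _ in q]
    q_new, p_new = leapfrog(q, p0, grad_U, step, n_steps)
    h0 = U(q) + 0.5 * sum(v * v for v in p0)
    h1 = U(q_new) + 0.5 * sum(v * v for v in p_new)
    accept = rng.random() < math.exp(min(0.0, h0 - h1))
    return (q_new if accept else list(q)), accept

# Toy target (assumption): standard normal, so U(q) = |q|^2 / 2 and grad U(q) = q.
def U(q):
    return 0.5 * sum(v * v for v in q)

def grad_U(q):
    return list(q)

rng = random.Random(0)
q, n_accepted = [1.0, -0.5], 0
for _ in range(200):
    q, accepted = hmc_step(q, U, grad_U, step=0.1, n_steps=20, rng=rng)
    n_accepted += accepted
print("acceptance rate:", n_accepted / 200)
```

The leapfrog scheme is time-reversible and nearly energy-conserving, which is what keeps the acceptance rate high even for long trajectories.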

History-Based, Bayesian, Closure for Stochastic Parameterization: Application to Lorenz '96

arXiv (Cornell University), Oct 26, 2022

Physical parameterizations are used as representations of unresolved subgrid processes within weather and global climate models or coarse-scale turbulent models, whose resolutions are too coarse to resolve small-scale processes. These parameterizations are typically grounded in physically based, yet empirical, representations of the underlying small-scale processes. Machine learning-based parameterizations have recently been proposed as an alternative and have shown great promise to reduce uncertainties associated with small-scale processes. Yet, those approaches still show some important mismatches that are often attributed to stochasticity in the considered process. This stochasticity can be due to noisy data, unresolved variables or simply to the inherent chaotic nature of the process. To address these issues, we develop a new type of parameterization (closure) which is based on a Bayesian formalism for neural networks, to account for uncertainty quantification, and includes memory, to account for the non-instantaneous response of the closure. To overcome the curse of dimensionality of Bayesian techniques in high-dimensional spaces, the Bayesian strategy relies on Hamiltonian Monte Carlo Markov Chain sampling, which takes advantage of the gradients of the likelihood function and kinetic energy with respect to the parameters to accelerate the sampling process. We apply the proposed Bayesian history-based parameterization to the Lorenz '96 model in the presence of noisy and sparse data, similar to satellite observations, and show its capacity to produce skillful forecasts of the resolved variables while returning trustworthy uncertainty quantifications for different sources of error. This approach paves the way for the use of Bayesian approaches for closure problems.

Keywords: Stochastic Parameterization • Bayesian Surrogate Models • Uncertainty Quantification • Online Testing • Chaotic Dynamical System • Hamiltonian Monte Carlo Markov Chain • Neural Networks

Lead Paragraph: Climate models involve physical processes with different scales. Given the available computational resources, small-scale processes are not resolved in global climate models but rather represented by parameterization schemes. Machine learning has been recently used to improve existing parameterization approaches, yet those methods still show some important mismatches that are often attributed to stochasticity in the considered processes. This stochasticity can be due to noisy data, unresolved physical variables or simply to the inherent chaotic nature of the process. In this work, we develop a probabilistic parameterization scheme capable of predicting skillful forecasts of the resolved physical variables while returning trustworthy uncertainty quan

Gaussian processes meet NeuralODEs: A Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data

arXiv (Cornell University), Mar 4, 2021

This paper presents a machine learning framework (GP-NODE) for Bayesian systems identification from partial, noisy and irregular observations of nonlinear dynamical systems. The proposed method takes advantage of recent developments in differentiable programming to propagate gradient information through ordinary differential equation solvers and perform Bayesian inference with respect to unknown model parameters using Hamiltonian Monte Carlo sampling and Gaussian Process priors over the observed system states. This allows us to exploit temporal correlations in the observed data, and efficiently infer posterior distributions over plausible models with quantified uncertainty. Moreover, the use of sparsity-promoting priors such as the Finnish Horseshoe for free model parameters enables the discovery of interpretable and parsimonious representations for the underlying latent dynamics. A series of numerical studies is presented to demonstrate the effectiveness of the proposed GP-NODE method, including predator-prey systems, systems biology, and a 50-dimensional human motion dynamical system. Taken together, our findings put forth a novel, flexible and robust workflow for data-driven model discovery under uncertainty.

COVID-19 dynamics across the US: A deep learning study of human mobility and social behavior

medRxiv (Cold Spring Harbor Laboratory), Sep 23, 2020

Memory-based parameterization with differentiable solver: Application to Lorenz ’96

Chaos, Jul 1, 2023

Physical parameterizations (or closures) are used as representations of unresolved subgrid processes within weather and global climate models or coarse-scale turbulent models, whose resolutions are too coarse to resolve small-scale processes. These parameterizations are typically grounded in physically based, yet empirical, representations of the underlying small-scale processes. Machine learning-based parameterizations have recently been proposed as an alternative solution and have shown great promise to reduce uncertainties associated with the parameterization of small-scale processes. Yet, those approaches still show some important mismatches that are often attributed to the stochasticity of the considered process. This stochasticity can be due to coarse temporal resolution, unresolved variables, or simply to the inherent chaotic nature of the process. To address these issues, we propose a new type of parameterization (closure), which is built using memory-based neural networks, to account for the non-instantaneous response of the closure and to enhance its stability and prediction accuracy. We apply the proposed memory-based parameterization, with differentiable solver, to the Lorenz ’96 model in the presence of a coarse temporal resolution and show its capacity to produce skillful forecasts of the resolved variables over a long time horizon compared to instantaneous parameterizations. This approach paves the way for the use of memory-based parameterizations for closure problems.
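
To make the idea of a memory-based closure concrete, the sketch below integrates the single-scale Lorenz '96 equations, dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F, and adds a closure term that is computed from a short history of past states. Everything beyond the standard Lorenz '96 tendency is an assumption for illustration: the function names, the `memory` length, the choice to hold the closure term fixed within each Runge-Kutta step, and the toy closure itself, which merely stands in for a trained neural network. The sketch is not differentiable and is not the paper's solver.

```python
from collections import deque

def lorenz96_tendency(x, forcing, subgrid=None):
    """Single-scale Lorenz '96 tendency with periodic indexing, plus an optional subgrid term."""
    n = len(x)
    dx = [(x[(k + 1) % n] - x[k - 2]) * x[k - 1] - x[k] + forcing for k in range(n)]
    if subgrid is not None:
        dx = [d + s for d, s in zip(dx, subgrid)]
    return dx

def rk4_step(x, dt, forcing, subgrid=None):
    """Classical fourth-order Runge-Kutta step; the subgrid term is held fixed over the step."""
    f = lambda y: lorenz96_tendency(y, forcing, subgrid)
    k1 = f(x)
    k2 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * dt * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + dt * ki for xi, ki in zip(x, k3)])
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def integrate(x0, dt, n_steps, forcing, closure=None, memory=3):
    """Advance the state, passing the last `memory` states to the closure at every step."""
    history = deque([list(x0)], maxlen=memory)
    x = list(x0)
    for _ in range(n_steps):
        subgrid = closure(list(history)) if closure is not None else None
        x = rk4_step(x, dt, forcing, subgrid)
        history.append(x)
    return x

def toy_closure(history):
    """Hypothetical stand-in for a trained network: damping on the time-averaged state."""
    n = len(history[0])
    return [-0.1 * sum(h[k] for h in history) / len(history) for k in range(n)]

x0 = [8.0] * 8
x0[0] += 0.01  # small perturbation away from the equilibrium X_k = F
print(integrate(x0, dt=0.01, n_steps=100, forcing=8.0, closure=toy_closure))
```

An instantaneous parameterization corresponds to `memory=1`, where the closure sees only the current state.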

Model-Order-Reduction Approach for Structural Health Monitoring of Large Deployed Structures With Localized Operational Excitations

We present a simulation-based classification approach for large deployed structures with localized operational excitations. The method extends the two-level Port-Reduced Reduced-Basis Component (PR-RBC) technique to provide faster solution estimation to the hyperbolic partial differential equation of time-domain elastodynamics with a moving load. Time-domain correlation function-based features are built in order to train classifiers such as artificial neural networks and perform damage detection. The method is tested on a bridge example with a moving vehicle (playing the role of a digital twin) in order to detect the existence of cracks. This problem has 45 parameters and shows the merits of the two-level PR-RBC approach and of the correlation function-based features in the context of operational excitations, other nuisance parameters and added noise. The quality of the classification task is enhanced by the sufficiently large synthetic training dataset and the accuracy of the numerical solutions, reaching test classification errors below 0.1% for a disjoint training set of size 7 × 10^3 and test set of size 3 × 10^3.

A Certified Two-Step Port-Reduced Reduced-Basis Component Method for Wave Equation and Time Domain Elastodynamic PDE

arXiv (Cornell University), Feb 25, 2020

We present a two-level parameterized Model Order Reduction (pMOR) technique for the linear hyperbolic Partial Differential Equation (PDE) of time-domain elastodynamics. In order to approximate the frequency-domain PDE, we take advantage of the Port-Reduced Reduced-Basis Component (PR-RBC) method to develop (in the offline stage) reduced bases for subdomains; the latter are then assembled (in the online stage) to form the global domains of interest. The PR-RBC approach reduces the effective dimensionality of the parameter space and also provides flexibility in topology and geometry. In the online stage, for each query, we consider a given parameter value and associated global domain. In the first level of reduction, the PR-RBC reduced bases are used to approximate the frequency-domain solution at selected frequencies. In the second level of reduction, these instantiated PR-RBC approximations are used as surrogate truth solutions in a Strong Greedy approach to identify a reduced basis space; the PDE of time-domain elastodynamics is then projected on this reduced space. We provide a numerical example to demonstrate the computational capability and assess the performance of the proposed two-level approach.

A two-step port-reduced reduced-basis component method for time domain elastodynamic PDE with application to structural health monitoring

Curved beam based model for piston-ring designs in internal combustion engines

The behavior of a piston ring is closely tied to oil consumption, friction, wear and blow-by in internal combustion engines. This behavior varies along the ring's circumference, and determining these variations is essential for developing ring-packs that achieve the desired sealing and conformability. A study based on a straight beam model had already been developed, but it does not consider the lubrication sub-models, the tip gap effects, or the characterization of the ring free shape from an arbitrary final closed shape. In this work, three numerical curved-beam-based models were developed to study the performance of the piston ring-pack. The conformability model characterizes the behavior of the ring within the engine. It adopts the curved beam model and accounts for ring-bore and ring-groove interactions, which include asperity and lubrication forces. Gas forces are also included, along with inertia and the initial ring tangential load. The model allows for thermal distortion of the bore and of the upper and lower groove flanks, and it accounts for the thermal expansion of the ring and the temperature gradient from inner diameter (ID) to outer diameter (OD). The piston secondary motion, the variation of oil viscosity on the liner with temperature, the presence of fuel, and the different hydrodynamic cases (partially and fully flooded) are considered as well. This model reveals the ring position relative to the groove as a function of friction, inertia and gas pressures. It also characterizes the effect of non-uniform oil distribution on the liner and groove flanks. Finally, the ring gap position within a distorted bore reveals the sealing performance of the ring.
Using the curved beam model, we also developed a module that computes the static twist under a fixed ID or OD constraint. The static twist is an experimental characterization of the ring in which the user taps on the ring until there is a minimum clearance between the ring's lowest point and the lower plate all along the ring's circumference, without any force contact. Our last model includes four sub-models that relate the ring free shape, its final shape when subjected to a constant radial pressure (called the ovality), and the force distribution in a circular bore. Given any one of these distributions, the model determines the other two. This tool is useful because rings are characterized experimentally by measuring their ovality, which is more accurate than measuring the free shape or the force distribution in a circular bore. A model that takes the ovality as an input is therefore better matched to the experiments used to characterize the ring.

Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks

arXiv (Cornell University), Feb 14, 2023

Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number of objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augment the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained multi-fidelity optimization task involving shape optimization of rotor blades in turbo-machinery.

Highlights
• Development of a bootstrapped Randomized Prior Network (RPN) approach for Bayesian Optimization (BO).
• Extension of the proposed RPN-BO framework to the most general case of constrained multi-fidelity optimization.
• Formulation of appropriate re-parametrizations for evaluating common acquisition functions via Monte Carlo approximation, including parallel multi-point selection criteria for constrained and multi-fidelity optimization.
• Test of the proposed RPN-BO approach against state-of-the-art methods and demonstration of its superior performance across several challenging BO tasks with high-dimensional outputs.
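
As a simplified illustration of a Monte Carlo acquisition function evaluated from an ensemble, the sketch below averages the improvement over ensemble members to estimate a single-point expected improvement for a minimization problem. It is deliberately much narrower than the paper's method: it does not train randomized prior networks, and it does not cover the multi-point, constrained or multi-fidelity criteria. The function names and the prediction values are made up for this example.

```python
def mc_expected_improvement(ensemble_predictions, best_so_far):
    """Sample-average estimate of expected improvement (minimization).

    ensemble_predictions[m][i] is the prediction of ensemble member m at candidate i;
    each member is treated as one sample from the surrogate's predictive distribution.
    """
    n_members = len(ensemble_predictions)
    n_candidates = len(ensemble_predictions[0])
    return [
        sum(max(best_so_far - member[i], 0.0) for member in ensemble_predictions) / n_members
        for i in range(n_candidates)
    ]

def select_candidate(ensemble_predictions, best_so_far):
    """Return the index of the candidate with the largest estimated improvement, and all scores."""
    scores = mc_expected_improvement(ensemble_predictions, best_so_far)
    return max(range(len(scores)), key=scores.__getitem__), scores

# Hypothetical predictions from three ensemble members at three candidate designs.
predictions = [
    [1.0, 0.2, 0.9],
    [1.2, 0.6, 0.3],
    [0.8, 0.4, 0.7],
]
index, scores = select_candidate(predictions, best_so_far=0.5)
print(index, scores)
```

A candidate scores highly when several ensemble members predict a value below the current best, so disagreement within the ensemble feeds directly into exploration.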

A Two-Level Parameterized Model-Order Reduction Approach for Time-Domain Elastodynamics

arXiv (Cornell University), Feb 25, 2020

We present a two-level parameterized Model Order Reduction (pMOR) technique for the linear hyperbolic Partial Differential Equation (PDE) of time-domain elastodynamics. In order to approximate the frequency-domain PDE, we take advantage of the Port-Reduced Reduced-Basis Component (PR-RBC) method to develop (in the offline stage) reduced bases for subdomains; the latter are then assembled (in the online stage) to form the global domains of interest. The PR-RBC approach reduces the effective dimensionality of the parameter space and also provides flexibility in topology and geometry. In the online stage, for each query, we consider a given parameter value and associated global domain. In the first level of reduction, the PR-RBC reduced bases are used to approximate the frequency-domain solution at selected frequencies. In the second level of reduction, these instantiated PR-RBC approximations are used as surrogate truth solutions in a Strong Greedy approach to identify a reduced basis space; the PDE of time-domain elastodynamics is then projected on this reduced space. We provide a numerical example to demonstrate the computational capability and assess the performance of the proposed two-level approach.

ClimSim: An open large-scale dataset for training high-resolution physics emulators in hybrid multi-scale climate simulators

arXiv (Cornell University), Jun 14, 2023

Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of a lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring.

Gaussian processes meet NeuralODEs: a Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data

Philosophical Transactions of the Royal Society A, Jun 20, 2022

This paper presents a machine learning framework (GP-NODE) for Bayesian systems identification from partial, noisy and irregular observations of nonlinear dynamical systems. The proposed method takes advantage of recent developments in differentiable programming to propagate gradient information through ordinary differential equation solvers and perform Bayesian inference with respect to unknown model parameters using Hamiltonian Monte Carlo sampling and Gaussian Process priors over the observed system states. This allows us to exploit temporal correlations in the observed data, and efficiently infer posterior distributions over plausible models with quantified uncertainty. Moreover, the use of sparsity-promoting priors such as the Finnish Horseshoe for free model parameters enables the discovery of interpretable and parsimonious representations for the underlying latent dynamics. A series of numerical studies is presented to demonstrate the effectiveness of the proposed GP-NODE method, including predator-prey systems, systems biology, and a 50-dimensional human motion dynamical system. Taken together, our findings put forth a novel, flexible and robust workflow for data-driven model discovery under uncertainty.

Curved Beam Based Model for Piston-Ring Designs in Internal Combustion Engines: Closed Shape Within a Flexible Band, Free-Shape and Force in Circular Bore Study

SAE technical paper series, Apr 3, 2018

Piston ring behavior is inherently tied to oil consumption, friction, wear and blow-by in internal combustion engines. This behavior varies along the ring's circumference, and determining these variations is of utmost importance for developing ring-packs that achieve the desired sealing and conformability performance. A model based on straight beams was previously developed, but it does not consider lubrication sub-models, tip gap effects, or the characterization of the ring free shape from a given final closed shape. In this work, three numerical curved-beam-based models were developed to study the performance of the piston ring-pack. The conformability model characterizes the behavior of the ring within the engine. In this model, the curved beam formulation accounts for ring-bore and ring-groove interactions, which include asperity and lubrication forces. Gas forces are also included in the model, along with inertia and the initial ring tangential load. The model also allows for thermal distortion of the bore and of the upper and lower groove flanks, and takes into account the thermal expansion of the ring and the temperature gradient from the inner diameter (ID) to the outer diameter (OD). The piston secondary motion, the variation of oil viscosity on the liner with temperature, the presence of fuel, and the different hydrodynamic cases (partially and fully flooded) are considered as well. This model reveals the ring position relative to the groove as a function of friction, inertia and gas pressures. It also characterizes the effect of non-uniform oil distribution on the liner and groove flanks. Finally, the ring gap position within a distorted bore also reveals the sealing performance of the ring.
Using the curved beam model, we also developed a module that computes the ring twist under a fixed ID or OD constraint. The static twist is an experimental characterization of the ring in which the user taps on the ring until there is a minimum clearance between the ring's lowest point and the lower plate all along the ring's circumference, but without any contact force. Our last model includes four sub-models that relate the ring free shape, its final shape when subjected to a constant radial pressure (this final shape is called ovality) and the force distribution in a circular bore. Knowing one of these distributions, the model determines the other two. This tool is useful because the ring is characterized by measuring its ovality, which is more accurate than measuring its free shape or its force distribution in a circular bore. A model that takes the ovality as an input is therefore more convenient given the experiments carried out to characterize the ring.

A two-level parameterized Model-Order Reduction approach for time-domain elastodynamics

Computer Methods in Applied Mechanics and Engineering, Nov 1, 2021

We present a two-level parameterized Model Order Reduction (pMOR) technique for the linear hyperbolic Partial Differential Equation (PDE) of time-domain elastodynamics. In order to approximate the frequency-domain PDE, we take advantage of the Port-Reduced Reduced-Basis Component (PR-RBC) method to develop (in the offline stage) reduced bases for subdomains; the latter are then assembled (in the online stage) to form the global domains of interest. The PR-RBC approach reduces the effective dimensionality of the parameter space and also provides flexibility in topology and geometry. In the online stage, for each query, we consider a given parameter value and associated global domain. In the first level of reduction, the PR-RBC reduced bases are used to approximate the frequency-domain solution at selected frequencies. In the second level of reduction, these instantiated PR-RBC approximations are used as surrogate truth solutions in a Strong Greedy approach to identify a reduced basis space; the PDE of time-domain elastodynamics is then projected on this reduced space. We provide a numerical example to demonstrate the computational capability and assess the performance of the proposed two-level approach.

Curved Beam Based Model for Piston-Ring Designs in Internal Combustion Engines: Working Engine Conditions Study

SAE technical paper series, Apr 3, 2018

Bayesian differential programming for robust systems identification under uncertainty

Proceedings of The Royal Society A: Mathematical, Physical and Engineering Sciences, Nov 1, 2020

This paper presents a machine learning framework for Bayesian systems identification from noisy, sparse and irregular observations of nonlinear dynamical systems. The proposed method takes advantage of recent developments in differentiable programming to propagate gradient information through ordinary differential equation solvers and perform Bayesian inference with respect to unknown model parameters using Hamiltonian Monte Carlo. This allows us to efficiently infer posterior distributions over plausible models with quantified uncertainty, while the use of sparsity-promoting priors enables the discovery of interpretable and parsimonious representations for the underlying latent dynamics. A series of numerical studies is presented to demonstrate the effectiveness of the proposed methods, including nonlinear oscillators, predator-prey systems, chaotic dynamics and systems biology. Taken all together, our findings put forth a novel, flexible and robust workflow for data-driven model discovery under uncertainty. All codes and data accompanying this manuscript are available at https://github.com/PredictiveIntelligenceLab/BayesianDifferentiableProgramming.
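
The Hamiltonian Monte Carlo sampling mentioned in this abstract can be illustrated with a generic sketch. The code below is not taken from the paper or its repository; it is a minimal textbook-style leapfrog integrator and accept/reject step, assuming an identity mass matrix. The function names (`leapfrog`, `hmc_step`) and the standard-normal target used in the usage note are illustrative assumptions.

```python
import numpy as np

def leapfrog(q, p, grad_log_prob, step_size, n_steps):
    """Integrate Hamiltonian dynamics with the leapfrog scheme.

    q: position (model parameters), p: auxiliary momentum.
    grad_log_prob(q) returns the gradient of the log target density.
    Assumes an identity mass matrix (an illustrative simplification).
    """
    q = np.array(q, dtype=float)
    p = np.array(p, dtype=float)
    p = p + 0.5 * step_size * grad_log_prob(q)      # initial half step in momentum
    for i in range(n_steps):
        q = q + step_size * p                       # full step in position
        if i < n_steps - 1:
            p = p + step_size * grad_log_prob(q)    # full step in momentum
    p = p + 0.5 * step_size * grad_log_prob(q)      # final half step in momentum
    return q, p

def hamiltonian(q, p, log_prob):
    """Potential energy (-log density) plus kinetic energy."""
    return -log_prob(q) + 0.5 * np.dot(p, p)

def hmc_step(q, log_prob, grad_log_prob, step_size, n_steps, rng):
    """One HMC transition: resample momentum, integrate, then accept or reject."""
    q = np.asarray(q, dtype=float)
    p0 = rng.standard_normal(q.shape)
    q_new, p_new = leapfrog(q, p0, grad_log_prob, step_size, n_steps)
    h_old = hamiltonian(q, p0, log_prob)
    h_new = hamiltonian(q_new, p_new, log_prob)
    # Metropolis correction for the discretization error of the integrator.
    if np.log(rng.uniform()) < h_old - h_new:
        return q_new
    return q
```

As a usage example, a standard normal target can be sampled with `log_prob = lambda q: -0.5 * np.dot(q, q)`, `grad_log_prob = lambda q: -q` and `rng = np.random.default_rng(0)`, calling `hmc_step` repeatedly. In the setting described by the abstract, the gradient would instead come from differentiating through an ODE solver, which this sketch does not attempt.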

History-Based, Bayesian, Closure for Stochastic Parameterization: Application to Lorenz '96

arXiv (Cornell University), Oct 26, 2022

Physical parameterizations are used as representations of unresolved subgrid processes within weather and global climate models or coarse-scale turbulent models, whose resolutions are too coarse to resolve small-scale processes. These parameterizations are typically grounded on physically-based, yet empirical, representations of the underlying small-scale processes. Machine learning-based parameterizations have recently been proposed as an alternative and have shown great promise to reduce uncertainties associated with small-scale processes. Yet, those approaches still show some important mismatches that are often attributed to stochasticity in the considered process. This stochasticity can be due to noisy data, unresolved variables or simply to the inherent chaotic nature of the process. To address these issues, we develop a new type of parameterization (closure) which is based on a Bayesian formalism for neural networks, to account for uncertainty quantification, and includes memory, to account for the non-instantaneous response of the closure. To overcome the curse of dimensionality of Bayesian techniques in high-dimensional spaces, the Bayesian strategy is based on a Hamiltonian Monte Carlo Markov Chain sampling strategy that takes advantage of the gradients of the likelihood function and kinetic energy with respect to the parameters to accelerate the sampling process. We apply the proposed Bayesian history-based parameterization to the Lorenz '96 model in the presence of noisy and sparse data, similar to satellite observations, and show its capacity to produce skillful forecasts of the resolved variables while returning trustworthy uncertainty quantifications for different sources of error. This approach paves the way for the use of Bayesian approaches for closure problems.

Keywords: Stochastic Parameterization • Bayesian Surrogate Models • Uncertainty Quantification • Online Testing • Chaotic Dynamical System • Hamiltonian Monte Carlo Markov Chain • Neural Networks

Lead Paragraph: Climate models involve physical processes with different scales. Given the available computational resources, small-scale processes are not resolved in global climate models but rather represented by parameterization schemes. Machine learning has been recently used to improve existing parameterization approaches, yet those methods still show some important mismatches that are often attributed to stochasticity in the considered processes. This stochasticity can be due to noisy data, unresolved physical variables or simply to the inherent chaotic nature of the process. In this work, we develop a probabilistic parameterization scheme capable of producing skillful forecasts of the resolved physical variables while returning trustworthy uncertainty quan

Gaussian processes meet NeuralODEs: A Bayesian framework for learning the dynamics of partially observed systems from scarce and noisy data

arXiv (Cornell University), Mar 4, 2021

This paper presents a machine learning framework (GP-NODE) for Bayesian systems identification from partial, noisy and irregular observations of nonlinear dynamical systems. The proposed method takes advantage of recent developments in differentiable programming to propagate gradient information through ordinary differential equation solvers and perform Bayesian inference with respect to unknown model parameters using Hamiltonian Monte Carlo sampling and Gaussian Process priors over the observed system states. This allows us to exploit temporal correlations in the observed data, and efficiently infer posterior distributions over plausible models with quantified uncertainty. Moreover, the use of sparsity-promoting priors such as the Finnish Horseshoe for free model parameters enables the discovery of interpretable and parsimonious representations for the underlying latent dynamics. A series of numerical studies is presented to demonstrate the effectiveness of the proposed GP-NODE method, including predator-prey systems, systems biology, and a 50-dimensional human motion dynamical system. Taken together, our findings put forth a novel, flexible and robust workflow for data-driven model discovery under uncertainty.

COVID-19 dynamics across the US: A deep learning study of human mobility and social behavior

medRxiv (Cold Spring Harbor Laboratory), Sep 23, 2020

Memory-based parameterization with differentiable solver: Application to Lorenz ’96

Chaos, Jul 1, 2023

Physical parameterizations (or closures) are used as representations of unresolved subgrid processes within weather and global climate models or coarse-scale turbulent models, whose resolutions are too coarse to resolve small-scale processes. These parameterizations are typically grounded on physically based, yet empirical, representations of the underlying small-scale processes. Machine learning-based parameterizations have recently been proposed as an alternative solution and have shown great promise to reduce uncertainties associated with the parameterization of small-scale processes. Yet, those approaches still show some important mismatches that are often attributed to the stochasticity of the considered process. This stochasticity can be due to coarse temporal resolution, unresolved variables, or simply to the inherent chaotic nature of the process. To address these issues, we propose a new type of parameterization (closure), which is built using memory-based neural networks, to account for the non-instantaneous response of the closure and to enhance its stability and prediction accuracy. We apply the proposed memory-based parameterization, with differentiable solver, to the Lorenz ’96 model in the presence of a coarse temporal resolution and show its capacity to produce skillful forecasts of the resolved variables over a long time horizon compared to instantaneous parameterizations. This approach paves the way for the use of memory-based parameterizations for closure problems.
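
For readers unfamiliar with the test bed named in this abstract, the sketch below integrates the commonly used single-scale Lorenz '96 system, dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F with periodic indices. The abstract does not state the configuration used in the paper, so the number of variables, the forcing F = 8, the time step and the RK4 integrator are all illustrative assumptions, and no closure term is included.

```python
import numpy as np

def lorenz96_tendency(x, forcing=8.0):
    """Right-hand side of the single-scale Lorenz '96 system (periodic in k)."""
    # np.roll(x, -1)[k] = x[k+1], np.roll(x, 2)[k] = x[k-2], np.roll(x, 1)[k] = x[k-1]
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate(x0, dt, n_steps, forcing=8.0):
    """Return the trajectory as an array of shape (n_steps + 1, n_variables)."""
    x0 = np.asarray(x0, dtype=float)
    traj = np.empty((n_steps + 1, x0.size))
    traj[0] = x0
    for n in range(n_steps):
        traj[n + 1] = rk4_step(traj[n], dt, forcing)
    return traj
```

A typical usage is to perturb the uniform equilibrium X_k = F slightly, for example `x0 = np.full(8, 8.0); x0[0] += 0.01`, and call `integrate(x0, 0.01, 1000)`. A learned parameterization of the kind the abstract describes would add a term to `lorenz96_tendency` that depends on the current and past states; that term is deliberately left out here.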

Model-Order-Reduction Approach for Structural Health Monitoring of Large Deployed Structures With Localized Operational Excitations

We present a simulation-based classification approach for large deployed structures with localized operational excitations. The method extends the two-level Port-Reduced Reduced-Basis Component (PR-RBC) technique to provide faster solution estimation for the hyperbolic partial differential equation of time-domain elastodynamics with a moving load. Time-domain correlation function-based features are built in order to train classifiers such as artificial neural networks and perform damage detection. The method is tested on a bridge example with a moving vehicle (playing the role of a digital twin) in order to detect the existence of cracks. This problem has 45 parameters and shows the merits of the two-level PR-RBC approach and of the correlation function-based features in the context of operational excitations, other nuisance parameters and added noise. The quality of the classification task is enhanced by the sufficiently large synthetic training dataset and the accuracy of the numerical solutions, reaching test classification errors below 0.1% for a disjoint training set of size 7 × 10^3 and test set of size 3 × 10^3.

A Certified Two-Step Port-Reduced Reduced-Basis Component Method for Wave Equation and Time Domain Elastodynamic PDE

arXiv (Cornell University), Feb 25, 2020

We present a two-level parameterized Model Order Reduction (pMOR) technique for the linear hyperbolic Partial Differential Equation (PDE) of time-domain elastodynamics. In order to approximate the frequency-domain PDE, we take advantage of the Port-Reduced Reduced-Basis Component (PR-RBC) method to develop (in the offline stage) reduced bases for subdomains; the latter are then assembled (in the online stage) to form the global domains of interest. The PR-RBC approach reduces the effective dimensionality of the parameter space and also provides flexibility in topology and geometry. In the online stage, for each query, we consider a given parameter value and associated global domain. In the first level of reduction, the PR-RBC reduced bases are used to approximate the frequency-domain solution at selected frequencies. In the second level of reduction, these instantiated PR-RBC approximations are used as surrogate truth solutions in a Strong Greedy approach to identify a reduced basis space; the PDE of time-domain elastodynamics is then projected on this reduced space. We provide a numerical example to demonstrate the computational capability and assess the performance of the proposed two-level approach.

A two-step port-reduced reduced-basis component method for time domain elastodynamic PDE with application to structural health monitoring

Curved beam based model for piston-ring designs in internal combustion engines

Piston ring behavior is inherently tied to oil consumption, friction, wear and blow-by in internal combustion engines. This behavior varies along the ring's circumference, and determining these variations is of utmost importance for developing ring-packs that achieve the desired sealing and conformability performance. A model based on straight beams was previously developed, but it does not consider lubrication sub-models, tip gap effects, or the characterization of the ring free shape from a given final closed shape. In this work, three numerical curved-beam-based models were developed to study the performance of the piston ring-pack. The conformability model characterizes the behavior of the ring within the engine. In this model, the curved beam formulation accounts for ring-bore and ring-groove interactions, which include asperity and lubrication forces. Gas forces are also included in the model, along with inertia and the initial ring tangential load. The model also allows for thermal distortion of the bore and of the upper and lower groove flanks, and takes into account the thermal expansion of the ring and the temperature gradient from the inner diameter (ID) to the outer diameter (OD). The piston secondary motion, the variation of oil viscosity on the liner with temperature, the presence of fuel, and the different hydrodynamic cases (partially and fully flooded) are considered as well. This model reveals the ring position relative to the groove as a function of friction, inertia and gas pressures. It also characterizes the effect of non-uniform oil distribution on the liner and groove flanks. Finally, the ring gap position within a distorted bore also reveals the sealing performance of the ring.
Using the curved beam model, we also developed a module that computes the ring twist under a fixed ID or OD constraint. The static twist is an experimental characterization of the ring in which the user taps on the ring until there is a minimum clearance between the ring's lowest point and the lower plate all along the ring's circumference, but without any contact force. Our last model includes four sub-models that relate the ring free shape, its final shape when subjected to a constant radial pressure (this final shape is called ovality) and the force distribution in a circular bore. Knowing one of these distributions, the model determines the other two. This tool is useful because the ring is characterized by measuring its ovality, which is more accurate than measuring its free shape or its force distribution in a circular bore. A model that takes the ovality as an input is therefore more convenient given the experiments carried out to characterize the ring.

Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks

arXiv (Cornell University), Feb 14, 2023

Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number of objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augment the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained multi-fidelity optimization task involving shape optimization of rotor blades in turbo-machinery.

Highlights
• Development of a bootstrapped Randomized Prior Network (RPN) approach for Bayesian Optimization (BO).
• Extension of the proposed RPN-BO framework to the most general case of constrained multi-fidelity optimization.
• Formulation of appropriate re-parametrizations for evaluating common acquisition functions via Monte Carlo approximation, including parallel multi-point selection criteria for constrained and multi-fidelity optimization.
• Test of the proposed RPN-BO approach against state-of-the-art methods and demonstration of its superior performance across several challenging BO tasks with high-dimensional outputs.

A Two-Level Parameterized Model-Order Reduction Approach for Time-Domain Elastodynamics

arXiv (Cornell University), Feb 25, 2020

We present a two-level parameterized Model Order Reduction (pMOR) technique for the linear hyperbolic Partial Differential Equation (PDE) of time-domain elastodynamics. In order to approximate the frequency-domain PDE, we take advantage of the Port-Reduced Reduced-Basis Component (PR-RBC) method to develop (in the offline stage) reduced bases for subdomains; the latter are then assembled (in the online stage) to form the global domains of interest. The PR-RBC approach reduces the effective dimensionality of the parameter space and also provides flexibility in topology and geometry. In the online stage, for each query, we consider a given parameter value and associated global domain. In the first level of reduction, the PR-RBC reduced bases are used to approximate the frequency-domain solution at selected frequencies. In the second level of reduction, these instantiated PR-RBC approximations are used as surrogate truth solutions in a Strong Greedy approach to identify a reduced basis space; the PDE of time-domain elastodynamics is then projected on this reduced space. We provide a numerical example to demonstrate the computational capability and assess the performance of the proposed two-level approach.

ClimSim: An open large-scale dataset for training high-resolution physics emulators in hybrid multi-scale climate simulators

arXiv (Cornell University), Jun 14, 2023

Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state. The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring.

Scalable Bayesian optimization with high-dimensional outputs using randomized prior networks

arXiv (Cornell University), Feb 14, 2023

Several fundamental problems in science and engineering consist of global optimization tasks involving unknown high-dimensional (black-box) functions that map a set of controllable variables to the outcomes of an expensive experiment. Bayesian Optimization (BO) techniques are known to be effective in tackling global optimization problems using a relatively small number of objective function evaluations, but their performance suffers when dealing with high-dimensional outputs. To overcome the major challenge of dimensionality, here we propose a deep learning framework for BO and sequential decision making based on bootstrapped ensembles of neural architectures with randomized priors. Using appropriate architecture choices, we show that the proposed framework can approximate functional relationships between design variables and quantities of interest, even in cases where the latter take values in high-dimensional vector spaces or even infinite-dimensional function spaces. In the context of BO, we augment the proposed probabilistic surrogates with re-parameterized Monte Carlo approximations of multiple-point (parallel) acquisition functions, as well as methodological extensions for accommodating black-box constraints and multi-fidelity information sources. We test the proposed framework against state-of-the-art methods for BO and demonstrate superior performance across several challenging tasks with high-dimensional outputs, including a constrained optimization task involving shape optimization of rotor blades in turbo-machinery.
