Zikai Xie - Academia.edu
Papers by Zikai Xie
Frontiers in artificial intelligence and applications, Sep 27, 2023
In this paper we propose DKIBO, a Bayesian optimization (BO) algorithm that accommodates domain knowledge to tune exploration in the search space. Bayesian optimization has recently emerged as a sample-efficient optimizer for many intractable scientific problems. While various existing BO frameworks allow the input of prior beliefs to accelerate the search by narrowing down the space, incorporating such knowledge is not always straightforward and can often introduce bias and lead to poor performance. Here we propose a simple approach to incorporate structural knowledge in the acquisition function by utilizing an additional deterministic surrogate model to enrich the approximation power of the Gaussian process. This surrogate is chosen according to structural information about the problem at hand and acts as a corrective term towards better-informed sampling. We empirically demonstrate the practical utility of the proposed method by successfully injecting domain knowledge in a materials design task. We further validate our method's performance across different experimental settings and through ablation analyses.
Chemical Science, Dec 31, 2023
We evaluate the effectiveness of fine-tuning GPT-3 for the prediction of electronic and functional properties of organic molecules. Our findings show that fine-tuned GPT-3 can successfully identify and distinguish between chemically meaningful patterns, and discern subtle differences among them, exhibiting robust performance in predicting molecular properties. We focus on assessing the fine-tuned models' resilience to information loss, resulting from the absence of atoms or chemical groups, and to noise that we introduce via random alterations in atomic identities. We discuss the challenges and limitations inherent to the use of GPT-3 in molecular machine-learning tasks and suggest potential directions for future research and improvements to address these issues.
Advances in Civil Engineering, Feb 3, 2022
The probabilistic seismic demand model (PSDM) is one of the critical components of performance-based earthquake engineering frameworks. The aim of this study is to propose a procedure to generate PSDMs for a typical regular continuous-girder bridge subjected to far- and near-fault ground motions (GMs) utilizing machine-learning methods. A series of nonlinear time history analyses (NTHAs) is carried out to calculate the damage caused by the far- and near-fault GMs for four different site conditions, and 21 seismic intensity measures (IMs) are considered. Subsequently, PSDMs are established for the IMs and engineering demand parameters based on the existing NTHA data using machine-learning methods, which include linear regression, Bayesian regression (BR), and a tree-based model. The results indicate that, judged by the coefficients of determination, random forest (RF) is the most suitable model to predict the longitudinal and transverse curvature at the bottom of the four piers. More specifically, the relative importance of each parameter in the model is evaluated, and peak ground velocity (PGV), peak spectral velocity (PSV), Arias intensity (AI), and Fajfar intensity (FI) are found to be the critical factors for the RF-based PSDM. Finally, all of these parameters, except AI, are correlated with velocity. The research results explore a new method for establishing the seismic demand model of continuous-girder bridges, which can provide suggestions for seismic damage prediction and seismic insurance risk evaluation.
26th European Conference on Artificial Intelligence ECAI 2023, 2023
In this paper we propose DKIBO, a Bayesian optimization (BO) algorithm that accommodates domain knowledge to tune exploration in the search space. Bayesian optimization has recently emerged as a sample-efficient optimizer for many intractable scientific problems. While various existing BO frameworks allow the input of prior beliefs to accelerate the search by narrowing down the space, incorporating such knowledge is not always straightforward and can often introduce bias and lead to poor performance. Here we propose a simple approach to incorporate structural knowledge in the acquisition function by utilizing an additional deterministic surrogate model to enrich the approximation power of the Gaussian process. This surrogate is chosen according to structural information about the problem at hand and acts as a corrective term towards better-informed sampling. We empirically demonstrate the practical utility of the proposed method by successfully injecting domain knowledge in a materials design task. We further validate our method's performance across different experimental settings and through ablation analyses.
We evaluate the effectiveness of fine-tuning GPT-3 for the prediction of electronic and functional properties of organic molecules. Our findings show that fine-tuned GPT-3 can successfully identify and distinguish between chemically meaningful patterns, and discern subtle differences among them, exhibiting robust performance in predicting molecular properties. We focus on assessing the fine-tuned models' resilience to information loss, resulting from the absence of atoms or chemical groups, and to noise that we introduce via random alterations in atomic identities. We discuss the challenges and limitations inherent to the use of GPT-3 in molecular machine-learning tasks and suggest potential directions for future research and improvements to address these issues.
Advances in Civil Engineering, 2022
The probabilistic seismic demand model (PSDM) is one of the critical components of performance-based earthquake engineering frameworks. The aim of this study is to propose a procedure to generate PSDMs for a typical regular continuous-girder bridge subjected to far- and near-fault ground motions (GMs) utilizing machine-learning methods. A series of nonlinear time history analyses (NTHAs) is carried out to calculate the damage caused by the far- and near-fault GMs for four different site conditions, and 21 seismic intensity measures (IMs) are considered. Subsequently, PSDMs are established for the IMs and engineering demand parameters based on the existing NTHA data using machine-learning methods, which include linear regression, Bayesian regression (BR), and a tree-based model. The results indicate that, judged by the coefficients of determination, random forest (RF) is the most suitable model to predict the longitudinal and transverse curvature at the bottom of the four piers. More specificall...