Calibrated Uncertainty Estimation Improves Bayesian Optimization
Related papers
Calibration Improves Bayesian Optimization
ArXiv, 2021
Bayesian optimization is a procedure for finding the global optimum of black-box functions and is useful in applications such as hyperparameter optimization. Uncertainty estimates over the shape of the objective function are instrumental in guiding the optimization process. However, these estimates can be inaccurate if the objective function violates assumptions made within the underlying model (e.g., Gaussianity). We propose a simple algorithm to calibrate the uncertainty of posterior distributions over the objective function as part of the Bayesian optimization process. We show that by improving the uncertainty estimates of the posterior distribution with calibration, Bayesian optimization makes better decisions and arrives at the global optimum in fewer steps. We show that this technique improves the performance of Bayesian optimization on standard benchmark functions and hyperparameter optimization tasks.
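The abstract does not spell out the calibration algorithm, so the following is only a generic sketch of quantile recalibration for Gaussian predictions, not the authors' method. It assumes a held-out set of predictive means, standard deviations, and observed values; the function names `normal_cdf` and `fit_recalibrator` are hypothetical.

```python
import math
from bisect import bisect_right


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fit_recalibrator(means, stds, observed):
    """Build a map from a nominal CDF level to its empirical frequency.

    Assumes every entry of `stds` is strictly positive.
    """
    # Predicted CDF value at each observation (probability integral transform).
    pit = sorted(
        normal_cdf((y - m) / s) for m, s, y in zip(means, stds, observed)
    )
    n = len(pit)

    def recalibrate(p):
        # Fraction of held-out points whose predicted CDF value is <= p.
        return bisect_right(pit, p) / n

    return recalibrate


# An overconfident model: tiny predicted std, widely spread observations.
recalibrate = fit_recalibrator(
    means=[0.0, 0.0, 0.0, 0.0],
    stds=[0.1, 0.1, 0.1, 0.1],
    observed=[-1.0, -0.5, 0.5, 1.0],
)
print(recalibrate(0.1))  # 0.5: the nominal 10% level actually covers half the data
```

For a well-calibrated model the returned map would be close to the identity; large deviations, as in the toy data above, indicate that the posterior's stated uncertainty is off.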
Bayesian Policy Optimization for Model Uncertainty
2019
Addressing uncertainty is critical for autonomous systems to robustly adapt to the real world. We formulate the problem of model uncertainty as a continuous Bayes-Adaptive Markov Decision Process (BAMDP), where an agent maintains a posterior distribution over latent model parameters given a history of observations and maximizes its expected long-term reward with respect to this belief distribution. Our algorithm, Bayesian Policy Optimization, builds on recent policy optimization algorithms to learn a universal policy that navigates the exploration-exploitation trade-off to maximize the Bayesian value function. To address challenges from discretizing the continuous latent parameter space, we propose a new policy network architecture that encodes the belief distribution independently from the observable state. Our method significantly outperforms algorithms that address model uncertainty without explicitly reasoning about belief distributions and is competitive with state-of-the-art P...
Bayesian Optimization Under Uncertainty
2017
We consider the problem of robust optimization, where the goal is to design a system that sustains a specified measure of performance under uncertainty. This problem is challenging because modeling a complex system under uncertainty can be expensive, and for most real-world problems robust optimization is not computationally viable. In this paper, we propose a Bayesian methodology to efficiently solve a class of robust optimization problems that arise in engineering design under uncertainty. The central idea is to use Gaussian process models of loss functions (or robustness metrics) together with appropriate acquisition functions to guide the search for a robust optimal solution. Numerical studies on a test problem are presented to demonstrate the efficacy of the proposed approach.
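The abstract refers to "appropriate acquisition functions" without naming one. As an illustration only, the sketch below evaluates expected improvement (EI), one common choice, for a minimization problem, assuming the Gaussian process posterior at a candidate point is summarized by a mean `mu` and standard deviation `sigma`. The function name and argument names are assumptions, not taken from the paper.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for minimization under a Gaussian posterior N(mu, sigma^2).

    `best` is the lowest loss observed so far.
    """
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


# Score two hypothetical candidates against a current best loss of 1.0.
print(expected_improvement(mu=0.8, sigma=0.1, best=1.0))  # confident, modest gain
print(expected_improvement(mu=1.2, sigma=1.0, best=1.0))  # uncertain, possible gain
```

The two terms make the exploration-exploitation trade-off explicit: the first rewards a low predicted loss, the second rewards high posterior uncertainty.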
Lifelong Bayesian Optimization
2019
Automatic Machine Learning (Auto-ML) systems tackle the problem of automating the design of prediction models or pipelines for data science. In this paper, we present Lifelong Bayesian Optimization (LBO), an online, multitask Bayesian optimization (BO) algorithm designed to solve the problem of model selection for datasets arriving and evolving over time. To be suitable for "lifelong" Bayesian optimization, an algorithm needs to scale with the ever-increasing number of acquisitions and should be able to leverage past optimizations in learning the current best model. In LBO, we exploit the correlation between black-box functions by using components of previously learned functions to speed up the learning process for newly arriving datasets. Experiments on real and synthetic data show that LBO outperforms standard BO algorithms applied repeatedly on the data.
Pareto-efficient Acquisition Functions for Cost-Aware Bayesian Optimization
ArXiv, 2020
Bayesian optimization (BO) is a popular method to optimize expensive black-box functions. It efficiently tunes machine learning algorithms under the implicit assumption that hyperparameter evaluations cost approximately the same. In reality, the cost of evaluating different hyperparameters, be it in terms of time, dollars or energy, can span several orders of magnitude of difference. While a number of heuristics have been proposed to make BO cost-aware, none of these have been proven to work robustly. In this work, we reformulate cost-aware BO in terms of Pareto efficiency and introduce the cost Pareto Front, a mathematical object allowing us to highlight the shortcomings of commonly used acquisition functions. Based on this, we propose a novel Pareto-efficient adaptation of the expected improvement. On 144 real-world black-box function optimization problems we show that our Pareto-efficient acquisition functions significantly outperform previous solutions, bringing up to 50% speed-...
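The abstract introduces a cost Pareto front but does not define it here. The sketch below shows only the generic notion of Pareto efficiency it builds on, under the assumption that each candidate is a `(cost, value)` pair where lower cost and higher acquisition value are both preferred. The function name and the data are hypothetical and this is not the paper's acquisition function.

```python
def pareto_front(candidates):
    """Return the non-dominated (cost, value) pairs.

    A pair is dominated if another pair has cost <= its cost and
    value >= its value, with at least one strict inequality.
    """
    front = []
    best_value = float("-inf")
    # Sort by increasing cost; among equal costs, highest value first.
    for cost, value in sorted(candidates, key=lambda c: (c[0], -c[1])):
        # Keep a point only if it beats every cheaper point on value.
        if value > best_value:
            front.append((cost, value))
            best_value = value
    return front


candidates = [(1, 0.2), (2, 0.1), (3, 0.5), (3, 0.4), (5, 0.5)]
print(pareto_front(candidates))  # [(1, 0.2), (3, 0.5)]
```

Here `(5, 0.5)` is dropped because `(3, 0.5)` offers the same value at lower cost, which is the kind of trade-off a cost-unaware acquisition function cannot see.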
An Empirical Study of Assumptions in Bayesian Optimisation
2020
In this work we rigorously analyse assumptions inherent to black-box optimisation hyper-parameter tuning tasks. Our results on the Bayesmark benchmark indicate that heteroscedasticity and non-stationarity pose significant challenges for black-box optimisers. Based on these findings, we propose a Heteroscedastic and Evolutionary Bayesian Optimisation solver (HEBO). HEBO performs non-linear input and output warping, admits exact marginal log-likelihood optimisation and is robust to the values of learned parameters. We demonstrate HEBO’s empirical efficacy on the NeurIPS 2020 Black-Box Optimisation challenge, where HEBO placed first. Upon further analysis, we observe that HEBO significantly outperforms existing black-box optimisers on 108 machine learning hyperparameter tuning tasks comprising the Bayesmark benchmark. Our findings indicate that the majority of hyper-parameter tuning tasks exhibit heteroscedasticity and non-stationarity, multi-objective acquisition ensembles with Pareto...
Adaptive Bayesian optimization for dynamic problems
2018
This thesis studies the problem of tracking the extremum of an objective function that is latent, noisy and expensive to evaluate. This problem is notable because many large-scale learning systems with complex models operating on non-stationary data have meta-problems whose solutions require the tracking of an evolving extremum. We start by describing dynamic optimization problems and model them using spatiotemporal Gaussian process priors. We construct an intelligent search mechanism that uses the learnt insights to skillfully guide the search by dynamically modifying the feasible search region as a device to keep track of the evolution. We also show that this mechanism induces a natural approximation scheme for cases where the number of samples for the model becomes too expensive for inference. We test the resulting method on synthetic and real-world problems. In the next part of the thesis, we demonstrate the utility of the method on pertinent real-world meta-problems occurring i...
HEBO: An Empirical Study of Assumptions in Bayesian Optimisation
Journal of Artificial Intelligence Research
In this work we rigorously analyse assumptions inherent to black-box optimisation hyper-parameter tuning tasks. Our results on the Bayesmark benchmark indicate that heteroscedasticity and non-stationarity pose significant challenges for black-box optimisers. Based on these findings, we propose a Heteroscedastic and Evolutionary Bayesian Optimisation solver (HEBO). HEBO performs non-linear input and output warping, admits exact marginal log-likelihood optimisation and is robust to the values of learned parameters. We demonstrate HEBO’s empirical efficacy on the NeurIPS 2020 Black-Box Optimisation challenge, where HEBO placed first. Upon further analysis, we observe that HEBO significantly outperforms existing black-box optimisers on 108 machine learning hyperparameter tuning tasks comprising the Bayesmark benchmark. Our findings indicate that the majority of hyper-parameter tuning tasks exhibit heteroscedasticity and non-stationarity, multiobjective acquisition ensembles with Pareto ...
Bayesian optimization with informative parametric models via sequential Monte Carlo
Data-Centric Engineering, 2022
Bayesian optimization (BO) has been a successful approach to optimize expensive functions whose prior knowledge can be specified by means of a probabilistic model. Due to their expressiveness and tractable closed-form predictive distributions, Gaussian process (GP) surrogate models have been the default go-to choice when deriving BO frameworks. However, as nonparametric models, GPs offer very little in terms of interpretability and informative power when applied to model complex physical phenomena in scientific applications. In addition, the Gaussian assumption also limits the applicability of GPs to problems where the variables of interest may highly deviate from Gaussianity. In this article, we investigate an alternative modeling framework for BO which makes use of sequential Monte Carlo (SMC) to perform Bayesian inference with parametric models. We propose a BO algorithm to take advantage of SMC’s flexible posterior representations and provide methods to compensate for bias in th...
Bayesian Optimization Under Uncertainty for Chance Constrained Problems
2019
Chance constraints are an important tool for modeling reliability in decision making in the presence of uncertainties. A chance constraint enforces that the constraint is satisfied with probability at least 1 − α (0 < α < 1). In addition, we consider that the objective function is affected by uncertainties. This problem is challenging since modeling a complex system under uncertainty can be expensive, and for most real-world problems stochastic optimization will not be computationally viable. In this talk, we propose a Bayesian methodology to efficiently solve this class of problems. The central idea is to use Gaussian Process (GP) models [1] together with appropriate acquisition functions to guide the search for an optimal solution. We first show that by specifying a GP prior on the objective function, the loss function becomes tractable [2]. Similarly, using GP models for the constraints, the probability of constraint satisfaction can be efficiently approximated. Subsequently, w...
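As a rough illustration of how a GP model of a constraint yields a satisfaction probability, the sketch below assumes the constraint is written as g(x) ≤ 0 and that the GP posterior for g at a candidate point is Gaussian with mean `mu_g` and standard deviation `sigma_g`. The sign convention and the function names are assumptions, not details given in the abstract.

```python
import math


def prob_constraint_satisfied(mu_g, sigma_g):
    """P(g(x) <= 0) when g(x) ~ N(mu_g, sigma_g^2), with sigma_g > 0."""
    z = (0.0 - mu_g) / sigma_g
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def satisfies_chance_constraint(mu_g, sigma_g, alpha):
    """Check the chance constraint P(g(x) <= 0) >= 1 - alpha."""
    return prob_constraint_satisfied(mu_g, sigma_g) >= 1.0 - alpha


# Two hypothetical candidates checked at alpha = 0.05.
print(satisfies_chance_constraint(mu_g=-2.0, sigma_g=1.0, alpha=0.05))  # True
print(satisfies_chance_constraint(mu_g=-1.0, sigma_g=1.0, alpha=0.05))  # False
```

A posterior mean two standard deviations inside the feasible region gives a satisfaction probability of about 0.977, which clears the 0.95 threshold; one standard deviation gives about 0.841, which does not.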