A Constrained Optimization Approach to Bilevel Optimization with Multiple Inner Minima
Related papers
Bilevel Optimization and Machine Learning
2008
We examine the interplay of optimization and machine learning. Great progress has been made in machine learning by cleverly reducing machine learning problems to convex optimization problems with one or more hyper-parameters. The availability of powerful convex-programming theory and algorithms has enabled a flood of new research in machine learning models and methods. But many of the steps necessary for successful machine learning models fall outside of the convex machine learning paradigm. Thus we now propose framing machine learning problems as Stackelberg games. The resulting bilevel optimization problem allows for efficient systematic search of large numbers of hyper-parameters. We discuss recent progress in solving these bilevel problems and the many interesting optimization challenges that remain. Finally, we investigate the intriguing possibility of novel machine learning models enabled by bilevel programming.
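As a rough illustration of the Stackelberg view shared by the papers in this list (our notation, not any one paper's): the hyperparameters act as the leader and the model weights as the follower.

    % Generic bilevel formulation of hyperparameter tuning; the symbols
    % (\lambda for hyperparameters, w for weights, L_val / L_train for
    % validation and training losses) are illustrative.
    \begin{aligned}
    \min_{\lambda}\;  & L_{\mathrm{val}}\bigl(w^{*}(\lambda)\bigr) \\
    \text{s.t.}\;\;   & w^{*}(\lambda) \in \arg\min_{w}\; L_{\mathrm{train}}(w, \lambda)
    \end{aligned}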
A Gradient-based Bilevel Optimization Approach for Tuning Hyperparameters in Machine Learning
ArXiv, 2020
Hyperparameter tuning is an active area of research in machine learning, where the aim is to identify the optimal hyperparameters that provide the best performance on the validation set. Hyperparameter tuning is often performed with naive techniques such as random search and grid search. However, these methods seldom lead to an optimal set of hyperparameters and often become very expensive. In this paper, we propose a bilevel solution method for solving the hyperparameter optimization problem that does not suffer from the drawbacks of the earlier studies. The proposed method is general and can be easily applied to any class of machine learning algorithms. The idea is based on the approximation of the lower-level optimal value function mapping, which is an important mapping in bilevel optimization and helps in reducing the bilevel problem to a single-level constrained optimization task. The single-level constrained optimization problem is solved using the augmented Lagrangian method.
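As a sketch of the reduction this abstract describes (illustrative notation, reusing the symbols from the formulation above): with the lower-level value function in hand, the nested problem collapses to a single level.

    % Define the lower-level optimal value function
    %   v(\lambda) = \min_{w} L_{\mathrm{train}}(w, \lambda).
    % The bilevel problem is then equivalent to the single-level program
    \begin{aligned}
    \min_{\lambda,\, w}\; & L_{\mathrm{val}}(w) \\
    \text{s.t.}\;\;       & L_{\mathrm{train}}(w, \lambda) \le v(\lambda),
    \end{aligned}
    % where v is approximated in practice and the constraint is handled,
    % per the abstract, with an augmented Lagrangian scheme.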
An overview of bilevel optimization
Annals of Operations Research, 2007
This paper is devoted to bilevel optimization, a branch of mathematical programming of both practical and theoretical interest. Starting with a simple example, we proceed towards a general formulation. We then present fields of application, focus on solution approaches, and make the connection with MPECs (Mathematical Programs with Equilibrium Constraints). It is an updated version of the survey of Colson et al. (2005b) that originally appeared in 4OR. Keywords: Bilevel programming • Mathematical programs with equilibrium constraints • Nonlinear programming • Optimal pricing
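To make the MPEC connection concrete (our illustration, standard in the literature): replacing a lower-level problem min_y g(x, y) s.t. h(x, y) ≤ 0 by its KKT conditions yields a mathematical program with equilibrium (complementarity) constraints.

    \begin{aligned}
    \min_{x,\, y,\, \mu}\; & F(x, y) \\
    \text{s.t.}\;\;        & \nabla_{y} g(x, y) + \nabla_{y} h(x, y)^{\top} \mu = 0, \\
                           & 0 \le \mu \;\perp\; -h(x, y) \ge 0,
    \end{aligned}
    % where \perp denotes componentwise complementarity; this equivalence
    % requires lower-level convexity and a constraint qualification.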
Penalty Method for Inversion-Free Deep Bilevel Optimization
2019
Bilevel optimization problems are at the center of several important machine learning problems such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning. Unlike in simultaneous or multi-objective optimization, the steepest descent direction for minimizing the upper-level cost requires the inverse of the Hessian of the lower-level cost. In this paper, we propose a new method for solving bilevel optimization problems using the classical penalty function approach, which avoids computing this inverse and can also handle additional constraints easily. We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically. Our method's simplicity and small space and time complexities enable us to effectively solve large-scale bilevel problems involving deep neural networks. We present results on data denoising, few-shot learning, and training-data poisoning problems at large scale.
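For reference, the Hessian inverse this abstract avoids appears in the standard hypergradient identity (illustrative notation, with upper-level cost F, lower-level cost g, and lower-level solution y*(x)):

    % Total derivative of F(x, y^*(x)) via the implicit function theorem:
    \frac{d}{dx} F\bigl(x, y^{*}(x)\bigr)
      = \nabla_{x} F
        - \nabla_{xy}^{2} g \,\bigl(\nabla_{yy}^{2} g\bigr)^{-1} \nabla_{y} F .

The penalty approach sidesteps the (\nabla_{yy}^{2} g)^{-1} factor by folding the lower-level problem into the objective as a penalty term.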
A Novel Hessian-Free Bilevel Optimizer via Evolution Strategies
2021
Bilevel optimization has arisen as a powerful tool for solving many modern machine learning problems. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations, which can be very costly in practice. In this work, we propose a novel Hessian-free bilevel algorithm, which adopts the Evolution Strategies (ES) method to approximate the response Jacobian matrix in the hypergradient of the bilevel problem, and hence fully eliminates all second-order computations. We call our algorithm ESJ (for the ES-based Jacobian method) and further extend it to the stochastic setting as ESJ-S. Theoretically, we show that both ESJ and ESJ-S are guaranteed to converge. Experimentally, we demonstrate that the proposed algorithms outperform baseline bilevel optimizers on various bilevel problems. Particularly, in our experiment on few-shot meta-learning of ResNet-12 networks...
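A minimal sketch of the ES idea (our illustration, not the authors' code): estimate the response-Jacobian-vector product with zeroth-order perturbations of the inner solver, so no Hessian- or Jacobian-vector products are ever formed. Here inner_solve is a hypothetical stand-in for a few gradient steps on the lower-level objective.

    import numpy as np

    def es_jvp(inner_solve, x, v, sigma=0.01, n_samples=32):
        """Estimate (dy*/dx)^T v at x via antithetic Gaussian smoothing."""
        est = np.zeros_like(x)
        for _ in range(n_samples):
            u = np.random.randn(*x.shape)        # random direction in x-space
            y_plus = inner_solve(x + sigma * u)  # approximate y*(x + sigma*u)
            y_minus = inner_solve(x - sigma * u)
            # Finite difference of <y*(x), v> along u, redistributed to u:
            est += u * ((y_plus - y_minus) @ v) / (2.0 * sigma)
        return est / n_samples

Combined with the partial gradients of the upper-level loss, this estimate stands in for the second-order term of the hypergradient.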
On the solution of convex bilevel optimization problems
Computational Optimization and Applications, 2015
An algorithm is presented for solving bilevel optimization problems with fully convex lower level problems. Convergence to a local optimal solution is shown under certain weak assumptions. This algorithm uses the optimal value transformation of the problem. Transformation of the bilevel optimization problem using the Fritz-John necessary optimality conditions applied to the lower level problem is shown to exhibit almost the same difficulties for solving the problem as the use of the Karush-Kuhn-Tucker conditions.
Proximal Gradient Method for Solving Bilevel Optimization Problems
Mathematical and Computational Applications, 2020
In this paper, we consider a bilevel optimization problem as the task of finding the optimum of the upper-level problem over the solution set of a split feasibility problem involving fixed-point and optimization problems. Based on proximal and gradient methods, we propose a strongly convergent iterative algorithm with an inertial effect for solving the bilevel optimization problem under consideration. Furthermore, we present a numerical example to illustrate the applicability of our algorithm.
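A minimal sketch (our assumptions, not the paper's exact scheme) of one inertial proximal-gradient iteration for a composite problem min_x f(x) + g(x), with f smooth and the proximal map of g available in closed form:

    import numpy as np

    def inertial_prox_grad_step(x, x_prev, grad_f, prox_g, step, beta):
        # Inertial (momentum) extrapolation, then a proximal-gradient step:
        # x_{k+1} = prox_{step*g}(z - step*grad_f(z)), z = x + beta*(x - x_prev)
        z = x + beta * (x - x_prev)
        return prox_g(z - step * grad_f(z), step)

    # Example prox: soft-thresholding for g(x) = lam * ||x||_1.
    def prox_l1(u, step, lam=0.1):
        return np.sign(u) * np.maximum(np.abs(u) - step * lam, 0.0)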
An inexact-restoration method for nonlinear bilevel programming problems
Computational Optimization and Applications, 2007
We present a new algorithm for solving bilevel programming problems without reformulating them as single-level nonlinear programming problems. This strategy allows one to take advantage of the structure of the lower-level optimization problems without resorting to non-differentiable methods. The algorithm is based on the inexact-restoration technique. Under some assumptions on the problem, we prove global convergence to feasible points that satisfy the approximate gradient projection (AGP) optimality condition. Computational experiments are presented that encourage the use of this method for general bilevel problems.
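A schematic illustration (ours, not the paper's algorithm) of the two phases of an inexact-restoration iteration, on a toy bilevel problem with upper level F(x, y) = (x - 1)^2 + y^2 and lower level g(x, y) = (y - x)^2, whose exact response is y*(x) = x:

    def restoration_phase(x, y, inner_lr=0.5, inner_steps=3):
        # Improve lower-level feasibility: a few descent steps on g(x, .).
        for _ in range(inner_steps):
            y = y - inner_lr * 2.0 * (y - x)
        return y

    def minimization_phase(x, y, outer_lr=0.1):
        # Decrease F along the tangent direction (1, 1) of the feasible
        # set y = x: the directional derivative there is 2*(x - 1) + 2*y.
        return x - outer_lr * (2.0 * (x - 1.0) + 2.0 * y)

    x, y = 3.0, 0.0
    for _ in range(50):
        y = restoration_phase(x, y)
        x = minimization_phase(x, y)
    print(x, y)  # approaches the bilevel solution x = y = 0.5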
ES-Based Jacobian Enables Faster Bilevel Optimization
ArXiv, 2021
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems. However, due to the nested structure of BO, existing gradient-based methods require second-order derivative approximations via Jacobian- and/or Hessian-vector computations, which can be very costly in practice, especially with large neural network models. In this work, we propose a novel BO algorithm, which adopts an Evolution Strategies (ES) based method to approximate the response Jacobian matrix in the hypergradient of BO, and hence fully eliminates all second-order computations. We call our algorithm ESJ (for the ES-based Jacobian method) and further extend it to the stochastic setting as ESJ-S. Theoretically, we characterize the convergence guarantee and computational complexity for our algorithms. Experimentally, we demonstrate the superiority of our proposed algorithms compared to the state-of-the-art methods on various bilevel problems. Particularly, in our experiments...
A Global Optimization Method for Solving Convex Quadratic Bilevel Programming Problems
Journal of Global Optimization, 2003
We use the merit function technique to formulate a linearly constrained bilevel convex quadratic problem as a convex program with an additional convex-d.c. constraint. To solve the latter problem we approximate it by convex programs with an additional convex-concave constraint using an adaptive simplicial subdivision. This approximation leads to a branch-and-bound algorithm for finding a global optimal solution to the problem.
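The shape of the reformulation (illustrative notation, with p and q convex, so p - q is a d.c. function):

    \begin{aligned}
    \min_{x \in C}\; & \varphi(x) \\
    \text{s.t.}\;\;  & p(x) - q(x) \le 0 ,
    \end{aligned}
    % On each simplex of the subdivision, the concave part -q can be
    % relaxed, e.g. by its affine interpolant at the vertices, giving the
    % convex subproblems that drive the branch-and-bound search.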