First-Order Optimality Measure - MATLAB & Simulink
What Is First-Order Optimality Measure?
First-order optimality is a measure of how close a point x is to optimal. Most Optimization Toolbox™ solvers use this measure, though it has different definitions for different algorithms. First-order optimality is a necessary condition, but it is not a sufficient condition. In other words:
- The first-order optimality measure must be zero at a minimum.
- A point with first-order optimality equal to zero is not necessarily a minimum.
For general information about first-order optimality, see Nocedal and Wright [31]. For specifics about the first-order optimality measures for Optimization Toolbox solvers, see Unconstrained Optimality, Constrained Optimality Theory, and Constrained Optimality in Solver Form.
Stopping Rules Related to First-Order Optimality
The OptimalityTolerance tolerance relates to the first-order optimality measure. Typically, if the first-order optimality measure is less than OptimalityTolerance, solver iterations end.
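For example, here is a minimal sketch of tightening this tolerance through optimoptions; the quadratic objective and start point are made-up example data:

```matlab
% Minimal sketch: tighten OptimalityTolerance for fminunc.
% The objective and start point are made-up example data.
fun = @(x) x(1)^2 + 3*x(2)^2;                 % simple smooth objective
options = optimoptions('fminunc', ...
    'OptimalityTolerance', 1e-10);            % stop when the measure drops below 1e-10
[x,fval,exitflag,output] = fminunc(fun, [1; 2], options);
disp(output.firstorderopt)                    % first-order optimality at the solution
```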
Some solvers or algorithms use relative first-order optimality as a stopping criterion. Solver iterations end if the first-order optimality measure is less than μ times OptimalityTolerance, where μ is either:
- The infinity norm (maximum) of the gradient of the objective function at x0
- The infinity norm (maximum) of inputs to the solver, such as f or b in linprog or H in quadprog
A relative measure attempts to account for the scale of a problem. Multiplying an objective function by a very large or small number does not change the stopping condition for a relative stopping criterion, but does change it for an unscaled one.
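As a rough numeric sketch of this point (all gradient values here are made up), scaling an objective by a factor c scales its gradient by c, so an absolute test reacts to c while a relative test does not:

```matlab
% Sketch: why a relative criterion is scale-invariant.
% Scaling the objective by c scales its gradient by c, so the absolute test
% max(abs(grad)) < tol changes behavior, while the relative test
% max(abs(grad)) < mu*tol with mu = max(abs(grad0)) does not.
grad0 = [4; -2];          % gradient at the start point x0 (made-up values)
grad  = [2e-7; -5e-8];    % gradient at the current iterate (made-up values)
tol   = 1e-6;             % OptimalityTolerance

for c = [1 1e6]           % objective scale factors
    absoluteStop = max(abs(c*grad)) < tol;                    % unscaled test
    relativeStop = max(abs(c*grad)) < max(abs(c*grad0))*tol;  % relative test
    fprintf('c = %g: absolute stop %d, relative stop %d\n', ...
        c, absoluteStop, relativeStop);
end
```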
Solvers with enhanced exit messages state, in the stopping criteria details, when they use relative first-order optimality.
Unconstrained Optimality
For a smooth unconstrained problem, min_x f(x), the first-order optimality measure is the infinity norm (meaning maximum absolute value) of ∇f(x), which is:

First-order optimality measure = max_i |(∇f(x))_i| = ‖∇f(x)‖∞.
This measure of optimality is based on the familiar condition for a smooth function to achieve a minimum: its gradient must be zero. For unconstrained problems, when the first-order optimality measure is nearly zero, the objective function has gradient nearly zero, so the objective function could be near a minimum. If the first-order optimality measure is not small, the objective function is not minimal.
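To illustrate, the following sketch minimizes the Rosenbrock function with fminunc and compares the infinity norm of the analytic gradient at the returned point with the solver's reported measure (the start point and tolerance are example choices):

```matlab
% Sketch: compare max(abs(gradient)) with the solver's reported measure,
% using the Rosenbrock function as the objective.
fun = @(x) 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
options = optimoptions('fminunc','OptimalityTolerance',1e-8);
[x,fval,exitflag,output] = fminunc(fun, [-1.9; 2], options);

% Analytic Rosenbrock gradient, evaluated at the returned point.
grad = [-400*(x(2)-x(1)^2)*x(1) - 2*(1-x(1)); 200*(x(2)-x(1)^2)];
max(abs(grad))           % infinity norm of the gradient at x ...
output.firstorderopt     % ... close to the solver's reported value
                         % (not identical: the solver estimates the
                         % gradient by finite differences here)
```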
Constrained Optimality Theory
This section summarizes the theory behind the definition of first-order optimality measure for constrained problems. The definition as used in Optimization Toolbox functions is in Constrained Optimality in Solver Form.
For a smooth constrained problem, let g and h be vector functions representing all inequality and equality constraints respectively (meaning bound, linear, and nonlinear constraints):

min_x f(x)  subject to  g(x) ≤ 0, h(x) = 0.
The meaning of first-order optimality in this case is more complex than for unconstrained problems. The definition is based on the Karush-Kuhn-Tucker (KKT) conditions. The KKT conditions are analogous to the condition that the gradient must be zero at a minimum, modified to take constraints into account. The difference is that the KKT conditions hold for constrained problems.
The KKT conditions use the auxiliary Lagrangian function:
L(x,λ) = f(x) + ∑λg,i gi(x) + ∑λh,i hi(x).  (1)
The vector λ, which is the concatenation of λg and λh, is the Lagrange multiplier vector. Its length is the total number of constraints.
The KKT conditions are:
∇x L(x,λ) = 0,  (2)

λg,i gi(x) = 0 ∀i,  (3)

g(x) ≤ 0, h(x) = 0, λg,i ≥ 0.  (4)
Solvers do not use the three expressions in Equation 4 in the calculation of optimality measure.
The optimality measure associated with Equation 2 is
‖∇x L(x,λ)‖ = ‖∇f(x) + ∑λg,i ∇gi(x) + ∑λh,i ∇hi(x)‖.  (5)
The optimality measure associated with Equation 3 is

‖λg,i gi(x)→‖,  (6)

where the norm in Equation 6 means the infinity norm (maximum) of the vector λg,i gi(x)→, that is, the vector whose ith component is λg,i gi(x).
The combined optimality measure is the maximum of the values calculated in Equation 5 and Equation 6. Solvers that accept nonlinear constraint functions report constraint violations g(x) > 0 or |h(x)| > 0 as ConstraintTolerance violations. See Tolerances and Stopping Criteria.
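As a concrete sketch of these theory measures, consider the toy problem min x1 + x2 subject to x1² + x2² − 2 ≤ 0, whose KKT point is x = (−1,−1) with multiplier λg = 1/2 (a worked assumption, not from the original text):

```matlab
% Sketch: evaluate the two theory measures (Equations 5 and 6) by hand for
% min x1 + x2  subject to  g(x) = x1^2 + x2^2 - 2 <= 0.
x      = [-1; -1];              % KKT point
lambda = 0.5;                   % multiplier for the single inequality
gradf  = [1; 1];                % gradient of the objective
gradg  = 2*x;                   % gradient of the constraint
g      = x(1)^2 + x(2)^2 - 2;   % constraint value (active: g = 0)

measure5 = max(abs(gradf + lambda*gradg));  % norm of grad_x L, Equation 5
measure6 = max(abs(lambda*g));              % complementarity, Equation 6
combined = max(measure5, measure6)          % combined measure: 0 at this point
```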
Constrained Optimality in Solver Form
Most constrained toolbox solvers separate their calculation of first-order optimality measure into bounds, linear functions, and nonlinear functions. The measure is the maximum of the following two norms, which correspond to Equation 5 and Equation 6:
‖∇x L(x,λ)‖ = ‖∇f(x) + AT λineqlin + AeqT λeqlin + ∑λineqnonlin,i ∇ci(x) + ∑λeqnonlin,i ∇ceqi(x)‖,  (7)
‖ |li − xi| λlower,i→, |xi − ui| λupper,i→, |(Ax − b)i| λineqlin,i→, |ci(x)| λineqnonlin,i→ ‖,  (8)
where the norm of the vectors in Equation 7 and Equation 8 is the infinity norm (maximum). The subscripts on the Lagrange multipliers correspond to solver Lagrange multiplier structures. See Lagrange Multiplier Structures. The summations in Equation 7 range over all constraints. If a bound is ±Inf, that term is not constrained, so it is not part of the summation.
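Continuing the toy problem above, the following sketch solves it with fmincon; the solver returns the multipliers in the documented lambda structure and the Equation 7/8 measure in output.firstorderopt:

```matlab
% Sketch: fmincon on min x1 + x2 subject to x1^2 + x2^2 - 2 <= 0.
fun     = @(x) x(1) + x(2);
nonlcon = @(x) deal(x(1)^2 + x(2)^2 - 2, []);   % c(x) <= 0, no equalities
[x,fval,exitflag,output,lambda] = ...
    fmincon(fun, [0;0], [],[],[],[],[],[], nonlcon);
lambda.ineqnonlin        % Lagrange multiplier, approximately 0.5
output.firstorderopt     % the reported first-order optimality measure
```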
Linear Equalities Only
For some large-scale problems with only linear equalities, the first-order optimality measure is the infinity norm of the projected gradient. In other words, the first-order optimality measure is the size of the gradient projected onto the null space of Aeq.
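A minimal sketch of this computation, with made-up data for Aeq and the gradient:

```matlab
% Sketch: projected-gradient measure for a problem with only linear
% equalities Aeq*x = beq. Z = null(Aeq) is an orthonormal basis for the
% null space of Aeq, so the projected gradient is Z*(Z'*grad).
Aeq  = [1 1 1];                 % one linear equality (made-up example data)
grad = [3; 1; -2];              % gradient of f at the current point
Z    = null(Aeq);               % orthonormal basis for the null space of Aeq
projGrad = Z*(Z'*grad);         % gradient projected onto the null space
measure  = norm(projGrad, Inf)  % first-order optimality measure
```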
Bounded Least-Squares and Trust-Region-Reflective Solvers
For least-squares solvers and trust-region-reflective algorithms, in problems with bounds alone, the first-order optimality measure is the maximum over i of |vi*gi|. Here gi is the ith component of the gradient, x is the current point, and

vi = |xi − bi| if the negative gradient points toward bound bi; vi = 1 otherwise.
If xi is at a bound, vi is zero. If xi is not at a bound, then at a minimizing point the gradient gi should be zero. Therefore the first-order optimality measure should be zero at a minimizing point.
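The following sketch computes this measure directly from the definition above; the bounds, point, and gradient values are made-up example data, and the interpretation of "points toward" as the sign test on gi is an assumption:

```matlab
% Sketch of the bound-constrained measure max_i |v_i * g_i|, with v_i as
% defined above. Bounds, point, and gradient are made-up example data.
l = [0; 0; -Inf];  u = [1; Inf; Inf];    % lower and upper bounds
x = [1; 0.5; 2];                          % current point (x(1) is at its bound)
g = [-3; 0.2; 1e-8];                      % gradient components

v = ones(size(x));                        % v_i = 1 when no finite bound is ahead
down = g >= 0 & isfinite(l);    % negative gradient points toward lower bound
up   = g <  0 & isfinite(u);    % negative gradient points toward upper bound
v(down) = abs(x(down) - l(down));
v(up)   = abs(x(up)   - u(up));

measure = max(abs(v.*g))        % zero only when every |v_i*g_i| vanishes
```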