Optimization Toolbox | ![]() ![]() |
Quadratic Minimization with a Dense but Structured Hessian
The quadprog
large-scale method can also solve large problems where the Hessian is dense but structured. For these problems, quadprog
does not compute H*Y with the Hessian H directly, as it does for medium-scale problems and for large-scale problems with sparse H, because forming H would be memory-intensive. Instead, you must provide quadprog
with a function that, given a matrix Y and information about H, computes W = H*Y.
In this example, the Hessian matrix H
has the structure H = B + A*A'
where B
is a sparse 512-by-512 symmetric matrix, and A
is a 512-by-10 sparse matrix composed of a number of dense columns. To avoid excessive memory usage that could happen by working with H
directly because H
is dense, the example provides a Hessian multiply function, qpbox4mult
. This function, when passed a matrix Y
, uses sparse matrices A
and B
to compute the Hessian matrix product W = H*Y = (B + A*A')*Y
.
In this example, the matrices A
and B
need to be passed to the Hessian multiply function qpbox4mult
from quadprog
. There are two ways to indicate this in the call to quadprog
. The first argument passed to quadprog
is passed to the Hessian multiply function. Also, parameters passed to quadprog
as additional parameters are passed to the Hessian multiply function.
Step 1: Decide what part of H to pass to quadprog as the first argument.
Either A
, or B
can be passed as the first argument to quadprog
. The example chooses to pass B
as the first argument because this results in a better preconditioner (see Preconditioning). A
is then passed as an additional parameter:
Step 2: Write a function to compute Hessian-matrix products for H.
Now, define a function qpbox4mult
that uses A
and B
to compute the Hessian matrix product W
where W = H*Y = (B + A*A')*Y
. This function must have the form
qpbox4mult
must accept the same first argument as passed to quadprog
, e.g., the example passes B
as the first argument to quadprog
, so qpbox4mult
must accept B
as the first argument.
The second argument to the Hessian multiply function is the matrix Y
(of W = H*Y
). Because quadprog
expects Y
to be used to form the Hessian matrix product, Y
is always a matrix with n
rows where n
is the number of dimensions in the problem. The number of columns in Y
can vary. Finally, any additional parameters passed to quadprog
are passed to the Hessian multiply function, so qpbox4mult
must accept the same additional parameters, e.g., the matrix A
:
function W = qpbox4mult(B,Y,A); %QPBOX4MULT Hessian matrix product with dense structured Hessian. % W = qpbox4mult(B,Y,A) computes W = (B + A*A')*Y where % INPUT: % B - sparse square matrix (512 by 512) % Y - vector (or matrix) to be multiplied by B + A'*A. % A - sparse matrix with 512 rows and 10 columns. % % OUTPUT: % W - The product (B + A*A')*Y. % Order multiplies to avoid forming A*A', % which is large and dense W = B*Y + A*(A'*Y);
Step 3: Call a quadratic minimization routine with a starting point.
Load the problem parameters from qpbox4.mat
. Use optimset
to set the HessMult
option to a function handle that points to qpbox4mult
. Call quadprog
with B
as the first argument and A
as an additional parameter:
load qpbox4 % Get xstart, u, l, B, A, f mtxmpy = @qpbox4mult; % Function handle to function qpbox4mult options = optimset('HessMult',mtxmpy); [x,fval,exitflag,output] = quadprog(B,f,[],[],[],[],l,u,... xstart,options,A); Optimization terminated successfully: Relative function value changing by less than sqrt(OPTIONS.TolFun), no negative curvature detected in Hessian this iteration, and the rate of progress (change in f(x)) is slow
After 18 iterations with a total of 30 PCG iterations, the function value is reduced to
and the first-order optimality is
Note
Type [fval,exitflag,output] = runqpbox4 to run the preceding code and display the values for fval , exitflag , and output .
|
Preconditioning
In this example, quadprog
cannot use H
to compute a preconditioner because H
only exists implicitly. Instead, quadprog
uses B
, the argument passed in instead of H
, to compute a preconditioner. B
is a good choice because it is the same size as H
and approximates H
to some degree. If B
were not the same size as H
, quadprog
would compute a preconditioner based on some diagonal scaling matrices determined from the algorithm. Typically, this would not perform as well.
Because the preconditioner is more approximate than when H
is available explicitly, adjusting the TolPcg
parameter to a somewhat smaller value might be required. This example is the same as the previous one, but reduces TolPcg
from the default 0.1 to 0.01.
options = optimset('HessMult',mtxmpy,'TolPcg',0.01); [x,fval,exitflag,output]=quadprog(B,f,[],[],[],[],l,u,xstart,... options,A); Optimization terminated successfully: Relative function value changing by less than sqrt(OPTIONS.TolFun), no negative curvature detected in Hessian this iteration, and the rate of progress (change in f(x)) is slow
After 18 iterations and 50 PCG iterations, the function value has the same value to five significant digits
but the first-order optimality is further reduced.
![]() | Quadratic Minimization with Bound Constraints | Linear Least-Squares with Bound Constraints | ![]() |