Tutorial (Optimization Toolbox)

Optimization Toolbox

Quadratic Minimization with a Dense but Structured Hessian

The quadprog large-scale method can also solve large problems where the Hessian is dense but structured. For these problems, quadprog does not compute H*Y with the Hessian H directly, as it does for medium-scale problems and for large-scale problems with sparse H, because forming H would be memory-intensive. Instead, you must provide quadprog with a function that, given a matrix Y and information about H, computes W = H*Y.

In this example, the Hessian matrix H has the structure H = B + A*A' where B is a sparse 512-by-512 symmetric matrix, and A is a 512-by-10 sparse matrix composed of a number of dense columns. To avoid excessive memory usage that could happen by working with H directly because H is dense, the example provides a Hessian multiply function, qpbox4mult. This function, when passed a matrix Y, uses sparse matrices A and B to compute the Hessian matrix product W = H*Y = (B + A*A')*Y.

In this example, the matrices A and B need to be passed to the Hessian multiply function qpbox4mult from quadprog. There are two ways to indicate this in the call to quadprog. The first argument passed to quadprog is passed to the Hessian multiply function. Also, parameters passed to quadprog as additional parameters are passed to the Hessian multiply function.

Step 1: Decide what part of H to pass to quadprog as the first argument.

Either A, or B can be passed as the first argument to quadprog. The example chooses to pass B as the first argument because this results in a better preconditioner (see Preconditioning). A is then passed as an additional parameter:

quadprog(B,f,[],[],[],[],l,u,xstart,options,A)

Step 2: Write a function to compute Hessian-matrix products for H.

Now, define a function qpbox4mult that uses A and B to compute the Hessian matrix product W where W = H*Y = (B + A*A')*Y. This function must have the form

```
W = qpbox4mult(Hinfo,Y,p1,p2...)
```

qpbox4mult must accept the same first argument as passed to quadprog, e.g., the example passes B as the first argument to quadprog, so qpbox4mult must accept B as the first argument.

The second argument to the Hessian multiply function is the matrix Y (of W = H*Y). Because quadprog expects Y to be used to form the Hessian matrix product, Y is always a matrix with n rows where n is the number of dimensions in the problem. The number of columns in Y can vary. Finally, any additional parameters passed to quadprog are passed to the Hessian multiply function, so qpbox4mult must accept the same additional parameters, e.g., the matrix A:

function W = qpbox4mult(B,Y,A); 
%QPBOX4MULT Hessian matrix product with dense structured Hessian. 
% W = qpbox4mult(B,Y,A) computes W = (B + A*A')*Y where 
% INPUT: 
% B - sparse square matrix (512 by 512) 
% Y - vector (or matrix) to be multiplied by B + A'*A. 
% A - sparse matrix with 512 rows and 10 columns. 
% 
% OUTPUT: 
% W - The product (B + A*A')*Y. 

% Order multiplies to avoid forming A*A', 
% which is large and dense 
W = B*Y + A*(A'*Y);

Note qpbox4mult is a subfunction of runqpbox4.m in the Optimization Toolbox.

Step 3: Call a quadratic minimization routine with a starting point.

Load the problem parameters from qpbox4.mat. Use optimset to set the HessMult option to a function handle that points to qpbox4mult. Call quadprog with B as the first argument and A as an additional parameter:

load qpbox4             % Get xstart, u, l, B, A, f
mtxmpy = @qpbox4mult;   % Function handle to function qpbox4mult
options = optimset('HessMult',mtxmpy); 
[x,fval,exitflag,output] = quadprog(B,f,[],[],[],[],l,u,...
                                    xstart,options,A);
Optimization terminated successfully: 
Relative function value changing by less than
sqrt(OPTIONS.TolFun), no negative curvature detected in Hessian
this iteration, and the rate of progress (change in f(x)) is slow

After 18 iterations with a total of 30 PCG iterations, the function value is reduced to

```
fval = 
          -1.0538e+003
```

and the first-order optimality is

output.firstorderopt = 
           0.0043

Note Type [fval,exitflag,output] = runqpbox4 to run the preceding code and display the values for fval, exitflag, and output.

Preconditioning

In this example, quadprog cannot use H to compute a preconditioner because H only exists implicitly. Instead, quadprog uses B, the argument passed in instead of H, to compute a preconditioner. B is a good choice because it is the same size as H and approximates H to some degree. If B were not the same size as H, quadprog would compute a preconditioner based on some diagonal scaling matrices determined from the algorithm. Typically, this would not perform as well.

Because the preconditioner is more approximate than when H is available explicitly, adjusting the TolPcg parameter to a somewhat smaller value might be required. This example is the same as the previous one, but reduces TolPcg from the default 0.1 to 0.01.

options = optimset('HessMult',mtxmpy,'TolPcg',0.01); 
[x,fval,exitflag,output]=quadprog(B,f,[],[],[],[],l,u,xstart,...
                                  options,A); 
Optimization terminated successfully: 
Relative function value changing by less than
sqrt(OPTIONS.TolFun), no negative curvature detected in Hessian
this iteration, and the rate of progress (change in f(x)) is slow

After 18 iterations and 50 PCG iterations, the function value has the same value to five significant digits

```
fval = 
-1.0538e+003
```

but the first-order optimality is further reduced.

```
output.firstorderopt = 
0.0028
```

Note Decreasing TolPcg too much can substantially increase the number of PCG iterations.

Quadratic Minimization with Bound Constraints Linear Least-Squares with Bound Constraints