I.3.6.3: Block Optimization Methods

The results in the previous two sections were for general block modification methods. We now specialize to block relaxation methods for unconstrained differentiable optimization problems. The rate of convergence of block relaxation algorithms depends on the block structure, and on the matrix of second derivatives of the function we are minimizing.

The functions $\phi_s$ that update block $s$ are defined implicitly by
$$
\mathcal{D}_sf(x_1,\dots,x_{s-1},\phi_s(x),x_{s+1},\dots,x_p)=0.
$$
From the implicit function theorem
$$
\mathcal{D}_t\phi_s(x)=-\left[\mathcal{D}_{ss}f\right]^{-1}\mathcal{D}_{st}f,\qquad t\neq s,
$$
with the second derivatives evaluated at the same point. If we use this in the LU-form we find, writing $H=\mathcal{D}^2f(x_\star)$ for the Hessian at the fixed point $x_\star$ and splitting it as $H=H_L+H_D+H_U$ into its strictly lower block triangular, block diagonal, and strictly upper block triangular parts,
$$
\mathcal{M}=-(H_D+H_L)^{-1}H_U.
$$
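
As an illustration, here is a minimal R sketch of the LU-form computation. It is not the blockRate.R linked below; the name `luRate` and its arguments `h` (the Hessian) and `blocks` (a list of index vectors, one per block) are my own.

```r
## Sketch: LU-form rate matrix M = -(H_D + H_L)^{-1} H_U for a Hessian h
## partitioned according to blocks, a list of index vectors.
luRate <- function(h, blocks) {
  n <- nrow(h)
  lower <- matrix(0, n, n)  # H_D + H_L: diagonal and sub-diagonal blocks
  upper <- matrix(0, n, n)  # H_U: super-diagonal blocks
  for (s in seq_along(blocks)) {
    for (t in seq_along(blocks)) {
      part <- h[blocks[[s]], blocks[[t]], drop = FALSE]
      if (s >= t) {
        lower[blocks[[s]], blocks[[t]]] <- part
      } else {
        upper[blocks[[s]], blocks[[t]]] <- part
      }
    }
  }
  -solve(lower, upper)
}
```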

If there are only two blocks the result simplifies to
$$
\mathcal{M}=\begin{bmatrix}0&-H_{11}^{-1}H_{12}\\0&H_{22}^{-1}H_{21}H_{11}^{-1}H_{12}\end{bmatrix},
$$
whose nonzero eigenvalues are those of $H_{22}^{-1}H_{21}H_{11}^{-1}H_{12}$.

Thus, in a local minimum, where the matrix of second derivatives is non-negative definite, we find that the largest eigenvalue of $\mathcal{M}$ is like the largest squared canonical correlation of two sets of variables, and is consequently less than or equal to one. Indeed, if $H$ were the covariance matrix of two sets of random variables, the eigenvalues of $H_{22}^{-1}H_{21}H_{11}^{-1}H_{12}$ would be exactly the squared canonical correlations between the two sets.
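
This interpretation can be checked numerically. In the sketch below, which is my own illustration and not part of the book's code, the covariance matrix of two sets of random variables plays the role of the Hessian, and the eigenvalues above reproduce the squared canonical correlations returned by `cancor()`.

```r
set.seed(13)
x <- matrix(rnorm(100 * 3), 100, 3)  # first set of variables
y <- matrix(rnorm(100 * 2), 100, 2)  # second set of variables
h <- cov(cbind(x, y))                # plays the role of the Hessian
h11 <- h[1:3, 1:3]; h12 <- h[1:3, 4:5]
h21 <- h[4:5, 1:3]; h22 <- h[4:5, 4:5]
eigen(solve(h22, h21 %*% solve(h11, h12)))$values
cancor(x, y)$cor ^ 2                 # same values, all at most one
```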

We also see that a sufficient condition for local convergence to a stationary point of the algorithm is that the spectral radius $\rho(\mathcal{M})$ is strictly less than one. This is always true for an isolated local minimum, because there the matrix of second derivatives is positive definite. If $\mathcal{D}^2f$ is singular at the solution $x_\star$, we find a canonical correlation equal to one, and we do not have guaranteed linear convergence.

For the product form we find that the derivative of the map updating block $s$ is $I-E_sH_D^{-1}H$, where $E_s$ has blocks of zeroes, except for diagonal block $s$, which is the identity. Thus
$$
\mathcal{M}=\prod_{s=p}^{1}\left(I-E_sH_D^{-1}H\right).
$$
It follows that $\mathcal{M}z=z$ for all $z$ such that $Hz=0$. Thus $1$ is an eigenvalue of $\mathcal{M}$ whenever $H$ is singular.
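
A matching R sketch for the product form, again with illustrative names of my own, builds each factor $I-E_sH_D^{-1}H$ by modifying block row $s$ of the identity, and multiplies later factors on the left because block $1$ is updated first.

```r
## Sketch: product-form rate matrix M = (I - E_p D^{-1} H) ... (I - E_1 D^{-1} H).
productRate <- function(h, blocks) {
  n <- nrow(h)
  m <- diag(n)
  for (s in seq_along(blocks)) {
    rows <- blocks[[s]]
    ms <- diag(n)  # start from the identity
    ## block row s becomes -H_ss^{-1} H_st for t != s, and zero in block s
    ms[rows, ] <- ms[rows, ] -
      solve(h[rows, rows, drop = FALSE], h[rows, , drop = FALSE])
    m <- ms %*% m  # later updates act on the left
  }
  m
}
```

Each factor fixes any $z$ with $Hz=0$, so `productRate(h, blocks) %*% z` returns `z` for a null vector `z` of `h`, matching the singularity argument above.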

For two blocks
$$
\mathcal{M}=\left(I-E_2H_D^{-1}H\right)\left(I-E_1H_D^{-1}H\right)=\begin{bmatrix}0&-H_{11}^{-1}H_{12}\\0&H_{22}^{-1}H_{21}H_{11}^{-1}H_{12}\end{bmatrix},
$$
which gives the same result as obtained from the LU-form.
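
The equivalence is easy to check numerically with the two sketches above, using a random positive definite matrix as a stand-in for the Hessian.

```r
set.seed(7)
h <- crossprod(matrix(rnorm(25), 5, 5))  # random positive definite stand-in
blocks <- list(1:2, 3:5)
all.equal(luRate(h, blocks), productRate(h, blocks))  # TRUE
max(Mod(eigen(luRate(h, blocks))$values))             # rate of convergence
```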

The code in blockRate.R computes $\mathcal{M}$ in one of four ways. We can use the analytical form of the Hessian or compute it numerically, and we can use the LU-form or the product form. If we compute the Hessian numerically, we give the function we are minimizing as an argument; if we use the Hessian analytically, we pass a function for evaluating it at $x$.
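
Without reproducing the linked file, a sketch of that four-way choice might look as follows. The name `blockRateSketch` and its interface are hypothetical, not the actual code in blockRate.R; the numerical Hessian is assumed to come from the numDeriv package, and `luRate` and `productRate` are the sketches above.

```r
blockRateSketch <- function(x, blocks, f = NULL, hessianFn = NULL,
                            form = c("lu", "product")) {
  form <- match.arg(form)
  h <- if (is.null(hessianFn)) {
    numDeriv::hessian(f, x)  # numerical Hessian of the objective f at x
  } else {
    hessianFn(x)             # analytical Hessian evaluated at x
  }
  if (form == "lu") luRate(h, blocks) else productRate(h, blocks)
}
```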


[Insert blockRate.R Here](../code/blockRate.R)