I.5.3: Convergence Rate of ALS
The least squares loss function, in the most general form we consider here, is a quadratic form in the residuals, defined by a fixed symmetric positive semi-definite matrix of weights.
Thus and
Note: Older piece follows (fix) 03/12/15
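It is easy to apply the general results from the previous sections to ALS. The results show that it is important that the solutions to the subproblems are unique. The least squares loss function has some special structure in its second derivatives, which we can often exploit in a detailed analysis. If the loss is a sum of squared differences between the components of two vector-valued functions, one for each block of variables, then each diagonal block of the matrix of second derivatives is the cross product of the Jacobian of the corresponding function with itself plus a correction term, and the off-diagonal block is, up to sign, the cross product of the two Jacobians. The correction terms are weighted sums of the Hessians of the component functions, with weights equal to the least squares residuals at the solution. If the correction terms are small, because the residuals are small or because the component functions are linear or almost linear, the rate of ALS is determined by the largest canonical correlation between the two Jacobians.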
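The linear case, where the correction terms vanish exactly, can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: the matrices `G` and `H`, the offset `c`, and the helper names `canonical_correlations` and `als_errors` are all hypothetical, the two blocks are taken to be the linear functions G x and H y + c, and the canonical correlations are computed without centering (as cosines of the principal angles between the column spaces). For a full cycle over both blocks, the observed linear rate should then match the square of the largest canonical correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical linear blocks: g(x) = G x and h(y) = H y + c.
G = rng.standard_normal((n, 3))
H = np.column_stack([
    G[:, 0] + 0.5 * rng.standard_normal(n),  # strongly related to G
    rng.standard_normal(n),                  # unrelated to G
])
c = rng.standard_normal(n)


def canonical_correlations(A, B):
    """Uncentered canonical correlations between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for the columns of A
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for the columns of B
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)  # sorted, largest first


def als_errors(G, H, c, n_iter=30):
    """Run ALS on ||G x - H y - c||^2 and return ||y_k - y*|| after each cycle."""
    # Joint minimizer over (x, y), used only to measure the error.
    z, *_ = np.linalg.lstsq(np.hstack([G, -H]), c, rcond=None)
    y_star = z[G.shape[1]:]
    y = np.zeros(H.shape[1])
    errors = []
    for _ in range(n_iter):
        x, *_ = np.linalg.lstsq(G, H @ y + c, rcond=None)  # update x with y fixed
        y, *_ = np.linalg.lstsq(H, G @ x - c, rcond=None)  # update y with x fixed
        errors.append(np.linalg.norm(y - y_star))
    return np.array(errors)


rho = canonical_correlations(G, H)
errors = als_errors(G, H, c)

observed_rate = errors[-1] / errors[-2]  # error ratio between the last two cycles
predicted_rate = rho[0] ** 2             # squared largest canonical correlation
print(observed_rate, predicted_rate)
```

The second column of `H` is deliberately unrelated to `G`, so that one canonical correlation dominates and the error ratio settles quickly. When the two largest canonical correlations are close, the ratio can take many more cycles to stabilize.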