I.6.6.2: Multidimensional Unfolding
Now a data analysis example. In least squares squared metric unfolding (LSSMU) we must minimize
\begin{equation}\label{E:sstress}
\sigma(X,Y):=\sum_{i=1}^n\sum_{j=1}^m w_{ij}\left(\delta_{ij}^2-d_{ij}^2(X,Y)\right)^2
\end{equation}
over the $n\times p$ and $m\times p$ configuration matrices $X$ and $Y$. Here the $w_{ij}$ are known non-negative weights, the $\delta_{ij}$ are known dissimilarities between $n$ row objects and $m$ column objects, and $d_{ij}^2(X,Y):=(x_i-y_j)'(x_i-y_j)$ is the squared Euclidean distance between row point $x_i$ and column point $y_j$. This problem has typically been handled by block decomposition. The unknowns are partitioned into a number of subsets. Block relaxation algorithms then cycle through the subsets, minimizing over the parameters in one subset while keeping all other parameters fixed at their current values. One cycle through the subsets is one iteration of the algorithm.
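For concreteness, the loss can be evaluated directly. A minimal sketch in Python/NumPy; the function name `sstress` and the argument order are our own choices, not part of any existing package:

```python
import numpy as np

def sstress(X, Y, W, Delta2):
    """LSSMU loss: sum of W * (Delta2 - squared distances)^2.

    X: (n, p) row configuration, Y: (m, p) column configuration,
    W: (n, m) non-negative weights, Delta2: (n, m) squared dissimilarities.
    """
    # Squared Euclidean distances d_ij^2 = ||x_i - y_j||^2 via broadcasting.
    D2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=2)
    return np.sum(W * (Delta2 - D2) ** 2)
```

When `Delta2` equals the squared distances of the configuration, the loss is exactly zero.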
In ALSCAL (Takane et al. [1977]) coordinate descent is used, which means that the blocks each consist of a single coordinate. There are $(n+m)p$ blocks. Solving for the optimal coordinate, with all others fixed, means minimizing a quartic, which in turn means finding the roots of a cubic. The algorithm converges to a stationary point which is a global minimum with respect to each coordinate separately. An alternative algorithm, proposed by Browne [1987], uses the $n+m$ points as blocks. Each substep is again an easy unidimensional minimization. This algorithm converges to a stationary point which is a global minimum with respect to each point. Generally it is considered desirable to have fewer blocks, both to increase the speed of convergence and to restrict the class of local minima we can converge to.
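The single-coordinate substep is exactly solvable because the loss, as a function of one coordinate with all others fixed, is a quartic polynomial; its stationary points are the roots of the derivative cubic. A hedged sketch of this trick (the function name and the interpolation-based coefficient extraction are our own devices, not ALSCAL's actual implementation):

```python
import numpy as np

def update_coordinate(X, Y, W, Delta2, i, s):
    """Globally minimize the LSSMU loss over the single coordinate X[i, s]."""
    def loss_at(t):
        Xt = X.copy()
        Xt[i, s] = t
        D2 = np.sum((Xt[:, None, :] - Y[None, :, :]) ** 2, axis=2)
        return np.sum(W * (Delta2 - D2) ** 2)
    # The loss is exactly a quartic in X[i, s]; recover its five
    # coefficients by interpolating at five points.
    ts = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
    coefs = np.polyfit(ts, [loss_at(t) for t in ts], 4)
    # Stationary points: real roots of the derivative, which is a cubic.
    crit = np.roots(np.polyder(coefs))
    crit = crit[np.abs(crit.imag) < 1e-10].real
    X = X.copy()
    X[i, s] = min(crit, key=loss_at)
    return X
```

Since the leading quartic coefficient is positive, the global minimum over the coordinate is attained at one of the real critical points.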
Let us use our basic theorem to construct a four-block algorithm for LSSMU. Write each row point as $x_i=\lambda_iu_i$ and each column point as $y_j=\mu_jv_j$, with $u_i'u_i=1$ and $v_j'v_j=1$. Minimizing~\eqref{E:sstress} is the same as minimizing
\begin{equation}\label{E:expand}
\sigma(\lambda,\mu,U,V):=\sum_{i=1}^n\sum_{j=1}^m w_{ij}\left(\delta_{ij}^2-\lambda_i^2-\mu_j^2+2\lambda_i\mu_ju_i'v_j\right)^2
\end{equation}
over $\lambda$ and $\mu$ and over $U$ and $V$, where the configuration matrices $X$ and $Y$ are constrained by $X=\Lambda U$ and $Y=MV$, with $\Lambda:=\mathop{\mathrm{diag}}(\lambda)$ and $M:=\mathop{\mathrm{diag}}(\mu)$.
The algorithm starts with values $(\lambda^{(0)},\mu^{(0)},U^{(0)},V^{(0)})$ satisfying the constraints. Suppose we have arrived at $(\lambda^{(k)},\mu^{(k)},U^{(k)},V^{(k)})$. We then update
\begin{equation}\label{E:alg}
\begin{split}
\lambda^{(k+1)}&\in\mathop{\mathrm{argmin}}_{\lambda}\ \sigma(\lambda,\mu^{(k)},U^{(k)},V^{(k)}),\\
\mu^{(k+1)}&\in\mathop{\mathrm{argmin}}_{\mu}\ \sigma(\lambda^{(k+1)},\mu,U^{(k)},V^{(k)}),\\
U^{(k+1)}&\in\mathop{\mathrm{argmin}}_{U}\ \sigma(\lambda^{(k+1)},\mu^{(k+1)},U,V^{(k)}),\\
V^{(k+1)}&\in\mathop{\mathrm{argmin}}_{V}\ \sigma(\lambda^{(k+1)},\mu^{(k+1)},U^{(k+1)},V),
\end{split}
\end{equation}
where the minima over $U$ and $V$ are taken over matrices with unit-length rows. This gives $(\lambda^{(k+1)},\mu^{(k+1)},U^{(k+1)},V^{(k+1)})$. It is understood that in each of the four substeps of~\eqref{E:alg} we compute the global minimum, and if the global minimum happens to be nonunique we select any one of its elements. We also remark that, as with any block relaxation method having more than two blocks, there are many variations on this basic scheme. We can travel through the substeps in a different order, we can change the order in each cycle, we can pass through the substeps in random order, we can cycle through the first two substeps a number of times before going on to the third and fourth, and so on. Each of these strategies has its own overall convergence rate, and further research would be needed to determine which is best.
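The cycling scheme itself is independent of what the blocks are. A minimal generic sketch (the driver, its names, and the toy two-block quadratic objective are all ours, purely to illustrate one-pass-per-iteration cycling and the monotone descent it produces, not the LSSMU substeps themselves):

```python
def block_relaxation(updates, state, n_iter=100):
    """Cycle through the block updates; one full pass is one iteration.

    updates: list of (block_name, update_fn) pairs, where update_fn maps
    the current state dict to the new value of that block, minimizing the
    objective over that block with all other blocks held fixed.
    """
    for _ in range(n_iter):
        for name, update in updates:
            state[name] = update(state)
    return state

# Toy objective f(x, y) = (x - 2y)^2 + (y - 1)^2 with closed-form
# block minimizers; its unique minimum is at x = 2, y = 1.
updates = [
    ("x", lambda s: 2.0 * s["y"]),                # argmin over x
    ("y", lambda s: (2.0 * s["x"] + 1.0) / 5.0),  # argmin over y
]
state = block_relaxation(updates, {"x": 0.0, "y": 0.0})
```

Reordering the `updates` list, or repeating some entries within a cycle, implements the scheme variations mentioned above.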
Let us look at the subproblems in a bit more detail to see how they can best be solved. Expanding~\eqref{E:expand} and organizing terms by powers of $\lambda_i$ gives
\begin{equation}
\sigma(\lambda,\mu,U,V)=\sum_{i=1}^n\sigma_i(\lambda_i),
\end{equation}
where
\begin{equation}
\sigma_i(\lambda_i):=\sum_{j=1}^m w_{ij}\left(\delta_{ij}^2-\mu_j^2-\lambda_i^2+2\lambda_i\mu_ju_i'v_j\right)^2.
\end{equation}
This is a sum of univariate quartic polynomials $\sigma_i$, which can be minimized separately to give the global minimum over $\lambda$. Obviously the same applies to minimization over $\mu$.
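This substep can be written out explicitly. A sketch for a single row $i$ (the function name and the abbreviations are ours): writing $c_j:=\delta_{ij}^2-\mu_j^2$ and $e_j:=2\mu_ju_i'v_j$, each term is $(c_j+e_j\lambda-\lambda^2)^2$, so the weighted sum is a quartic whose stationary points come from a cubic:

```python
import numpy as np

def update_lambda_i(w, c, e):
    """Globally minimize sigma_i(t) = sum_j w_j * (c_j + e_j*t - t^2)^2.

    w, c, e: length-m arrays for one row i (weights, c_j, e_j as above).
    """
    # Expand (c + e*t - t^2)^2 = t^4 - 2e t^3 + (e^2 - 2c) t^2 + 2ce t + c^2
    # and sum with weights to get the quartic coefficients (highest first).
    p = np.array([
        np.sum(w),
        -2.0 * np.sum(w * e),
        np.sum(w * (e ** 2 - 2.0 * c)),
        2.0 * np.sum(w * c * e),
        np.sum(w * c ** 2),
    ])
    # Stationary points: real roots of the derivative cubic.
    crit = np.roots(np.polyder(p))
    crit = crit[np.abs(crit.imag) < 1e-10].real
    vals = [np.polyval(p, t) for t in crit]
    return crit[int(np.argmin(vals))]
```

Since each $\sigma_i$ involves only $\lambda_i$, the full $\lambda$-update is just this routine applied row by row.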
For minimization over $U$ and $V$ we define
\begin{equation}
h_{ij}:=\delta_{ij}^2-\lambda_i^2-\mu_j^2.
\end{equation}
Then
\begin{equation}
\sigma(\lambda,\mu,U,V)=\sum_{i=1}^n\sum_{j=1}^m w_{ij}\left(h_{ij}+2\lambda_i\mu_ju_i'v_j\right)^2.
\end{equation}
Expanding and collecting terms gives
\begin{equation}
\sigma(\lambda,\mu,U,V)=\sum_{i=1}^n\left(u_i'A_iu_i-2u_i'b_i\right)+\sum_{i=1}^n\sum_{j=1}^m w_{ij}h_{ij}^2,
\end{equation}
with
\begin{equation}
A_i:=4\lambda_i^2\sum_{j=1}^m w_{ij}\mu_j^2v_jv_j'
\end{equation}
and
\begin{equation}
b_i:=-2\lambda_i\sum_{j=1}^m w_{ij}h_{ij}\mu_jv_j.
\end{equation}
Again this is a sum of separate functions of the $u_i$, quadratics in this case, which can be minimized separately for each $u_i$. By symmetry, we have the same strategy to minimize over $V$.
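The matrices $A_i$ and vectors $b_i$ are cheap to form, and the quadratic-form identity can be checked numerically. A sketch for one row $i$ (all names ours, random data purely for illustration):

```python
import numpy as np

rng = rng_check = np.random.default_rng(1)
m, p = 4, 2
w = rng.uniform(1.0, 2.0, size=m)        # weights w_ij for one fixed row i
mu = rng.uniform(0.5, 1.5, size=m)       # mu_j
V = rng.normal(size=(m, p))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # unit-length rows v_j
lam_i = 0.7
delta2 = rng.uniform(0.5, 2.0, size=m)
h = delta2 - lam_i ** 2 - mu ** 2        # h_ij

# Quadratic-form pieces for the u_i subproblem.
A_i = 4.0 * lam_i ** 2 * np.einsum('j,j,jk,jl->kl', w, mu ** 2, V, V)
b_i = -2.0 * lam_i * np.einsum('j,j,j,jk->k', w, h, mu, V)
const = np.sum(w * h ** 2)

# Check the identity against the original loss terms for a random unit u_i.
u = rng.normal(size=p)
u /= np.linalg.norm(u)
direct = np.sum(w * (h + 2.0 * lam_i * mu * (V @ u)) ** 2)
via_form = u @ A_i @ u - 2.0 * u @ b_i + const
```

The two evaluations agree to machine precision, confirming the expansion.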
Minimizing over $U$, under the constraint that each $u_i'u_i=1$, leads to the secular equation problem discussed in the Appendix. Since $p$ typically is two or at most three, the subproblems are very small indeed and can be solved efficiently.
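A sketch of this constrained substep, assuming the standard secular-equation approach to minimizing a quadratic over the unit sphere: stationarity gives $(A_i+\theta I)u_i=b_i$ with $u_i'u_i=1$, and the multiplier $\theta$ is found by root-finding in the eigenbasis of $A_i$. All names are ours, and the generic easy case ($b_i$ not orthogonal to the bottom eigenvector) is assumed:

```python
import numpy as np

def min_quadratic_on_sphere(A, b, tol=1e-13):
    """Minimize u'Au - 2 b'u subject to u'u = 1 (generic case).

    In the eigenbasis of A, n2(theta) = sum_k c_k^2 / (lam_k + theta)^2
    is strictly decreasing for theta > -lambda_min(A), so the secular
    equation n2(theta) = 1 has a unique root there, which yields the
    global minimizer. We locate it by bisection.
    """
    lam, Q = np.linalg.eigh(A)          # eigenvalues in ascending order
    c = Q.T @ b
    def n2(theta):
        return np.sum(c ** 2 / (lam + theta) ** 2)
    lo = -lam[0] + 1e-10                # n2 blows up near -lambda_min
    hi = -lam[0] + 1.0
    while n2(hi) > 1.0:                 # expand until the root is bracketed
        hi = lo + 2.0 * (hi - lo)
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if n2(mid) > 1.0 else (lo, mid)
    theta = 0.5 * (lo + hi)
    return Q @ (c / (lam + theta))
```

With $p=2$ or $3$ the eigendecomposition and bisection are essentially free, which is why these substeps are so cheap.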