I.4.3.5: Squared Distance Scaling
In ALSCAL [Takane, Young, De Leeuw, 1977] we find multidimensional scaling solutions by minimizing the loss function sstress, defined by σ2(X)Δ=12n∑i=1n∑j=1wij(δij−d2ij(X))2. Thus known dissimilarities δij are approximated by squared Euclidean distances between points, which have coordinates in an n×p matrix X. Both dissimilarities Δ={δij} and the weights W={wij} are non-negative, symmetric, and hollow. Thus d2ij(X)=p∑s=1(xis−xjs)2.
Takane et al. discuss different block relaxation approaches to minimizing σ2. Because the loss function is a multivariate quartic the stationary equations obtained by setting the partials equal to zero are a system of np cubic equations in np unknowns. So, at least theoretically, we could use algebraic methods to solve the stationary equations and find the global minimum of sstress. This corresponds with the case in which there is only a single block of coordinates, but in the ALSCAL context other blocks are introduced by optimally transforming the dissimilarities and by incorporating weights for individual difference MDS models. At the other extreme of the block relaxation spectrum we could introduce np blocks for the np coordinates, which means we would use coordinate descent. Takane et al. ultimately decide to use a generalized block relaxation method with n blocks of the p coordinates of a single point, with a safeguarded Newton method used to minimize over a single block.
In this section we study coordinate descent in detail. If we modify coordinate (k,t) then only the squared distances d2ik and d2ki with i≠k will change. Adding θ to xkt with change d2ik to d2ik−2θ(xit−xkt)+θ2. Thus the part of sstress that depends on θ is n∑i=1wkj((δkj−d2ik)+2θ(xit−xkt)−θ2)2. Differentiating this and setting the derivative to zero gives the cubic equation n∑i=1wkj((δkj−d2ik)+2θ(xit−xkt)−θ2)((xit−xkt)−θ)=0 n∑i=1wkj(δkj−d2ik)(xit−xkt)+2θn∑i=1wkj(xit−xkt)2−θ2n∑i=1wkj(xit−xkt)−θn∑i=1wkj(δkj−d2ik)+2θ2n∑i=1wkj(xit−xkt)−θ3n∑i=1wkj