Multidimensional Scaling
Many of the examples in the book are taken from the area of multidimensional scaling (MDS). In this appendix we describe the basic MDS notation and terminology. Our approach to MDS is based on Kruskal [1964ab], using terminology and notation of De Leeuw [1977] and De Leeuw and Heiser [1982]. For a more recent and more extensive discussion of MDS see Borg and Groenen [2005].
The data in an MDS problem consist of information about the dissimilarities between pairs of objects. Dissimilarities are like distances, in the sense that they give some information about physical or psychological closeness, but they need not satisfy any of the distance axioms. In metric MDS the dissimilarity between objects and is a given number , usually positive and symmetric, with possibly some of the dissimilarities missing. In non-metric MDS we only have a partial order on some or all of the dissimilarities. We want to represent the objects as points in a metric space in such a way that the distances between the points approximate the dissimilarities between the objects.
An MDS loss function is typically of the form for some norm, or pseudo-norm, on the space of matrices. Here are the points in the metric space, with the symmetric, non-negative, and hollow matrix of distances. The MDS problem is to minimize loss over all mappings and all feasible . In the metric MDS problems is fixed at the observed data, in non-metric MDS any monotone transformation of is feasible.
The definition of MDS we have given leaves room for all kinds of metric spaces and all kinds of norms to measure loss. In almost all applications both in this book and elsewhere, we are interested in Euclidean MDS, where the metric space is , and in loss functions that use the (weighted) sum of squares of residuals . Thus the loss function has the general form where is an matrix called the configuration.
The most popular choices for the residuals are Here and are elementwise transformations of the dissimilarities, with corresponding transformations and of the distances. In we use the centering operator . For Euclidean distances, and centered , with . Metric Euclidean MDS, using with unit weights, means finding the best rank approximation to , which can be done finding the dominant eigenvalues and corresponding eigenvectors. This is also known as Classical MDS [Torgerson, 1958].
The loss function that uses is called stress [Kruskal, 1964ab], the function that uses is sstress [Takane et al, 1977], and loss that uses is strain [De Leeuw and Heiser, 1982]. has been nameless so far, but it has been proposed by Ramsay [1977]. Because of its limiting properties (see below), we will call it strull.
Both ant are obviously special cases of for which the corresponding loss function is called r-stress. Because we see that is a limiting case of .
There is some matrix notation that is useful in dealing with Euclidean MDS. Suppose and are unit vectors, with all elements equal to zero, except one element which is equal to one. Then where and . If we define and then , which allows us to work with vectors in instead of matrices in .