I.6.5.2: Optimal Scaling with ORDINALS
In LINEALS (section x.x.x) we try to find quantifications of the variables that linearize all bivariate regressions. De Leeuw [1988] suggested finding standardized quantifications that minimize the loss function
$$f(y)=\sum\sum_{j\neq\ell}\left\{y_j'C_{j\ell}D_\ell^{-1}C_{\ell j}y_j-y_j'C_{j\ell}y_\ell\,y_\ell'C_{\ell j}y_j\right\}.\tag{1}$$
A more general loss function is
$$g(y,z)=\sum\sum_{j\neq\ell}(z_{j\ell}-D_j^{-1}C_{j\ell}y_\ell)'D_j(z_{j\ell}-D_j^{-1}C_{j\ell}y_\ell),\tag{2}$$
which must be minimized over both $y$ and $z$. The $z_{j\ell}$ are $m(m-1)$ vectors, called regression targets; target $z_{j\ell}$ has $k_j$ elements.
To see that this loss function generalizes (1), suppose we constrain $z$ by requiring that $z_{j\ell}$ is proportional to $y_j$, i.e. $z_{j\ell}=r_{j\ell}y_j$. Then, using $y_j'D_jy_j=1$,
$$g(y,R)=\sum\sum_{j\neq\ell}r_{j\ell}^2-2\sum\sum_{j\neq\ell}r_{j\ell}\,y_j'C_{j\ell}y_\ell+\sum\sum_{j\neq\ell}y_\ell'C_{\ell j}D_j^{-1}C_{j\ell}y_\ell.$$
This is minimized over $R$ by $r_{j\ell}=y_j'C_{j\ell}y_\ell$, and the minimum is precisely the loss function (1). Thus $f(y)=\min_R g(y,R)$, and $g$ is an augmentation of $f$. Block relaxation for $g$ alternates minimization over $R$ for fixed $y$, which we have shown to be easy, with minimization over $y$ for fixed $R$, which is a modified eigenvalue problem of the kind discussed in BRAS3, section x.x.x. This is not necessarily simpler than the direct minimum eigenvalue problem for minimizing $f$ in section x.x.x.
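The augmentation identity $f(y)=\min_R g(y,R)$ is easy to check numerically. The following is a minimal sketch in Python/NumPy (the actual implementation, ordinals.R, is in R; all data and variable names here are hypothetical): it builds indicator matrices for random categorical data, draws standardized quantifications, and verifies that the direct loss (1) equals the augmented loss (2) evaluated at the optimal targets $z_{j\ell}=r_{j\ell}y_j$ with $r_{j\ell}=y_j'C_{j\ell}y_\ell$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, (3, 4, 5)          # n objects; category counts for m = 3 variables
m = len(k)

# indicator (dummy) matrices G_j, cross tables C_jl = G_j' G_l, marginals D_j
G = [np.eye(kj)[rng.integers(0, kj, n)] for kj in k]
C = [[G[j].T @ G[l] for l in range(m)] for j in range(m)]
D = [G[j].T @ G[j] for j in range(m)]

# standardized random quantifications: y_j' D_j y_j = 1
y = []
for j in range(m):
    yj = rng.standard_normal(k[j])
    y.append(yj / np.sqrt(yj @ D[j] @ yj))

# direct loss f(y), equation (1)
f = sum(y[j] @ C[j][l] @ np.linalg.solve(D[l], C[l][j] @ y[j])
        - (y[j] @ C[j][l] @ y[l]) ** 2
        for j in range(m) for l in range(m) if j != l)

# augmented loss g(y, R) at the optimal targets z_jl = r_jl y_j
g = 0.0
for j in range(m):
    for l in range(m):
        if j != l:
            r = y[j] @ C[j][l] @ y[l]          # optimal r_jl
            e = r * y[j] - np.linalg.solve(D[j], C[j][l] @ y[l])
            g += e @ D[j] @ e

print(np.allclose(f, g))   # the two losses agree
```

Since each term of $g$ is a quadratic form in the positive semidefinite $D_j$, the minimum over $R$ (and hence $f$) is nonnegative.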
The major advantage of augmenting $f$ is that it becomes simple to incorporate quite general restrictions on the $z_{j\ell}$. For example, they can be required to be monotone with the original data, a spline transformation, a monotone spline, or a mixture of these options. Thus we can constrain each individual regression function $D_j^{-1}C_{j\ell}y_\ell$ to have one of a predetermined number of shapes.
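For the monotone option, the constrained target is the weighted monotone projection of the unconstrained one, computable by pool-adjacent-violators. The sketch below (Python, with hypothetical example numbers, not taken from ordinals.R) minimizes $(z-t)'D_j(z-t)$ over nondecreasing $z$, where $t=D_j^{-1}C_{j\ell}y_\ell$ and $D_j$ is diagonal with the category counts.

```python
import numpy as np

def pava(t, w):
    """Weighted isotonic regression: minimize sum_i w_i (z_i - t_i)^2
    over nondecreasing z, by pooling adjacent violators."""
    vals, wts, sizes = [], [], []            # current blocks
    for ti, wi in zip(t, w):
        vals.append(ti); wts.append(wi); sizes.append(1)
        # merge blocks while monotonicity is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            wtot = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / wtot
            wts[-2] = wtot; sizes[-2] += sizes[-1]
            vals.pop(); wts.pop(); sizes.pop()
    return np.repeat(vals, sizes)

t = np.array([0.3, -0.1, 0.2, 0.5])   # hypothetical target D_j^{-1} C_jl y_l
w = np.array([10., 25., 40., 25.])    # hypothetical diagonal of D_j
z = pava(t, w)
print(z)   # nondecreasing projection of t
```

The first two entries, which violate monotonicity, are pooled into their weighted mean; the rest of the target is already monotone and is left untouched.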
In ordinals.R we implement the three standard options of the Gifi system: a vector $y_j$ is treated as nominal, ordinal, or numerical.
If it is nominal then it is unconstrained, except for the normalization.
In that case the $z_{j\ell}$ are also unconstrained for all $\ell$.
If $y_j$ is treated as ordinal it must be monotone with the data, and so must all $z_{j\ell}$. A numerical $y_j$ must be linear with the data, together with its targets $z_{j\ell}$. Of course, if all variables are numerical there is nothing to optimize, and we just compute correlations. If all variables are nominal there is nothing to optimize either, because we immediately get zero loss from any starting point.
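The zero-loss claim for all-nominal variables follows because an unconstrained target can always be set equal to the regression function itself, making every residual vanish. A minimal Python sketch (hypothetical random data, not from ordinals.R):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, (3, 4, 5)
m = len(k)
G = [np.eye(kj)[rng.integers(0, kj, n)] for kj in k]
C = [[G[j].T @ G[l] for l in range(m)] for j in range(m)]
D = [G[j].T @ G[j] for j in range(m)]

y = [rng.standard_normal(kj) for kj in k]   # any quantifications at all

# all variables nominal: every z_jl is free, so set it equal to the
# regression function D_j^{-1} C_jl y_l; every residual is then zero
g = 0.0
for j in range(m):
    for l in range(m):
        if j != l:
            t = np.linalg.solve(D[j], C[j][l] @ y[l])
            z = t                  # unconstrained optimum: the target itself
            e = z - t
            g += e @ D[j] @ e
print(g)   # zero loss from any starting point
```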