15.2 Metric Multidimensional Scaling

Metric MDS begins with a distance matrix with elements where . The objective of metric MDS is to find a configuration of points in -dimensional space from the distances between the points such that the coordinates of the points along the dimensions yield a Euclidean distance matrix whose elements are as close as possible to the elements of the given distance matrix .

15.2.1 The Classical Solution

The classical solution is based on a distance matrix that is computed from a Euclidean geometry.

DEFINITION 15.1   A distance matrix is Euclidean if for some points .

The following result tells us whether a distance matrix is Euclidean or not.

THEOREM 15.1   Define and where is the centering matrix. is Euclidean if and only if is positive semidefinite. If is the distance matrix of a data matrix , then . is called the inner product matrix.

Recovery of coordinates

The task of MDS is to find the original Euclidean coordinates from a given distance matrix. Let the coordinates of points in a dimensional Euclidean space be given by () where . Call the coordinate matrix and assume . The Euclidean distance between the -th and -th points is given by:
 (15.1)

The general term of is given by:
 (15.2)

It is possible to derive from the known squared distances , and then from the unknown coordinates.
 (15.3)

Centering of the coordinate matrix implies that . Summing (15.3) over , over , and over and , we find:
 (15.4)

Solving (15.3) and (15.4) gives:
 (15.5)

With , and
 (15.6)

we get:
 (15.7)

Define the matrix as , and observe that:
 (15.8)

The inner product matrix can be expressed as:
 (15.9)

where is the matrix of coordinates. The rank of is then
 (15.10)

As required in Theorem 15.1 the matrix is symmetric, positive semidefinite and of rank , and hence it has non-negative eigenvalues and zero eigenvalues. can now be written as:
 (15.11)

where , the diagonal matrix of the eigenvalues of , and , the matrix of corresponding eigenvectors. Hence the coordinate matrix containing the point configuration in is given by:
 (15.12)

How many dimensions?

The number of desired dimensions is small in order to provide practical interpretations, and is given by the rank of or the number of nonzero eigenvalues . If is positive semidefinite, then the number of nonzero eigenvalues gives the number of eigenvalues required for representing the distances .

The proportion of variation explained by dimensions is given by

 (15.13)

It can be used for the choice of . If is not positive semidefinite we can modify (15.13) to
 (15.14)

In practice the eigenvalues are almost always unequal to zero. To be able to represent the objects in a space with dimensions as small as possible we may modify the distance matrix to:

 (15.15)

with
 (15.16)

where is determined such that the inner product matrix becomes positive semidefinite with a small rank.

Similarities

In some situations we do not start with distances but with similarities. The standard transformation (see Chapter 11) from a similarity matrix to a distance matrix is:
 (15.17)

THEOREM 15.2   If , then the distance matrix defined by (15.17) is Euclidean with centered inner product matrix .

Relation to Factorial Analysis

Suppose that the () data matrix is centered so that equals a multiple of the covariance matrix . Suppose that the eigenvalues of are distinct and non zero. Using the duality Theorem 8.4 of factorial analysis we see that are also eigenvalues of = when is the Euclidean distance matrix between the rows of . The -dimensional solution to the metric MDS problem is thus given by the first principal components of .

Optimality properties of the classical MDS solution

Let be a data matrix with some inter-point distance matrix . The objective of MDS is thus to find , a representation of in a lower dimensional Euclidean space whose inter-point distance matrix is not far from . Let be a orthogonal matrix where is . represents a projection of on the column space of ; in other words, may be viewed as a fitted configuration of in . A measure of discrepancy between and is given by
 (15.18)

THEOREM 15.3   Among all projections of onto -dimensional subspaces of the quantity in (15.18) is minimized when is projected onto its first principal factors.

We see therefore that the metric MDS is identical to principal factor analysis as we have defined it in Chapter 8.

Summary
Metric MDS starts with a distance matrix .
The aim of metric MDS is to construct a map in Euclidean space that corresponds to the given distances.
A practical algorithm is given as: