# 15.2 Metric Multidimensional Scaling

Metric MDS begins with an $(n \times n)$ distance matrix $\mathcal{D}$ with elements $d_{ij}$, where $i, j = 1, \ldots, n$. The objective of metric MDS is to find a configuration of points in $p$-dimensional space from the distances between the points such that the coordinates of the $n$ points along the $p$ dimensions yield a Euclidean distance matrix whose elements are as close as possible to the elements of the given distance matrix $\mathcal{D}$.

## 15.2.1 The Classical Solution

The classical solution is based on a distance matrix that is computed from Euclidean geometry.

DEFINITION 15.1   A distance matrix $\mathcal{D} = (d_{ij})$ is Euclidean if for some points $x_1, \ldots, x_n \in \mathbb{R}^p$,
$$d_{ij}^{2} = (x_i - x_j)^{\top}(x_i - x_j).$$

The following result tells us whether a distance matrix is Euclidean or not.

THEOREM 15.1   Define $A = (a_{ij})$ with $a_{ij} = -\frac{1}{2} d_{ij}^{2}$, and $B = H A H$ where $H$ is the centering matrix. $\mathcal{D}$ is Euclidean if and only if $B$ is positive semidefinite. If $\mathcal{D}$ is the distance matrix of a data matrix $X$, then $B = H X X^{\top} H$. $B$ is called the inner product matrix.
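As a quick numerical check of Theorem 15.1, the following sketch (assuming numpy; the function name `is_euclidean` is ours) forms $B = HAH$ from a distance matrix and tests whether its eigenvalues are non-negative:

```python
import numpy as np

def is_euclidean(D, tol=1e-10):
    """Theorem 15.1: D is Euclidean iff B = HAH is positive semidefinite."""
    n = D.shape[0]
    A = -0.5 * D**2                       # a_ij = -d_ij^2 / 2
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix H
    B = H @ A @ H                         # inner product matrix B
    return bool(np.all(np.linalg.eigvalsh(B) >= -tol))

# Distances computed from actual points are Euclidean by construction.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
print(is_euclidean(D))  # True

# A tree metric (three leaves at distance 1 from a hub and 2 from each
# other) satisfies the triangle inequality but is not Euclidean.
D_star = np.array([[0, 1, 1, 1],
                   [1, 0, 2, 2],
                   [1, 2, 0, 2],
                   [1, 2, 2, 0]], dtype=float)
print(is_euclidean(D_star))  # False
```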

### Recovery of coordinates

The task of MDS is to find the original Euclidean coordinates from a given distance matrix. Let the coordinates of $n$ points in a $p$-dimensional Euclidean space be given by $x_i$ ($i = 1, \ldots, n$), where $x_i = (x_{i1}, \ldots, x_{ip})^{\top}$. Call $X = (x_1, \ldots, x_n)^{\top}$ the $(n \times p)$ coordinate matrix and assume $\bar{x} = 0$. The Euclidean distance between the $i$-th and $j$-th points is given by:
$$d_{ij}^{2} = \sum_{k=1}^{p} (x_{ik} - x_{jk})^{2} \qquad (15.1)$$

The general term of $B$ is given by:
$$b_{ij} = \sum_{k=1}^{p} x_{ik} x_{jk} = x_i^{\top} x_j \qquad (15.2)$$

It is possible to derive $B$ from the known squared distances $d_{ij}^{2}$, and then from $B$ the unknown coordinates. From (15.1) and (15.2),
$$d_{ij}^{2} = x_i^{\top} x_i + x_j^{\top} x_j - 2 x_i^{\top} x_j = b_{ii} + b_{jj} - 2 b_{ij} \qquad (15.3)$$

Centering of the coordinate matrix $X$ implies that $\sum_{i=1}^{n} b_{ij} = 0$. Summing (15.3) over $i$, over $j$, and over both $i$ and $j$, we find:
$$
\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n} d_{ij}^{2} &= \frac{1}{n}\sum_{i=1}^{n} b_{ii} + b_{jj}\\
\frac{1}{n}\sum_{j=1}^{n} d_{ij}^{2} &= b_{ii} + \frac{1}{n}\sum_{j=1}^{n} b_{jj}\\
\frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n} d_{ij}^{2} &= \frac{2}{n}\sum_{i=1}^{n} b_{ii}
\end{aligned}
\qquad (15.4)
$$

Solving (15.3) and (15.4) gives:
$$b_{ij} = -\frac{1}{2}\left(d_{ij}^{2} - d_{i\bullet}^{2} - d_{\bullet j}^{2} + d_{\bullet\bullet}^{2}\right) \qquad (15.5)$$
where $d_{i\bullet}^{2}$, $d_{\bullet j}^{2}$ and $d_{\bullet\bullet}^{2}$ denote the row mean, the column mean and the overall mean of the squared distances.

With $a_{ij} = -\frac{1}{2} d_{ij}^{2}$, and
$$a_{i\bullet} = \frac{1}{n}\sum_{j=1}^{n} a_{ij}, \qquad a_{\bullet j} = \frac{1}{n}\sum_{i=1}^{n} a_{ij}, \qquad a_{\bullet\bullet} = \frac{1}{n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}, \qquad (15.6)$$

we get:
$$b_{ij} = a_{ij} - a_{i\bullet} - a_{\bullet j} + a_{\bullet\bullet} \qquad (15.7)$$

Define the matrix $A$ as $A = (a_{ij})$, and observe that:
$$B = H A H \qquad (15.8)$$

The inner product matrix $B$ can be expressed as:
$$B = X X^{\top} \qquad (15.9)$$

where $X = (x_1, \ldots, x_n)^{\top}$ is the $(n \times p)$ matrix of coordinates. The rank of $B$ is then
$$\operatorname{rank}(B) = \operatorname{rank}(X X^{\top}) = \operatorname{rank}(X) = p \qquad (15.10)$$

As required in Theorem 15.1 the matrix $B$ is symmetric, positive semidefinite and of rank $p$, and hence it has $p$ non-negative eigenvalues and $n - p$ zero eigenvalues. $B$ can now be written as:
$$B = \Gamma \Lambda \Gamma^{\top} \qquad (15.11)$$

where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)$ is the diagonal matrix of the eigenvalues of $B$, and $\Gamma = (\gamma_1, \ldots, \gamma_p)$ is the matrix of corresponding eigenvectors. Hence the coordinate matrix containing the point configuration in $\mathbb{R}^p$ is given by:
$$X = \Gamma \Lambda^{1/2} \qquad (15.12)$$
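Equations (15.8)–(15.12) translate directly into a few lines of code. The sketch below (assuming numpy; `classical_mds` is a name we introduce) recovers a configuration from a Euclidean distance matrix and verifies that it reproduces the original distances, up to rotation, reflection and translation:

```python
import numpy as np

def classical_mds(D, p):
    """Recover a p-dimensional configuration from a distance matrix:
    double-center -0.5*D^2 to obtain B = HAH, then scale the top-p
    eigenvectors by the square roots of the eigenvalues (15.12)."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = H @ (-0.5 * D**2) @ H             # B = HAH, eq. (15.8)
    lam, gamma = np.linalg.eigh(B)        # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:p]       # keep the p largest
    lam, gamma = lam[idx], gamma[:, idx]
    return gamma * np.sqrt(np.maximum(lam, 0.0))  # X = Gamma Lambda^{1/2}

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
X_hat = classical_mds(D, p=2)
D_hat = np.linalg.norm(X_hat[:, None, :] - X_hat[None, :, :], axis=-1)
print(np.allclose(D, D_hat))  # True: distances are reproduced exactly
```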

### How many dimensions?

The number of desired dimensions is small in order to provide practical interpretations, and is given by the rank of $B$ or the number of nonzero eigenvalues $\lambda_i$. If $B$ is positive semidefinite, then the number of nonzero eigenvalues gives the number of dimensions required for representing the distances $d_{ij}$.

The proportion of variation explained by $p$ dimensions is given by
$$\frac{\sum_{i=1}^{p} \lambda_i}{\sum_{i=1}^{n-1} \lambda_i} \qquad (15.13)$$

It can be used for the choice of $p$. If $B$ is not positive semidefinite we can modify (15.13) to
$$\frac{\sum_{i=1}^{p} \lambda_i}{\sum (\text{positive eigenvalues})} \qquad (15.14)$$
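A small helper (assuming numpy; the name `explained_proportion` is ours) that computes this proportion, restricting the denominator to the positive eigenvalues when $B$ is not positive semidefinite:

```python
import numpy as np

def explained_proportion(eigvals, p):
    """Proportion of variation explained by p dimensions, eq. (15.13);
    only positive eigenvalues enter the denominator, which reduces to
    the modification (15.14) when B is not positive semidefinite."""
    lam = np.sort(np.asarray(eigvals))[::-1]   # descending order
    return lam[:p].sum() / lam[lam > 0].sum()

# Two of five dimensions explain (4 + 3) / (4 + 3 + 2 + 1) of the variation.
print(explained_proportion([4.0, 3.0, 2.0, 1.0, 0.0], p=2))  # 0.7
```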

In practice the eigenvalues $\lambda_i$ are almost always unequal to zero. To be able to represent the objects in a space with dimensions as small as possible we may modify the distance matrix to:
$$\mathcal{D}^{*} = (d_{ij}^{*}) \qquad (15.15)$$
with
$$d_{ij}^{*} = \begin{cases} 0 & \text{if } i = j, \\ d_{ij} + e \ge 0 & \text{if } i \ne j, \end{cases} \qquad (15.16)$$
where $e$ is determined such that the inner product matrix $B$ becomes positive semidefinite with a small rank.
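One simple way to determine such a shift $e$ numerically is to bracket and then bisect on the smallest eigenvalue of the resulting inner product matrix. This is only a sketch (assuming numpy; the function name `additive_constant` is ours), not a closed-form solution of this so-called additive constant problem:

```python
import numpy as np

def additive_constant(D, tol=1e-6):
    """Find a shift e >= 0 such that the off-diagonal distances d_ij + e
    yield a positive semidefinite inner product matrix B, cf. (15.15)-(15.16)."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    off = ~np.eye(n, dtype=bool)

    def min_eig(e):
        Ds = D.copy()
        Ds[off] += e                      # shift off-diagonal distances only
        return np.linalg.eigvalsh(H @ (-0.5 * Ds**2) @ H).min()

    lo, hi = 0.0, 1.0
    while min_eig(hi) < 0:                # grow the bracket until B is psd
        hi *= 2.0
    while hi - lo > tol:                  # bisect on the smallest feasible shift
        mid = 0.5 * (lo + hi)
        if min_eig(mid) < -1e-12:
            lo = mid
        else:
            hi = mid
    return hi

# A non-Euclidean tree metric becomes Euclidean after the shift.
D_star = np.array([[0, 1, 1, 1],
                   [1, 0, 2, 2],
                   [1, 2, 0, 2],
                   [1, 2, 2, 0]], dtype=float)
e = additive_constant(D_star)
print(e > 0)  # True
```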

### Similarities

In some situations we do not start with distances but with similarities. The standard transformation (see Chapter 11) from a similarity matrix $\mathcal{C}$ to a distance matrix $\mathcal{D}$ is:
$$d_{ij} = (c_{ii} - 2 c_{ij} + c_{jj})^{1/2} \qquad (15.17)$$

THEOREM 15.2   If $\mathcal{C} \ge 0$ (positive semidefinite), then the distance matrix $\mathcal{D}$ defined by (15.17) is Euclidean with centered inner product matrix $B = H \mathcal{C} H$.
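The transformation (15.17) is straightforward to implement. The sketch below (assuming numpy) applies it to a small correlation matrix, used here as a positive semidefinite similarity matrix:

```python
import numpy as np

def similarity_to_distance(C):
    """Eq. (15.17): d_ij = (c_ii - 2 c_ij + c_jj)^(1/2)."""
    c = np.diag(C)
    # clip tiny negatives caused by floating point before the square root
    return np.sqrt(np.maximum(c[:, None] - 2.0 * C + c[None, :], 0.0))

# A correlation matrix is a positive semidefinite similarity matrix, so
# by Theorem 15.2 the resulting distance matrix is Euclidean.
C = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.3],
              [0.1, 0.3, 1.0]])
D = similarity_to_distance(C)
print(np.round(D, 3))
```

Highly similar pairs (here the first two variables, with $c_{12} = 0.8$) end up close together, while weakly related pairs end up far apart.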

### Relation to Factorial Analysis

Suppose that the $(n \times p)$ data matrix $X$ is centered so that $X^{\top} X$ equals a multiple of the covariance matrix $nS$. Suppose that the $p$ eigenvalues $\lambda_1, \ldots, \lambda_p$ of $X^{\top} X$ are distinct and nonzero. Using the duality Theorem 8.4 of factorial analysis we see that $\lambda_1, \ldots, \lambda_p$ are also eigenvalues of $X X^{\top} = B$ when $\mathcal{D}$ is the Euclidean distance matrix between the rows of $X$. The $k$-dimensional solution to the metric MDS problem is thus given by the first $k$ principal components of $X$.
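This equivalence is easy to verify numerically. The sketch below (assuming numpy) computes the principal component scores of a small centered data matrix and the classical MDS configuration from its Euclidean distance matrix; with distinct eigenvalues the two agree up to the sign of each column:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((8, 3))
X = X - X.mean(axis=0)                    # center the data matrix

# Principal component scores: project X onto the eigenvectors of X^T X.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
scores = X @ Vt.T

# Classical MDS configuration from the Euclidean distance matrix of X.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
lam, gamma = np.linalg.eigh(H @ (-0.5 * D**2) @ H)
idx = np.argsort(lam)[::-1][:3]
X_mds = gamma[:, idx] * np.sqrt(np.maximum(lam[idx], 0.0))

# Each eigenvector is defined only up to sign, so compare magnitudes.
print(np.allclose(np.abs(X_mds), np.abs(scores)))  # True
```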

### Optimality properties of the classical MDS solution

Let $X$ be an $(n \times p)$ data matrix with some inter-point distance matrix $\mathcal{D}$. The objective of MDS is thus to find $X_1$, a representation of $X$ in a lower dimensional Euclidean space $\mathbb{R}^k$ ($k < p$) whose inter-point distance matrix $\mathcal{D}_1$ is not far from $\mathcal{D}$. Let $L = (L_1, L_2)$ be a $(p \times p)$ orthogonal matrix where $L_1$ is $(p \times k)$. $X_1 = X L_1$ represents a projection of $X$ on the column space of $L_1$; in other words, $X_1$ may be viewed as a fitted configuration of $X$ in $\mathbb{R}^k$. A measure of discrepancy between $\mathcal{D}$ and $\mathcal{D}_1 = (d_{ij}^{(1)})$ is given by
$$\phi = \sum_{i,j=1}^{n} \left(d_{ij} - d_{ij}^{(1)}\right)^{2} \qquad (15.18)$$

THEOREM 15.3   Among all projections $X L_1$ of $X$ onto $k$-dimensional subspaces of $\mathbb{R}^p$, the quantity $\phi$ in (15.18) is minimized when $X$ is projected onto its first $k$ principal factors.

We see therefore that the metric MDS is identical to principal factor analysis as we have defined it in Chapter 8.

### Summary

- Metric MDS starts with a distance matrix $\mathcal{D}$.
- The aim of metric MDS is to construct a map in Euclidean space that corresponds to the given distances.
- A practical algorithm is given as: