15.2 Metric Multidimensional Scaling

Metric MDS begins with an $(n \times n)$ distance matrix $\data{D}$ with elements $d_{ij}$, $i, j=1,\ldots,n$. The objective of metric MDS is to find a configuration of $n$ points in $p$-dimensional space whose Euclidean distance matrix has elements as close as possible to the elements of the given distance matrix $\data{D}$.


15.2.1 The Classical Solution

The classical solution is based on a distance matrix that is computed from points in a Euclidean space.

DEFINITION 15.1   An $(n \times n)$ distance matrix $\data{D} = (d_{ij})$ is Euclidean if there exist points $x_{1},\ldots, x_n \in \mathbb{R}^p$ such that $d^2_{ij} = (x_i - x_j)^{\top} (x_i - x_j)$.

The following result tells us whether a distance matrix is Euclidean or not.

THEOREM 15.1   Define $\data{A}=(a_{ij})$ with $a_{ij} = -\frac{1}{2}d_{ij}^2$ and $\data{B}=\data{H}\data{A}\data{H}$ where $\data{H}$ is the centering matrix. $\data{D}$ is Euclidean if and only if $\data{B}$ is positive semidefinite. If $\data{D}$ is the distance matrix of a data matrix $\data{X}$, then $\data{B}=\data{H}\data{X}\data{X}^{\top}\data{H}$. $\data{B}$ is called the inner product matrix.
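Theorem 15.1 can be checked numerically along the following lines; this is only a minimal sketch in Python, where the function name is_euclidean, the tolerance and the example points are illustrative choices, not part of the text.
\begin{verbatim}
import numpy as np

def is_euclidean(D, tol=1e-8):
    # Theorem 15.1: D is Euclidean iff B = HAH is positive semidefinite
    n = D.shape[0]
    A = -0.5 * D**2                       # a_ij = -1/2 d_ij^2
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix H
    B = H @ A @ H                         # inner product matrix B
    return np.all(np.linalg.eigvalsh(B) >= -tol)

# distances computed from points in R^2 are Euclidean by construction
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D = np.sqrt(((X[:, None, :] - X[None, :, :])**2).sum(axis=2))
print(is_euclidean(D))   # True
\end{verbatim}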

Recovery of coordinates

The task of MDS is to find the original Euclidean coordinates from a given distance matrix. Let the coordinates of $n$ points in a $p$-dimensional Euclidean space be given by $x_i$ ($i=1,\ldots ,n$) where $x_i=(x_{i1},\ldots,x_{ip})^{\top}$. Call ${\data{X}}=(x_1,\dots,x_n)^{\top}$ the coordinate matrix and assume $\overline{x}=0$, i.e., the columns of $\data{X}$ are centered. The Euclidean distance between the $i$-th and $j$-th points is given by:
\begin{displaymath}
d_{ij}^2 = \sum_{k = 1}^p (x_{ik}-x_{jk})^2.
\end{displaymath} (15.1)

The general $b_{ij}$ term of $\data{B}$ is given by:
\begin{displaymath}
b_{ij} = \sum_{k=1}^p x_{ik} x_{jk} =x_i^{\top} x_j.
\end{displaymath} (15.2)

It is possible to derive $\data{B}$ from the known squared distances $d_{ij}^2$, and then from $\data{B}$ the unknown coordinates:
\begin{displaymath}
d_{ij}^2 = x_i^{\top} x_i + x_j^{\top} x_j - 2x_i^{\top} x_j = b_{ii} + b_{jj} - 2b_{ij}.
\end{displaymath} (15.3)

Centering of the coordinate matrix ${\data{X}}$ implies that $\sum_{i=1}^n b_{ij}=0$. Summing (15.3) over $i$, over $j$, and over $i$ and $j$, we find:
\begin{eqnarray*}
\frac{1}{n}\sum_{i=1}^n d_{ij}^2 &=& \frac{1}{n}\sum_{i=1}^n b_{ii} + b_{jj} \\
\frac{1}{n}\sum_{j=1}^n d_{ij}^2 &=& b_{ii} + \frac{1}{n}\sum_{j=1}^n b_{jj} \\
\frac{1}{n^2}\sum_{i=1}^n \sum_{j=1}^n d_{ij}^2 &=& \frac{2}{n} \sum_{i=1}^n b_{ii}.
\end{eqnarray*} (15.4)

Solving (15.3) and (15.4) gives:
\begin{displaymath}
b_{ij} = -\frac{1}{2}(d_{ij}^2-d_{i \bullet}^2
-d_{\bullet j}^2+d_{\bullet \bullet}^2),
\end{displaymath} (15.5)
where $d_{i \bullet}^2$, $d_{\bullet j}^2$ and $d_{\bullet \bullet}^2$ denote the row mean, the column mean and the overall mean of the squared distances, respectively.

With $a_{ij}=-\frac{1}{2}d_{ij}^2$, and
\begin{eqnarray*}
a_{i \bullet} &=& \frac{1}{n}\sum_{j=1}^n a_{ij} \\
a_{\bullet j} &=& \frac{1}{n}\sum_{i=1}^n a_{ij} \\
a_{\bullet \bullet} &=& \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n a_{ij}
\end{eqnarray*} (15.6)

we get:
\begin{displaymath}
b_{ij} = a_{ij}-a_{i \bullet} - a_{\bullet j} + a_{\bullet \bullet}.
\end{displaymath} (15.7)

Define the matrix $\data{A}$ as $(a_{ij})$, and observe that:
\begin{displaymath}
\data{B}=\data{H}\data{A}\data{H}.
\end{displaymath} (15.8)

The inner product matrix $\data{B}$ can be expressed as:
\begin{displaymath}
\data{B} = \data{X}\data{X}^{\top},
\end{displaymath} (15.9)

where $ \data{X}=(x_1, \ldots ,x_n)^{\top}$ is the $(n\times p)$ matrix of coordinates. The rank of $\data{B}$ is then
\begin{displaymath}
rank(\data{B})=rank(\data{X}\data{X}^{\top})=rank(\data{X}) = p.
\end{displaymath} (15.10)

As required in Theorem 15.1 the matrix $\data{B}$ is symmetric, positive semidefinite and of rank $p$, and hence it has $p$ positive eigenvalues and $n-p$ zero eigenvalues. $\data{B}$ can now be written as:
\begin{displaymath}
\data{B}=\Gamma \Lambda \Gamma^{\top}
\end{displaymath} (15.11)

where $\Lambda = \mathop{\hbox{diag}}(\lambda_1 ,\ldots,\lambda_p)$, the diagonal matrix of the eigenvalues of $\data{B}$, and $\Gamma = (\gamma_1 ,\ldots,\gamma_p )$, the matrix of corresponding eigenvectors. Hence the coordinate matrix $\data{X}$ containing the point configuration in $\mathbb{R}^p$ is given by:
\begin{displaymath}
\data{X} = \Gamma \Lambda^{\frac{1}{2}}.
\end{displaymath} (15.12)
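The coordinate recovery (15.5)--(15.12) can be sketched in a few lines of Python. The helper name classical_mds and the use of numpy are my own choices, and any negative eigenvalues are simply clipped to zero; this is a sketch, not a definitive implementation.
\begin{verbatim}
import numpy as np

def classical_mds(D, p=2):
    # Recover a p-dimensional configuration from a distance matrix D,
    # following (15.5)-(15.12).
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = H @ (-0.5 * D**2) @ H              # inner product matrix, (15.8)
    eigval, eigvec = np.linalg.eigh(B)     # eigenvalues in ascending order
    idx = np.argsort(eigval)[::-1][:p]     # indices of the p largest eigenvalues
    Gamma = eigvec[:, idx]
    # X = Gamma Lambda^{1/2}, (15.12); negative eigenvalues clipped to zero
    return Gamma * np.sqrt(np.maximum(eigval[idx], 0.0))
\end{verbatim}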

How many dimensions?

For practical interpretation the number of dimensions should be small; it is bounded by the rank of ${\data{B}}$, i.e., the number of nonzero eigenvalues $\lambda_i$. If $\data{B}$ is positive semidefinite, the number of nonzero eigenvalues equals the number of dimensions required to represent the distances $d_{ij}$ exactly.

The proportion of variation explained by the first $p$ dimensions is given by

\begin{displaymath}
\frac{\sum_{i=1}^p \lambda_{i}}{\sum_{i=1}^{n-1} \lambda_{i}}.
\end{displaymath} (15.13)

This proportion can be used to choose $p$. If $\data{B}$ is not positive semidefinite we can modify (15.13) to
\begin{displaymath}
\frac{\sum_{i=1}^p \lambda_{i}}{\sum (\textrm{``positive eigenvalues''})}.
\end{displaymath} (15.14)
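A possible numerical counterpart of (15.13) and (15.14), assuming the eigenvalues of $\data{B}$ are available as an array (the helper name is illustrative):
\begin{verbatim}
import numpy as np

def explained_proportion(eigenvalues, p):
    # Proportion of variation explained by the first p dimensions;
    # when B has negative eigenvalues, only the positive ones enter
    # the denominator, as in (15.14).
    lam = np.sort(np.asarray(eigenvalues))[::-1]   # descending order
    return lam[:p].sum() / lam[lam > 0].sum()
\end{verbatim}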

In practice the eigenvalues $\lambda_{i}$ are almost always different from zero. To represent the objects in a space of as low a dimension as possible we may modify the distance matrix to:

\begin{displaymath}
\data{D}^* = (d^*_{ij})
\end{displaymath} (15.15)

with
\begin{displaymath}
d^*_{ij} = \left\{ \begin{array}{lr}
0 & ;i=j\\
d_{ij}+e \geq 0 &;i \neq j
\end{array} \right.
\end{displaymath} (15.16)

where $e$ is determined such that the inner product matrix $\data{B}$ becomes positive semidefinite with a small rank.
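One simple, admittedly crude way to determine such an $e$ numerically is to increase it in small steps until the smallest eigenvalue of the resulting inner product matrix is non-negative. The sketch below (function name and step size are my assumptions) is only an illustration, not the analytic additive-constant solution, and the value it returns need not be the smallest possible $e$.
\begin{verbatim}
import numpy as np

def additive_constant(D, step=0.01, tol=1e-10):
    # Increase e until D* = (d_ij + e for i != j) yields a positive
    # semidefinite inner product matrix B.
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    e = 0.0
    while True:
        D_star = D + e * (1.0 - np.eye(n))   # add e off the diagonal only
        B = H @ (-0.5 * D_star**2) @ H
        if np.linalg.eigvalsh(B).min() >= -tol:
            return e
        e += step
\end{verbatim}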

Similarities

In some situations we do not start with distances but with similarities. The standard transformation (see Chapter 11) from a similarity matrix $\data{C}$ to a distance matrix $\data{D}$ is:
\begin{displaymath}
d_{ij} = (c_{ii}-2c_{ij}+c_{jj})^\frac{1}{2}.
\end{displaymath} (15.17)

THEOREM 15.2   If $\data{C} \geq 0$, i.e., $\data{C}$ is positive semidefinite, then the distance matrix $\data{D}$ defined by (15.17) is Euclidean with centered inner product matrix $\data{B}=\data{H}\data{C}\data{H}$.
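The transformation (15.17) is easy to apply directly; a short sketch assuming the similarity matrix is given as a numpy array (note that the argument of the square root can become negative if $\data{C}$ is not positive semidefinite):
\begin{verbatim}
import numpy as np

def similarity_to_distance(C):
    # Standard transformation (15.17): d_ij = (c_ii - 2 c_ij + c_jj)^(1/2)
    c = np.diag(C)
    return np.sqrt(c[:, None] - 2.0 * C + c[None, :])
\end{verbatim}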

Relation to Factorial Analysis

Suppose that the ($n\times p$) data matrix $\data{X}$ is centered so that $\data{X}^{\top}\data{X}$ is a multiple of the covariance matrix: $\data{X}^{\top}\data{X} = n\data{S}$. Suppose that the $p$ eigenvalues $\lambda_1,\ldots,\lambda_p$ of $n\data{S}$ are distinct and nonzero. Using the duality Theorem 8.4 of factorial analysis we see that $\lambda_1,\ldots,\lambda_p$ are also eigenvalues of $\data{X}\data{X}^{\top} = \data{B}$ when $\data{D}$ is the Euclidean distance matrix between the rows of $\data{X}$. The $k$-dimensional solution to the metric MDS problem is thus given by the $k$ first principal components of $\data{X}$.
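Under these assumptions the correspondence can be checked numerically: the classical MDS configuration obtained from the Euclidean distance matrix of a centered $\data{X}$ agrees with the first principal component scores of $\data{X}$ up to the signs of the columns. A self-contained sketch (all names and the simulated data are mine):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
X = X - X.mean(axis=0)                      # center the data matrix

# Euclidean distance matrix between the rows of X
D = np.sqrt(((X[:, None, :] - X[None, :, :])**2).sum(axis=2))

# classical MDS solution with k = 2
n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = H @ (-0.5 * D**2) @ H                   # equals X X^T since X is centered
eigval, eigvec = np.linalg.eigh(B)
idx = np.argsort(eigval)[::-1][:2]
X_mds = eigvec[:, idx] * np.sqrt(eigval[idx])

# first two principal component scores of X
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_pca = X @ Vt[:2].T

# the two configurations coincide up to the signs of the columns
print(np.allclose(np.abs(X_mds), np.abs(X_pca)))   # True
\end{verbatim}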

Optimality properties of the classical MDS solution

Let $\data{X}$ be an $(n\times p)$ data matrix with some inter-point distance matrix $\data{D}$. The objective of MDS is thus to find $\data{X}_1$, a representation of $\data{X}$ in a lower dimensional Euclidean space $\mathbb{R}^{k}$ whose inter-point distance matrix $\data{D}_1$ is not far from $\data{D}$. Let $\data{L} = (\data{L}_{1}, \data{L}_{2})$ be a $(p \times p)$ orthogonal matrix where $\data{L}_1$ is $(p\times k)$. $\data{X}_1=\data{XL}_1$ represents a projection of $\data{X}$ on the column space of $\data{L}_1$; in other words, $\data{X}_1$ may be viewed as a fitted configuration of $\data{X}$ in $\mathbb{R}^k$. A measure of discrepancy between $\data{D}$ and $\data{D}_1 =(d_{ij}^{(1)})$ is given by
\begin{displaymath}
\phi = \sum_{i,j=1}^n (d_{ij}-d_{ij}^{(1)})^2.
\end{displaymath} (15.18)
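Computed directly, for instance in numpy (assuming the two distance matrices are stored as arrays D and D1 of equal shape):
\begin{verbatim}
phi = ((D - D1)**2).sum()   # discrepancy (15.18) between D and D1
\end{verbatim}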

THEOREM 15.3   Among all projections $\data{XL}_{1}$ of $\data{X}$ onto $k$-dimensional subspaces of $\mathbb{R}^p$ the quantity $\phi$ in (15.18) is minimized when $\data{X}$ is projected onto its first $k$ principal factors.

We see therefore that metric MDS is identical to principal factor analysis as we have defined it in Chapter 8.

Summary
$\ast$
Metric MDS starts with a distance matrix $\data{D}$.
$\ast$
The aim of metric MDS is to construct a map in Euclidean space that corresponds to the given distances.
$\ast$
A practical algorithm is given as follows (a worked numerical sketch is given after this summary):
  1. start with distances $d_{ij}$
  2. define $\data{A}=(a_{ij})$ with $a_{ij}=-\frac{1}{2}d_{ij}^2$
  3. put $\data{B}=\left(a_{ij}-a_{i\bullet}-a_{\bullet j}+a_{\bullet \bullet}\right)$
  4. find the eigenvalues $\lambda_1,\ldots,\lambda_p$ and the associated eigenvectors $\gamma_1 ,\ldots,\gamma_p$, where the eigenvectors are normalized so that $\gamma_i^{\top} \gamma_i = 1$
  5. choose an appropriate number of dimensions $p$ (ideally $p=2$)
  6. the coordinates of the $n$ points in the Euclidean space are given by $x_{ij} = \gamma_{ij}\lambda_j^{1/2}$ for $i=1,\dots,n$ and $j=1,\dots,p$.
$\ast$
Metric MDS is identical to principal components analysis.
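As a usage illustration of the algorithm above, a minimal numerical sketch in Python; the $(4\times 4)$ distance matrix is an invented toy example, not data from the text.
\begin{verbatim}
import numpy as np

# toy symmetric distance matrix with zero diagonal (purely illustrative)
D = np.array([[0.0, 2.0, 5.0, 6.0],
              [2.0, 0.0, 4.0, 5.0],
              [5.0, 4.0, 0.0, 3.0],
              [6.0, 5.0, 3.0, 0.0]])

n = D.shape[0]
A = -0.5 * D**2                                    # step 2
H = np.eye(n) - np.ones((n, n)) / n
B = H @ A @ H                                      # step 3 (double centering)
eigval, eigvec = np.linalg.eigh(B)                 # step 4
idx = np.argsort(eigval)[::-1][:2]                 # step 5: p = 2
X = eigvec[:, idx] * np.sqrt(np.maximum(eigval[idx], 0.0))   # step 6
print(X)   # coordinates of the four points in the plane
\end{verbatim}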