8.1 The Geometric Point of View

To introduce the main ideas, assume that the data matrix $\data{X}(n\times p)$ consists of $n$ observations (or individuals) on $p$ variables.

There are in fact two ways of looking at $\data{X}$, row by row or column by column:

(1)
Each row (observation) is a vector $x_i^{\top}=(x_{i1},\ldots,x_{ip}) \in \mathbb{R}^p$.

From this point of view our data matrix $\data{X}$ is representable as a cloud of $n$ points in $\mathbb{R}^p$ as shown in Figure 8.1.

Figure 8.1:
\includegraphics[width=1\defpicwidth]{fig351.ps}

(2)
Each column (variable) is a vector $x_{\column{j}}=(x_{1j},\ldots,x_{nj})^{\top} \in \mathbb{R}^n$.

From this point of view the data matrix $\data{X}$ is a cloud of $p$ points in $\mathbb{R}^n$ as shown in Figure 8.2.

Figure 8.2:
\includegraphics[width=1\defpicwidth]{fig352.ps}
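
As a small illustration of the two points of view, the following Python sketch reads the same matrix once row by row and once column by column. The toy matrix and the names `row_cloud` and `col_cloud` are assumptions made for this example only; they do not come from the text.

```python
import numpy as np

# Hypothetical toy data: n = 4 observations of p = 3 variables.
X = np.array([
    [1.0, 2.0, 0.5],
    [0.0, 1.5, 2.0],
    [3.0, 0.5, 1.0],
    [2.0, 2.5, 0.0],
])
n, p = X.shape

# View (1): each row x_i is one point in R^p, giving a cloud of n points.
row_cloud = [X[i, :] for i in range(n)]

# View (2): each column is one point in R^n, giving a cloud of p points.
col_cloud = [X[:, j] for j in range(p)]

print(len(row_cloud), row_cloud[0].shape)  # number of points, dimension of each
print(len(col_cloud), col_cloud[0].shape)  # number of points, dimension of each
```

Both lists hold exactly the entries of $\data{X}$; only the grouping into points differs.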

When $n$ and/or $p$ are large (larger than $2$ or $3$), we cannot produce interpretable graphs of these clouds of points. Therefore, the aim of the factorial methods to be developed here is two-fold: we shall try to simultaneously approximate the column space $C(\data{X})$ and the row space $C(\data{X}^{\top})$ with subspaces of smaller dimension. The hope is, of course, that this can be done without losing too much information about the variation and structure of the point clouds in both spaces. Ideally, this will provide insights into the structure of $\data{X}$ through graphs in $\mathbb{R}$, $\mathbb{R}^2$ or $\mathbb{R}^3$. The main focus, then, is to find the dimension-reducing factors.
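
To make the idea of approximating both spaces at once concrete, here is a minimal Python sketch. It assumes that a truncated singular value decomposition is used to choose the subspaces, which is one standard option and not necessarily the construction developed in this text. The simulated data, the target dimension `q`, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50, 5, 2  # q is the (assumed) reduced dimension

# Hypothetical data lying close to a q-dimensional subspace, plus small noise.
scores = rng.normal(size=(n, q))
loadings = rng.normal(size=(q, p))
X = scores @ loadings + 0.01 * rng.normal(size=(n, p))

# Thin SVD: X = U diag(s) Vt, singular values in decreasing order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The first q right singular vectors span a q-dimensional subspace of R^p;
# projecting the rows onto it gives n points in R^q.
row_coords = X @ Vt[:q].T

# The first q left singular vectors span a q-dimensional subspace of R^n;
# projecting the columns onto it gives p points in R^q.
col_coords = X.T @ U[:, :q]

# Rank-q approximation of X and the share of the total sum of squares it keeps.
X_q = (U[:, :q] * s[:q]) @ Vt[:q]
retained = (s[:q] ** 2).sum() / (s ** 2).sum()

print(row_coords.shape, col_coords.shape)
print(retained)
```

With `q = 2`, both `row_coords` and `col_coords` can be drawn as ordinary scatterplots, and `retained` gives a rough indication of how much of the variation in $\data{X}$ survives the reduction.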

Summary
$\ast$
Each row (individual) of $\data{X}$ is a $p$-dimensional vector. From this point of view $\data{X}$ can be considered as a cloud of $n$ points in $\mathbb{R}^p$.
$\ast$
Each column (variable) of $\data{X}$ is an $n$-dimensional vector. From this point of view $\data{X}$ can be considered as a cloud of $p$ points in $\mathbb{R}^n$.