9.3 Interpretation of the PCs

Recall that the main idea of PC transformations is to find the most informative projections that maximize variances. The most informative SLC is given by the first eigenvector. In Section 9.2 the eigenvectors were calculated for the bank data. In particular, with centered $x$'s, we had:

\begin{displaymath}\begin{array}{lrcl}
& y_1&=&-0.044 x_1+0.112 x_2+0.139 x_3+0....
...+ 0.066 x_3 - 0.563x_4 + 0.659x_5 -0.489x_6 \cr
\par\end{array}\end{displaymath}

and

\begin{displaymath}\begin{array}{lrcl}
&x_1&=& \textrm{length }\cr
&x_2&=& \text...
...textrm{top frame}\cr
&x_6&=& \textrm{diagonal. }\cr
\end{array}\end{displaymath}

Hence, the first PC is essentially the difference between the bottom frame variable and the diagonal. The second PC is best described by the difference between the top frame variable and the sum of bottom frame and diagonal variables.

The weighting of the PCs tells us in which directions, expressed in original coordinates, the best variance explanation is obtained. A measure of how well the first $q$ PCs explain variation is given by the relative proportion:

\begin{displaymath}
\psi_q = \frac{\displaystyle \sum^q_{j=1} \lambda_{j}}
{\displaystyle \sum^p_{j=1} \lambda_{j}}
= \frac{\displaystyle \sum^q_{j=1} \mathop{\mathit{Var}}(Y_{j})}
{\displaystyle \sum^p_{j=1} \mathop{\mathit{Var}}(Y_{j})}.
\end{displaymath} (9.12)


Table 9.1: Proportion of variance of PCs
\begin{tabular}{ccc}
eigenvalue & proportion of variance & cumulated proportion \\
\hline
2.985 & 0.67 & 0.67 \\
0.931 & 0.21 & 0.88 \\
0.242 & 0.05 & 0.93 \\
0.194 & 0.04 & 0.97 \\
0.085 & 0.02 & 0.99 \\
0.035 & 0.01 & 1.00 \\
\end{tabular}


Referring to the bank data example 9.2, the (cumulative) proportions of explained variance are given in Table 9.1. The first PC $(q=1)$ already explains 67% of the variation. The first three $(q=3)$ PCs explain 93% of the variation. Once again it should be noted that PCs are not scale invariant, e.g., the PCs derived from the correlation matrix give different results than the PCs derived from the covariance matrix (see Section 9.5).
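
The proportions in Table 9.1 can be recomputed directly from the eigenvalues. The following is a minimal sketch in Python with NumPy; the function name `explained_proportions` is an illustrative choice and not part of the original quantlets, and the eigenvalues are copied from Table 9.1.

```python
import numpy as np

# Eigenvalues of the bank data covariance matrix, copied from Table 9.1.
eigenvalues = np.array([2.985, 0.931, 0.242, 0.194, 0.085, 0.035])

def explained_proportions(lam):
    """Return the per-PC proportions and the cumulated proportions psi_q, q = 1..p."""
    lam = np.asarray(lam, dtype=float)
    proportion = lam / lam.sum()          # lambda_j / sum of all lambda_j
    return proportion, np.cumsum(proportion)

proportion, psi = explained_proportions(eigenvalues)
for q, (prop, cum) in enumerate(zip(proportion, psi), start=1):
    print(f"q={q}: proportion={prop:.2f}, psi_q={cum:.2f}")
```

Rounded to two decimals, the values should agree with the second and third columns of Table 9.1.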

A good graphical representation of the ability of the PCs to explain the variation in the data is given by the scree plot shown in the lower right-hand window of Figure 9.3. The scree plot can be modified by using the relative proportions on the $y$-axis, as is shown in Figure 9.5 for the bank data set.

Figure 9.5: Relative proportion of variance explained by PCs. MVApcabanki.xpl
\includegraphics[width=1\defpicwidth]{pcabanki.ps}

The covariance between the PC vector $Y$ and the original vector $X$ is calculated with the help of (9.4) as follows:

\begin{displaymath}\begin{array}{rcl}
\Cov(X,Y) &=& E(XY^{\top}) - EX\, EY^{\top} = E(XY^{\top}) \cr
&=& E(XX^{\top}\Gamma)-\mu\mu^{\top}\Gamma= \Var(X)\Gamma \cr
&=& \Sigma \Gamma \cr
&=& \Gamma \Lambda \Gamma ^{\top}\Gamma \cr
&=& \Gamma \Lambda .
\end{array}\end{displaymath} (9.13)
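
The identity $\Cov(X,Y)=\Gamma\Lambda$ has an exact empirical counterpart, with $\data{S}$, $\data{G}$ and $\data{L}$ in place of $\Sigma$, $\Gamma$ and $\Lambda$. The following sketch checks it numerically in Python with NumPy. The data are synthetic (an assumption made for illustration, not the bank data), and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 4
# Synthetic correlated data, used only to illustrate the identity.
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

S = np.cov(X, rowvar=False)              # empirical covariance matrix S
ell, G = np.linalg.eigh(S)               # eigh returns eigenvalues in ascending order
order = np.argsort(ell)[::-1]            # reorder so that ell_1 >= ... >= ell_p
ell, G = ell[order], G[:, order]

Xc = X - X.mean(axis=0)                  # centered data
Y = Xc @ G                               # PC scores; their column means are zero

# Empirical cross-covariance between X and Y (same n - 1 divisor as np.cov).
cov_XY = Xc.T @ Y / (n - 1)

print(np.allclose(cov_XY, S @ G))            # Cov(X, Y) = S G
print(np.allclose(cov_XY, G @ np.diag(ell))) # ... = G L
```

Both comparisons should hold up to floating-point error, because $\data{S}\data{G} = \data{G}\data{L}\data{G}^{\top}\data{G} = \data{G}\data{L}$.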
       

Hence, the correlation, $\rho_{X_{i}Y_{j}}$, between variable $X_i$ and the PC $Y_j$ is
\begin{displaymath}
\rho _{X_{i}Y_{j}}= \frac{\gamma _{ij}\lambda _j }
{(\sigma _{X_{i}X_{i}} \lambda _j)^{1/2}}
= \gamma _{ij}\left (\frac{ \lambda _j}{\sigma _{X_{i}X_{i}} }\right)^{1/2}.
\end{displaymath} (9.14)

Using actual data, this of course translates into
\begin{displaymath}
r_{X_{i}Y_{j}}
=g_{ij} \left (\frac{ \ell_j}{s_{X_{i}X_{i}} }\right)^{1/2}.
\end{displaymath} (9.15)

The correlations can be used to evaluate the relations between the PCs $Y_j$ where $j=1,\dots,q$, and the original variables $X_i$ where $i = 1,\ldots ,p$. Note that
\begin{displaymath}
\sum_{j=1}^p r_{X_{i}Y_{j}}^2 = \frac{\sum_{j=1}^p
\ell_{j}g_{ij}^2}{s_{X_{i}X_{i}}}=\frac{s_{X_{i}X_{i}}}{s_{X_{i}X_{i}}}
=1.
\end{displaymath} (9.16)

Indeed, $\sum_{j=1}^p \ell_{j}g_{ij}^2 = g_{i}^{\top}\data{L}g_{i}$ is the $(i,i)$-element of the matrix $\data{G}\data{L}\data{G}^{\top} = \data{S}$, so that $r_{X_i Y_j}^2$ may be seen as the proportion of variance of $X_{i}$ explained by $Y_{j}$.
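
Formulas (9.15) and (9.16) can be checked numerically. The sketch below, in Python with NumPy, computes $r_{X_iY_j}$ from the eigen-decomposition of $\data{S}$ and compares it with the correlations computed directly from the PC scores. The helper `pc_correlations` and the synthetic data are assumptions made for illustration; they are not taken from the original quantlets.

```python
import numpy as np

def pc_correlations(X):
    """Correlations r_{X_i Y_j} = g_ij * sqrt(ell_j / s_ii), as in (9.15)."""
    X = np.asarray(X, dtype=float)
    S = np.cov(X, rowvar=False)            # empirical covariance matrix S
    ell, G = np.linalg.eigh(S)             # eigh returns ascending eigenvalues
    order = np.argsort(ell)[::-1]          # reorder so that ell_1 >= ... >= ell_p
    ell, G = ell[order], G[:, order]
    # Scale column j by sqrt(ell_j) and row i by 1 / sqrt(s_ii).
    R = G * np.sqrt(ell) / np.sqrt(np.diag(S))[:, None]
    return R, ell, G

rng = np.random.default_rng(1)
n_obs, n_var = 300, 5
# Synthetic correlated data (not the bank data).
X = rng.normal(size=(n_obs, n_var)) @ rng.normal(size=(n_var, n_var))

R, ell, G = pc_correlations(X)

# Direct computation: correlate each X_i with each PC score Y_j.
Y = (X - X.mean(axis=0)) @ G
direct = np.corrcoef(X, Y, rowvar=False)[:n_var, n_var:]

print(np.allclose(R, direct))                  # (9.15) agrees with direct correlations
print(np.allclose((R**2).sum(axis=1), 1.0))    # (9.16): squared correlations sum to 1
```

Row $i$ of `R**2` then gives the split of the variance of $X_i$ across the PCs.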

In the space of the first two PCs we plot these correlations, i.e., $r_{X_{i}Y_{1}}$ versus $r_{X_{i}Y_{2}}$ for each variable $X_i$. Figure 9.6 shows this plot for the bank notes example. It indicates which of the original variables are most strongly correlated with the PCs $Y_1$ and $Y_{2}$.

Figure 9.6: The correlation of the original variables with the PCs. MVApcabanki.xpl
\includegraphics[width=1\defpicwidth]{banki2.ps}

From (9.16) it obviously follows that $r_{X_{i}Y_{1}}^2
+ r_{X_{i}Y_{2}}^2 \le 1$ so that the points are always inside the circle of radius $1$. In the bank notes example, the variables $X_{4}$, $X_{5}$ and $X_{6}$ correspond to correlations near the periphery of the circle and are thus well explained by the first two PCs. Recall that we have interpreted the first PC as being essentially the difference between $X_{4}$ and $X_{6}$. This is also reflected in Figure 9.6 since the points corresponding to these variables lie on different sides of the vertical axis. An analogous remark applies to the second PC. We had seen that the second PC is well described by the difference between $X_{5}$ and the sum of $X_{4}$ and $X_{6}$. Now we are able to see this result again from Figure 9.6 since the point corresponding to $X_{5}$ lies above the horizontal axis and the points corresponding to $X_{4}$ and $X_{6}$ lie below.


Table 9.2: Correlation between the original variables and the PCs
\begin{tabular}{llrrr}
 & & $r_{X_iY_1}$ & $r_{X_iY_2}$ & $r^2_{X_iY_1} + r^2_{X_iY_2}$ \\
\hline
$X_1$ & length & $-$0.201 & 0.028 & 0.041 \\
$X_2$ & left h. & 0.538 & 0.191 & 0.326 \\
$X_3$ & right h. & 0.597 & 0.159 & 0.381 \\
$X_4$ & lower & 0.921 & $-$0.377 & 0.991 \\
$X_5$ & upper & 0.435 & 0.794 & 0.820 \\
$X_6$ & diagonal & $-$0.870 & $-$0.410 & 0.926 \\
\end{tabular}


The correlations of the original variables $X_i$ and the first two PCs are given in Table 9.2 along with the cumulated percentage of variance of each variable explained by $Y_1$ and $Y_2$. This table confirms the above results. In particular, it confirms that the percentage of variance of $X_1$ (and $X_2,$ $X_3$) explained by the first two PCs is relatively small and so are their weights in the graphical representation of the individual bank notes in the space of the first two PCs (as can be seen in the upper left plot in Figure 9.3). Looking simultaneously at Figure 9.6 and the upper left plot of Figure 9.3 shows that the genuine bank notes are roughly characterized by large values of $X_6$ and smaller values of $X_4$. The counterfeit bank notes show larger values of $X_5$ (see Example 7.15).
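
The last column of Table 9.2 can be recomputed from the first two. The sketch below does so in Python with NumPy, using the rounded correlations from the table, so small differences in the third decimal are to be expected. The cut-off of 0.8 for calling a variable "well explained" is an arbitrary choice made for this illustration only.

```python
import numpy as np

# Correlations with the first two PCs, copied from Table 9.2.
names = ["X1 length", "X2 left h.", "X3 right h.", "X4 lower", "X5 upper", "X6 diagonal"]
r1 = np.array([-0.201, 0.538, 0.597, 0.921, 0.435, -0.870])
r2 = np.array([0.028, 0.191, 0.159, -0.377, 0.794, -0.410])

# Proportion of the variance of X_i explained by Y_1 and Y_2 together.
explained = r1**2 + r2**2

# Hypothetical cut-off: variables close to the periphery of the unit circle.
threshold = 0.8
well_explained = [name for name, e in zip(names, explained) if e > threshold]

for name, e in zip(names, explained):
    print(f"{name:12s} r1^2 + r2^2 = {e:.3f}")
print("well explained by the first two PCs:", well_explained)
```

With this cut-off the selected variables are $X_4$, $X_5$ and $X_6$, in line with their position near the periphery of the circle in Figure 9.6.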

Summary
$\ast$
The weighting of the PCs tells us in which directions, expressed in original coordinates, the best explanation of the variance is obtained. Note that the PCs are not scale invariant.
$\ast$
A measure of how well the first $q$ PCs explain variation is given by the relative proportion $\psi_q = \sum_{j=1}^q \lambda_{j} / \sum_{j=1}^p
\lambda_{j}$. A good graphical representation of the ability of the PCs to explain the variation in the data is the scree plot of these proportions.
$\ast$
The correlation between PC $Y_{j}$ and an original variable $X_{i}$ is $\rho _{X_{i}Y_{j}} =\gamma _{ij}\left (\frac{ \lambda _j}
{\sigma _{X_{i}X_{i}} }\right)^{1/2}$. For a data matrix this translates into $r_{X_{i}Y_{j}}^2=\frac{\ell _j g_{ij}^2 }{s_{X_{i}X_{i}} }$. $r_{X_{i}Y_{j}}^2$ can be interpreted as the proportion of variance of $X_{i}$ explained by $Y_{j}$. A plot of $r_{X_{i}Y_{1}}$ vs. $r_{X_{i}Y_{2}}$ shows which of the original variables are most strongly correlated with the PCs, namely those that are close to the periphery of the circle of radius 1.