The associations between two sets of variables may be identified and quantified
by canonical correlation analysis. The technique was
originally developed by Hotelling (1935), who analyzed how
arithmetic speed and arithmetic power are related to reading
speed and reading power. Other examples are the relation between
governmental policy variables and economic performance variables and
the relation between job and company characteristics.
Suppose we are given two random variables $X \in \mathbb{R}^q$ and $Y \in \mathbb{R}^p$.
The idea is to find an index describing a (possible) link between $X$ and $Y$.
Canonical correlation analysis (CCA) is based on linear indices, i.e., linear
combinations $a^{\top}X$ and $b^{\top}Y$ of the random variables.
Canonical correlation analysis searches for vectors $a$ and $b$ such that the relation of the
two indices $a^{\top}X$ and $b^{\top}Y$ is quantified in some interpretable way.
More precisely, one is looking for the ``most interesting''
projections $a$ and $b$ in the sense that they maximize
the correlation
$$\rho(a,b) = \rho_{a^{\top}X\; b^{\top}Y} \tag{14.1}$$
between the two indices.
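Let us consider the correlation $\rho(a,b)$
between the two projections
in more detail. Suppose that
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim \left( \begin{pmatrix} \mu \\ \nu \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right),$$
where the sub-matrices of this covariance structure are given by
$$\begin{aligned}
\operatorname{Var}(X) &= \Sigma_{XX} \quad (q \times q), \\
\operatorname{Var}(Y) &= \Sigma_{YY} \quad (p \times p), \\
\operatorname{Cov}(X,Y) &= \operatorname{E}(X-\mu)(Y-\nu)^{\top} = \Sigma_{XY} = \Sigma_{YX}^{\top} \quad (q \times p).
\end{aligned}$$
Using (3.7) and (4.26),
$$\rho(a,b) = \frac{a^{\top}\Sigma_{XY}b}{(a^{\top}\Sigma_{XX}a)^{1/2}\,(b^{\top}\Sigma_{YY}b)^{1/2}}. \tag{14.2}$$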
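Therefore, $\rho(ca,b) = \rho(a,b)$
for any $c \in \mathbb{R}^{+}$.
Given the invariance of scale we may rescale projections $a$ and $b$
and thus we can equally solve
$$\max_{a,b}\; a^{\top}\Sigma_{XY}b$$
under the constraints
$$a^{\top}\Sigma_{XX}a = 1, \qquad b^{\top}\Sigma_{YY}b = 1.$$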
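The scale invariance can be checked numerically. The following is a minimal sketch in Python with NumPy and is not part of the original text: the joint covariance matrix, the dimensions `q = 3` and `p = 2`, the vectors `a` and `b`, and the helper name `rho` are illustrative assumptions. It evaluates the correlation of the two indices from the covariance blocks as in (14.2), rescales both projections by positive constants, and then normalizes them so that the two variance constraints hold.

```python
import numpy as np

def rho(a, b, Sxx, Sxy, Syy):
    """Correlation of the indices a'X and b'Y from covariance blocks, as in (14.2)."""
    return (a @ Sxy @ b) / np.sqrt((a @ Sxx @ a) * (b @ Syy @ b))

# Hypothetical joint covariance of (X, Y) with X in R^3 and Y in R^2.
rng = np.random.default_rng(0)
q, p = 3, 2
M = rng.standard_normal((q + p, q + p))
Sigma = M @ M.T + np.eye(q + p)      # symmetric positive definite by construction
Sxx, Sxy, Syy = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]

a = rng.standard_normal(q)
b = rng.standard_normal(p)

r = rho(a, b, Sxx, Sxy, Syy)
r_scaled = rho(7.0 * a, 0.1 * b, Sxx, Sxy, Syy)   # positive rescaling of a and b

# Normalize so that a' Sxx a = 1 and b' Syy b = 1; the numerator alone is then rho.
a1 = a / np.sqrt(a @ Sxx @ a)
b1 = b / np.sqrt(b @ Syy @ b)
r_constrained = a1 @ Sxy @ b1

print(r, r_scaled, r_constrained)    # three numerically equal values
```

Because the three values agree, maximizing the numerator under the two constraints is the same problem as maximizing the correlation itself.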
For this problem, define
$$K = \Sigma_{XX}^{-1/2}\,\Sigma_{XY}\,\Sigma_{YY}^{-1/2}. \tag{14.3}$$
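Recall the singular value decomposition of $K$ $(q \times p)$
from Theorem 2.2.
The matrix $K$ may be decomposed as
$$K = \Gamma \Lambda \Delta^{\top}$$
with
$$\Gamma = (\gamma_1, \ldots, \gamma_k), \qquad \Delta = (\delta_1, \ldots, \delta_k), \qquad \Lambda = \operatorname{diag}(\lambda_1^{1/2}, \ldots, \lambda_k^{1/2}), \tag{14.4}$$
where by (14.3) and (2.15),
$$k = \operatorname{rank}(K) = \operatorname{rank}(\Sigma_{XY}) = \operatorname{rank}(\Sigma_{YX}),$$
and $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_k$ are the
nonzero eigenvalues of $N_1 = KK^{\top}$ and $N_2 = K^{\top}K$,
and $\gamma_i$ and $\delta_j$ are the standardized
eigenvectors of $N_1$ and $N_2$, respectively.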
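Define now for $i = 1, \ldots, k$ the vectors
$$a_i = \Sigma_{XX}^{-1/2}\gamma_i, \tag{14.5}$$
$$b_i = \Sigma_{YY}^{-1/2}\delta_i, \tag{14.6}$$
which are called the canonical correlation vectors.
Using these canonical correlation vectors we define the
canonical correlation variables
$$\eta_i = a_i^{\top}X, \tag{14.7}$$
$$\varphi_i = b_i^{\top}Y. \tag{14.8}$$
The quantities $\rho_i = \lambda_i^{1/2}$
for $i = 1, \ldots, k$
are called the canonical correlation coefficients.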
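From the properties of the singular value decomposition given in
(14.4) we have
$$\operatorname{Cov}(\eta_i, \eta_j) = a_i^{\top}\Sigma_{XX}a_j = \gamma_i^{\top}\gamma_j = \begin{cases} 1 & i = j, \\ 0 & i \ne j. \end{cases} \tag{14.9}$$
The same is true for $\operatorname{Cov}(\varphi_i, \varphi_j)$.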
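The construction in (14.3) to (14.9) translates almost line by line into code. The sketch below is an illustration under stated assumptions, not the authors' implementation: it works with a known (population) covariance matrix rather than data, the function names `inv_sqrt` and `cca_from_cov` are hypothetical, and the numerical-rank tolerance `1e-12` is an arbitrary choice. It builds the matrix of (14.3), takes its singular value decomposition with `numpy.linalg.svd`, and maps the singular vectors back to canonical correlation vectors.

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def cca_from_cov(Sxx, Sxy, Syy, tol=1e-12):
    """Return (A, B, rho): columns of A and B are the canonical correlation
    vectors a_i and b_i, rho holds the canonical correlation coefficients."""
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    K = Sxx_is @ Sxy @ Syy_is                          # matrix K of (14.3)
    G, s, Dt = np.linalg.svd(K, full_matrices=False)   # K = G diag(s) Dt
    k = int(np.sum(s > tol))                           # numerical rank of K
    A = Sxx_is @ G[:, :k]                              # a_i = Sxx^{-1/2} gamma_i
    B = Syy_is @ Dt.T[:, :k]                           # b_i = Syy^{-1/2} delta_i
    return A, B, s[:k]

# Hypothetical joint covariance of (X, Y) with X in R^3 and Y in R^2.
rng = np.random.default_rng(0)
q, p = 3, 2
M = rng.standard_normal((q + p, q + p))
Sigma = M @ M.T + np.eye(q + p)
Sxx, Sxy, Syy = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]

A, B, rho = cca_from_cov(Sxx, Sxy, Syy)
gram_eta = A.T @ Sxx @ A    # Cov(eta_i, eta_j), should be the identity by (14.9)
gram_phi = B.T @ Syy @ B    # Cov(phi_i, phi_j), likewise
print(rho)
```

The singular values returned by NumPy are sorted in decreasing order, which matches the ordering of the eigenvalues assumed in (14.4).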
The following theorem tells us that the canonical correlation vectors are the solution to the
maximization problem of (14.1).
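THEOREM 14.1
For any given $r$, $1 \le r \le k$, the maximum
$$C(r) = \max_{a,b}\; a^{\top}\Sigma_{XY}b \tag{14.10}$$
subject to
$$a^{\top}\Sigma_{XX}a = 1, \qquad b^{\top}\Sigma_{YY}b = 1$$
and
$$a_i^{\top}\Sigma_{XX}a = 0 \quad \text{for } i = 1, \ldots, r-1$$
is given by
$$C(r) = \rho_r = \lambda_r^{1/2}$$
and is attained when $a = a_r$ and $b = b_r$.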
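PROOF:
The proof is given in three steps.
(i) Fix $a$ and maximize over $b$, i.e., solve:
$$\max_{b}\, \left(a^{\top}\Sigma_{XY}b\right)^2 = \max_{b}\, \left(b^{\top}\Sigma_{YX}a\right)\left(a^{\top}\Sigma_{XY}b\right)$$
subject to $b^{\top}\Sigma_{YY}b = 1$.
By Theorem 2.5 the maximum is given by the largest
eigenvalue of the matrix
$$\Sigma_{YY}^{-1}\Sigma_{YX}\,a a^{\top}\Sigma_{XY}.$$
By Corollary 2.2, the only nonzero eigenvalue equals
$$a^{\top}\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}a. \tag{14.11}$$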
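(ii) Maximize (14.11) over $a$ subject to the
constraints of the Theorem. Put $\gamma = \Sigma_{XX}^{1/2}a$
and observe that (14.11) equals
$$\gamma^{\top}\Sigma_{XX}^{-1/2}\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX}\Sigma_{XX}^{-1/2}\gamma = \gamma^{\top}KK^{\top}\gamma.$$
Thus, solve the equivalent problem
$$\max_{\gamma}\; \gamma^{\top}N_1\gamma \tag{14.12}$$
subject to $\gamma^{\top}\gamma = 1$, $\gamma_i^{\top}\gamma = 0$
for $i = 1, \ldots, r-1$.
Note that the $\gamma_i$'s are the eigenvectors of $N_1$
corresponding to its first $r-1$ largest eigenvalues. Thus, as
in Theorem 9.3, the maximum in (14.12) is
obtained by setting $\gamma$ equal to the eigenvector corresponding
to the $r$-th largest eigenvalue, i.e., $\gamma = \gamma_r$
or equivalently $a = a_r$. This yields
$$C^2(r) = \gamma_r^{\top}N_1\gamma_r = \lambda_r\gamma_r^{\top}\gamma_r = \lambda_r.$$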
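(iii)
Show that the maximum is attained for $a = a_r$
and $b = b_r$. From the
SVD of $K$ we conclude that $K\delta_r = \rho_r\gamma_r$
and hence
$$a_r^{\top}\Sigma_{XY}b_r = \gamma_r^{\top}K\delta_r = \rho_r\gamma_r^{\top}\gamma_r = \rho_r.$$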
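The statement of Theorem 14.1 can be probed numerically. The sketch below is only a sanity check under assumptions, not a proof and not part of the original text: the covariance matrix is randomly generated, the number of random trial directions (2000) is arbitrary, and the names `inv_sqrt` and `rho` are hypothetical helpers. It confirms that the canonical correlation vectors attain the singular values of the matrix in (14.3), that the second vector satisfies the orthogonality constraint, and that randomly drawn projections do not exceed the first canonical correlation.

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

# Hypothetical joint covariance of (X, Y) with X in R^3 and Y in R^2.
rng = np.random.default_rng(1)
q, p = 3, 2
M = rng.standard_normal((q + p, q + p))
Sigma = M @ M.T + np.eye(q + p)
Sxx, Sxy, Syy = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]

Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
G, s, Dt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
A, B = Sxx_is @ G, Syy_is @ Dt.T      # columns a_r and b_r

def rho(a, b):
    """Correlation of a'X and b'Y from the covariance blocks."""
    return (a @ Sxy @ b) / np.sqrt((a @ Sxx @ a) * (b @ Syy @ b))

# Value of a_r' Sxy b_r for each r; the theorem says this equals rho_r.
attained = np.array([A[:, r] @ Sxy @ B[:, r] for r in range(len(s))])

# Constraint for r = 2: a_2 must be Sxx-orthogonal to a_1.
orthogonality = A[:, 1] @ Sxx @ A[:, 0]

# Random projections should never beat the first canonical correlation.
best_random = max(
    abs(rho(rng.standard_normal(q), rng.standard_normal(p))) for _ in range(2000)
)
print(attained, s, best_random)
```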
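Let
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim \left( \begin{pmatrix} \mu \\ \nu \end{pmatrix}, \begin{pmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{pmatrix} \right).$$
The canonical correlation vectors
$$a_1 = \Sigma_{XX}^{-1/2}\gamma_1, \qquad b_1 = \Sigma_{YY}^{-1/2}\delta_1$$
maximize the correlation between the canonical
variables
$$\eta_1 = a_1^{\top}X, \qquad \varphi_1 = b_1^{\top}Y.$$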
The covariance of the canonical variables $\eta$ and $\varphi$ is given
in the next theorem.
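THEOREM 14.2
Let $\eta_i$ and $\varphi_i$ be the $i$-th canonical correlation variables
($i = 1, \ldots, k$). Define $\eta = (\eta_1, \ldots, \eta_k)^{\top}$
and $\varphi = (\varphi_1, \ldots, \varphi_k)^{\top}$. Then
$$\operatorname{Var}\begin{pmatrix} \eta \\ \varphi \end{pmatrix} = \begin{pmatrix} \mathcal{I}_k & \Lambda \\ \Lambda & \mathcal{I}_k \end{pmatrix}$$
with $\Lambda$ given in (14.4).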
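The block structure in Theorem 14.2 can be reproduced directly from a population covariance matrix. The following sketch is illustrative only: the covariance matrix is randomly generated, the helper `inv_sqrt` and the stacking matrix `T` are names introduced here, and the example assumes full rank so that the number of canonical pairs equals `min(q, p)`. It stacks the canonical correlation vectors into one linear map and compares the resulting covariance of the canonical variables with the matrix stated in the theorem.

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

# Hypothetical joint covariance of (X, Y) with X in R^3 and Y in R^2.
rng = np.random.default_rng(2)
q, p = 3, 2
M = rng.standard_normal((q + p, q + p))
Sigma = M @ M.T + np.eye(q + p)
Sxx, Sxy, Syy = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]

Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
G, s, Dt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
A, B = Sxx_is @ G, Syy_is @ Dt.T      # canonical correlation vectors as columns
k = len(s)

# (eta, phi) = T (X, Y) with T block diagonal, so Var(eta, phi) = T Sigma T'.
T = np.block([[A.T, np.zeros((k, p))],
              [np.zeros((k, q)), B.T]])
var_eta_phi = T @ Sigma @ T.T

# Matrix claimed by the theorem: identity blocks on the diagonal, Lambda off it.
expected = np.block([[np.eye(k), np.diag(s)],
                     [np.diag(s), np.eye(k)]])
print(np.round(var_eta_phi, 6))
```

Since the canonical variables have unit variance, the off-diagonal entries are correlations as well as covariances.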
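This theorem shows that the canonical correlation coefficients,
$\rho_i = \lambda_i^{1/2}$, are the covariances
between the canonical variables $\eta_i$ and $\varphi_i$ and that
the indices $\eta_1 = a_1^{\top}X$ and $\varphi_1 = b_1^{\top}Y$ have the
maximum covariance $\sqrt{\lambda_1} = \rho_1$.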
The following theorem shows that canonical correlations are
invariant w.r.t. linear transformations of the original variables.
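THEOREM 14.3
Let $X^{*} = \mathcal{U}^{\top}X + u$ and $Y^{*} = \mathcal{V}^{\top}Y + v$,
where $\mathcal{U}$ and $\mathcal{V}$ are nonsingular matrices.
Then the canonical correlations between $X^{*}$ and $Y^{*}$ are the same as
those between $X$ and $Y$.
The canonical correlation vectors of $X^{*}$ and $Y^{*}$ are given by
$$a_i^{*} = \mathcal{U}^{-1}a_i, \qquad b_i^{*} = \mathcal{V}^{-1}b_i.$$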
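The invariance stated in Theorem 14.3 is easy to check on an example. The sketch below is an illustration under assumptions and is not taken from the original text: the covariance matrix and the nonsingular matrices `U` and `V` are hypothetical values chosen here, and `inv_sqrt` and `cca` are helper names introduced for the example. The shift vectors do not appear because they do not affect covariances. Since singular vectors are only determined up to sign, the code verifies that the transformed vectors satisfy the defining properties in the transformed problem rather than comparing them entry by entry with a second decomposition.

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def cca(Sxx, Sxy, Syy):
    """Canonical correlation vectors (as columns) and coefficients."""
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    G, s, Dt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    return Sxx_is @ G, Syy_is @ Dt.T, s

# Hypothetical joint covariance of (X, Y) with X in R^3 and Y in R^2.
rng = np.random.default_rng(3)
q, p = 3, 2
M = rng.standard_normal((q + p, q + p))
Sigma = M @ M.T + np.eye(q + p)
Sxx, Sxy, Syy = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]

# Hypothetical nonsingular transformations (determinants 7 and -7).
U = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 3.0]])
V = np.array([[1.0, 2.0], [3.0, -1.0]])

# Covariance blocks of X* = U'X + u and Y* = V'Y + v.
Sxx_s, Sxy_s, Syy_s = U.T @ Sxx @ U, U.T @ Sxy @ V, V.T @ Syy @ V

A, B, s = cca(Sxx, Sxy, Syy)
_, _, s_star = cca(Sxx_s, Sxy_s, Syy_s)

A_star = np.linalg.solve(U, A)   # columns U^{-1} a_i
B_star = np.linalg.solve(V, B)   # columns V^{-1} b_i
print(s, s_star)
```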
Summary

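- Canonical correlation analysis aims to identify possible links
between two (sub-)sets of variables $X \in \mathbb{R}^q$ and $Y \in \mathbb{R}^p$.
The idea is to find indices $a^{\top}X$ and $b^{\top}Y$ such that the
correlation $\rho(a,b) = \rho_{a^{\top}X\; b^{\top}Y}$ is maximal.

- The maximum correlation (under constraints) is attained by setting
$a_i = \Sigma_{XX}^{-1/2}\gamma_i$ and $b_i = \Sigma_{YY}^{-1/2}\delta_i$, where
$\gamma_i$ and $\delta_i$ denote the eigenvectors of $KK^{\top}$ and $K^{\top}K$,
respectively, with $K = \Sigma_{XX}^{-1/2}\Sigma_{XY}\Sigma_{YY}^{-1/2}$.

- The vectors $a_i$ and $b_i$ are called canonical correlation vectors.

- The indices $\eta_i = a_i^{\top}X$ and $\varphi_i = b_i^{\top}Y$ are called
canonical correlation variables.

- The values $\rho_1 = \sqrt{\lambda_1}, \ldots, \rho_k = \sqrt{\lambda_k}$,
which are the square
roots of the nonzero eigenvalues of $KK^{\top}$ and $K^{\top}K$, are called the canonical correlation
coefficients. The covariance between the canonical correlation variables is
$\operatorname{Cov}(\eta_i, \varphi_i) = \sqrt{\lambda_i}$, $i = 1, \ldots, k$.

- The first canonical variables, $\eta_1 = a_1^{\top}X$ and $\varphi_1 = b_1^{\top}Y$,
have the maximum covariance $\sqrt{\lambda_1}$.

- Canonical correlations are invariant w.r.t. linear transformations of
the original variables $X$ and $Y$.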