In practice, PCs are computed from sample data.
The following theorem yields results on the asymptotic distribution
of the sample PCs.
THEOREM 9.4
Let $\Sigma > 0$ with distinct eigenvalues, and let $\mathcal{U} \sim m^{-1} W_p(\Sigma, m)$ with spectral decompositions $\Sigma = \Gamma \Lambda \Gamma^{\top}$ and $\mathcal{U} = \mathcal{G} \mathcal{L} \mathcal{G}^{\top}$. Then, as $m \to \infty$,
$$\sqrt{m}\,(\ell - \lambda) \xrightarrow{\mathcal{L}} N_p(0, 2\Lambda^2),$$
where $\ell = (\ell_1, \dots, \ell_p)^{\top}$ and $\lambda = (\lambda_1, \dots, \lambda_p)^{\top}$ are the diagonals of $\mathcal{L}$ and $\Lambda$. Moreover,
$$\sqrt{m}\,(g_j - \gamma_j) \xrightarrow{\mathcal{L}} N_p(0, \mathcal{V}_j), \qquad \mathcal{V}_j = \lambda_j \sum_{k \ne j} \frac{\lambda_k}{(\lambda_k - \lambda_j)^2}\, \gamma_k \gamma_k^{\top},$$
and the elements of $\ell$ are asymptotically independent of the elements of $\mathcal{G}$.
EXAMPLE 9.4
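Since $n\mathcal{S} \sim W_p(\Sigma, n-1)$ if $X_1, \dots, X_n$ are drawn from $N(\mu, \Sigma)$, we have that
$$\sqrt{n-1}\,(\ell_j - \lambda_j) \xrightarrow{\mathcal{L}} N(0, 2\lambda_j^2), \qquad j = 1, \dots, p. \tag{9.17}$$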
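Since the variance in (9.17) depends on the unknown true eigenvalue $\lambda_j$, a log transformation is useful. Consider $f(\ell_j) = \log(\ell_j)$. Then $\frac{d}{d\ell_j} f \big|_{\ell_j = \lambda_j} = \frac{1}{\lambda_j}$, and by the Transformation Theorem 4.11 we have from (9.17) that
$$\sqrt{n-1}\,(\log \ell_j - \log \lambda_j) \xrightarrow{\mathcal{L}} N(0, 2). \tag{9.18}$$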
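Hence,
$$\sqrt{\frac{n-1}{2}}\,(\log \ell_j - \log \lambda_j) \xrightarrow{\mathcal{L}} N(0, 1),$$
and a two-sided confidence interval at the $1 - \alpha = 0.95$ confidence level is given by
$$\log(\ell_j) - 1.96 \sqrt{\frac{2}{n-1}} \;\le\; \log \lambda_j \;\le\; \log(\ell_j) + 1.96 \sqrt{\frac{2}{n-1}}.$$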
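In the bank data example we have that $\ell_1 = 2.98$ and $n = 200$. Therefore,
$$\log(2.98) \pm 1.96 \sqrt{\frac{2}{199}} = \log(2.98) \pm 0.1965.$$
Exponentiating the endpoints, it can be concluded for the true eigenvalue that
$$P\{\lambda_1 \in (2.448,\, 3.627)\} \approx 0.95.$$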
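The interval for the eigenvalue can be computed directly. The following is a minimal Python sketch, assuming the interval is formed on the log scale as $\log \ell_j \pm z_{1-\alpha/2} \sqrt{2/(n-1)}$ and then exponentiated. The function name `eigenvalue_log_ci` is hypothetical, and the inputs (largest sample eigenvalue 2.98, sample size 200 for the bank data) are values assumed here for illustration.

```python
import math
from statistics import NormalDist


def eigenvalue_log_ci(ell_j, n, level=0.95):
    """Approximate confidence interval for a true eigenvalue lambda_j.

    Relies on sqrt((n - 1) / 2) * (log ell_j - log lambda_j) -> N(0, 1),
    which assumes normally distributed data and distinct eigenvalues.
    """
    if ell_j <= 0 or n < 2:
        raise ValueError("need ell_j > 0 and n >= 2")
    # Two-sided normal quantile, about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Half-width on the log scale; it does not depend on ell_j.
    half_width = z * math.sqrt(2.0 / (n - 1))
    log_ell = math.log(ell_j)
    # Exponentiate the endpoints to return to the eigenvalue scale.
    return math.exp(log_ell - half_width), math.exp(log_ell + half_width)


# Assumed inputs for the bank data: ell_1 = 2.98, n = 200.
lower, upper = eigenvalue_log_ci(2.98, 200)
print(f"approximate 95% interval for lambda_1: ({lower:.3f}, {upper:.3f})")
```

Because the half-width on the log scale is free of $\ell_j$, the resulting interval is asymmetric around $\ell_j$ on the original scale, which reflects the variance stabilisation achieved by the log transformation.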
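The proportion of variance explained by the first $q$ PCs is given by
$$\psi = \frac{\lambda_1 + \dots + \lambda_q}{\sum_{j=1}^{p} \lambda_j}.$$
In practice this is estimated by
$$\hat\psi = \frac{\ell_1 + \dots + \ell_q}{\sum_{j=1}^{p} \ell_j}.$$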
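From Theorem 9.4 we know the distribution of $\sqrt{n-1}\,(\ell - \lambda)$.
Since $\psi$ is a nonlinear function of $\lambda$, we can again
apply the Transformation Theorem 4.11 to obtain that
$$\sqrt{n-1}\,(\hat\psi - \psi) \xrightarrow{\mathcal{L}} N(0, \mathcal{D}^{\top} \mathcal{V} \mathcal{D}),$$
where $\mathcal{V} = 2\Lambda^2$ (from Theorem 9.4) and $\mathcal{D} = (d_1, \dots, d_p)^{\top}$ with
$$d_j = \frac{\partial \psi}{\partial \lambda_j} =
\begin{cases}
\dfrac{1 - \psi}{\operatorname{tr}(\Sigma)} & \text{for } 1 \le j \le q, \\[2ex]
\dfrac{-\psi}{\operatorname{tr}(\Sigma)} & \text{for } q + 1 \le j \le p.
\end{cases}$$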
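Given this result, the following theorem can be derived.
THEOREM 9.5
$$\sqrt{n-1}\,(\hat\psi - \psi) \xrightarrow{\mathcal{L}} N(0, \omega^2),$$
where
$$\omega^2 = \mathcal{D}^{\top} \mathcal{V} \mathcal{D}
= \frac{2}{\{\operatorname{tr}(\Sigma)\}^2} \left\{ (1 - \psi)^2 (\lambda_1^2 + \dots + \lambda_q^2) + \psi^2 (\lambda_{q+1}^2 + \dots + \lambda_p^2) \right\}
= \frac{2 \operatorname{tr}(\Sigma^2)}{\{\operatorname{tr}(\Sigma)\}^2} \left( \psi^2 - 2\beta\psi + \beta \right)$$
and
$$\beta = \frac{\lambda_1^2 + \dots + \lambda_q^2}{\lambda_1^2 + \dots + \lambda_p^2}.$$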
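The delta-method variance can be checked numerically. The sketch below assumes the gradient has entries $(1-\psi)/\operatorname{tr}(\Sigma)$ for the first $q$ eigenvalues and $-\psi/\operatorname{tr}(\Sigma)$ for the rest, and that $\mathcal{V} = 2\Lambda^2$. It compares the analytic gradient with central finite differences, and the quadratic form $\mathcal{D}^{\top}\mathcal{V}\mathcal{D}$ with the closed form $\frac{2\operatorname{tr}(\Sigma^2)}{\{\operatorname{tr}(\Sigma)\}^2}(\psi^2 - 2\beta\psi + \beta)$. All function names and the eigenvalues used are hypothetical.

```python
def psi(lam, q):
    """Proportion of total variance carried by the first q eigenvalues."""
    return sum(lam[:q]) / sum(lam)


def gradient(lam, q):
    """Analytic gradient D: (1 - psi)/tr for j <= q, -psi/tr otherwise."""
    total = sum(lam)
    p = psi(lam, q)
    return [(1 - p) / total if j < q else -p / total for j in range(len(lam))]


def numeric_gradient(lam, q, h=1e-6):
    """Central finite differences, used only to cross-check the formula."""
    grad = []
    for j in range(len(lam)):
        up = list(lam)
        down = list(lam)
        up[j] += h
        down[j] -= h
        grad.append((psi(up, q) - psi(down, q)) / (2 * h))
    return grad


def omega2_quadratic_form(lam, q):
    """D' V D with V = 2 * diag(lam)^2, written as a plain sum."""
    return sum(2 * (d * l) ** 2 for d, l in zip(gradient(lam, q), lam))


def omega2_closed_form(lam, q):
    """2 tr(Sigma^2) / tr(Sigma)^2 * (psi^2 - 2 beta psi + beta)."""
    tr1 = sum(lam)
    tr2 = sum(l * l for l in lam)
    beta = sum(l * l for l in lam[:q]) / tr2
    p = psi(lam, q)
    return 2 * tr2 / tr1 ** 2 * (p * p - 2 * beta * p + beta)


lam = [4.0, 2.0, 1.0, 0.5]  # hypothetical eigenvalues, not from the text
q = 2
max_gap = max(abs(a - b) for a, b in zip(gradient(lam, q), numeric_gradient(lam, q)))
print("largest gradient discrepancy:", max_gap)
print("quadratic form:", omega2_quadratic_form(lam, q))
print("closed form:   ", omega2_closed_form(lam, q))
```

The two variance expressions agree because $(1-\psi)^2\beta + \psi^2(1-\beta) = \psi^2 - 2\beta\psi + \beta$.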
EXAMPLE 9.5
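From Section 9.3 it is known that the first PC for the Swiss
bank notes resolves 67% of the variation. It can be tested whether
the true proportion is actually 75%. With $q = 1$, $n = 200$ and $\hat\psi = 2.985 / 4.472 = 0.6675$, we compute
$$\hat\beta = \frac{\ell_1^2}{\ell_1^2 + \dots + \ell_p^2} = \frac{(2.985)^2}{(2.985)^2 + (0.931)^2 + \dots + (0.035)^2} = 0.902,$$
$$\operatorname{tr}(\mathcal{S}) = 4.472, \qquad \operatorname{tr}(\mathcal{S}^2) = \sum_{j=1}^{p} \ell_j^2 = 9.882,$$
$$\hat\omega^2 = \frac{2 \operatorname{tr}(\mathcal{S}^2)}{\{\operatorname{tr}(\mathcal{S})\}^2} \left( \hat\psi^2 - 2\hat\beta\hat\psi + \hat\beta \right)
= \frac{2 \cdot 9.882}{(4.472)^2} \left\{ (0.6675)^2 - 2(0.902)(0.6675) + 0.902 \right\} = 0.142.$$
Hence, a confidence interval at the confidence level $1 - \alpha = 0.95$ is given by
$$0.6675 \pm 1.96 \sqrt{\frac{0.142}{199}} = (0.615,\, 0.720).$$
Since 0.75 lies outside this interval, the hypothesis that $\psi = 75\%$ can clearly be rejected!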
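The test in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the sample eigenvalues listed in `ell` are taken to be those of the bank data, $n = 200$, and the variance is estimated by plugging the sample eigenvalues into the closed form $\hat\omega^2 = \frac{2\operatorname{tr}(\mathcal{S}^2)}{\{\operatorname{tr}(\mathcal{S})\}^2}(\hat\psi^2 - 2\hat\beta\hat\psi + \hat\beta)$. The function name `proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(ell, q, n, level=0.95):
    """Approximate CI for the variance proportion of the first q PCs.

    Uses sqrt(n - 1) * (psi_hat - psi) -> N(0, omega^2) with omega^2
    estimated by plugging in the sample eigenvalues.
    """
    ell = sorted(ell, reverse=True)          # largest eigenvalue first
    tr1 = sum(ell)                           # tr(S)
    tr2 = sum(x * x for x in ell)            # tr(S^2)
    psi_hat = sum(ell[:q]) / tr1
    beta_hat = sum(x * x for x in ell[:q]) / tr2
    omega2_hat = 2 * tr2 / tr1 ** 2 * (psi_hat ** 2 - 2 * beta_hat * psi_hat + beta_hat)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(omega2_hat / (n - 1))
    return psi_hat, omega2_hat, (psi_hat - half_width, psi_hat + half_width)


# Assumed sample eigenvalues and sample size for the bank data.
ell = [2.985, 0.931, 0.242, 0.194, 0.085, 0.035]
psi_hat, omega2_hat, (lower, upper) = proportion_ci(ell, q=1, n=200)

psi_0 = 0.75                                 # hypothesised true proportion
reject = not (lower <= psi_0 <= upper)       # reject if psi_0 is outside the CI
print(f"psi_hat = {psi_hat:.4f}, omega2_hat = {omega2_hat:.3f}")
print(f"interval = ({lower:.3f}, {upper:.3f}), reject psi = {psi_0}: {reject}")
```

Rejecting when the hypothesised value falls outside the interval is equivalent to a two-sided test at level $\alpha = 0.05$ based on the same normal approximation.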
Summary
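- The eigenvalues $\ell_j$ and eigenvectors $g_j$ are asymptotically normally distributed; in particular $\sqrt{n-1}\,(\ell - \lambda) \xrightarrow{\mathcal{L}} N_p(0, 2\Lambda^2)$.
- For the eigenvalues it holds that $\sqrt{\frac{n-1}{2}}\,(\log \ell_j - \log \lambda_j) \xrightarrow{\mathcal{L}} N(0, 1)$.
- Given an asymptotic normal distribution, approximate confidence intervals and tests can be constructed for the eigenvalues and for the proportion of variance explained by the first $q$ PCs. The two-sided confidence interval for $\log \lambda_j$ at the $1 - \alpha = 0.95$ level is given by $\log(\ell_j) - 1.96 \sqrt{\frac{2}{n-1}} \le \log \lambda_j \le \log(\ell_j) + 1.96 \sqrt{\frac{2}{n-1}}$.
- It holds for $\hat\psi$, the estimate of $\psi$ (the proportion of the variance explained by the first $q$ PCs), that $\sqrt{n-1}\,(\hat\psi - \psi) \xrightarrow{\mathcal{L}} N(0, \omega^2)$, where $\omega^2$ is given in Theorem 9.5.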