9.7 Common Principal Components

In many applications a statistical analysis is carried out simultaneously for several groups of data. This section presents a technique for analyzing groups that share common PCs. From a statistical point of view, estimating PCs simultaneously in different groups results in a joint dimension-reducing transformation. This multi-group PCA, the so-called common principal components analysis (CPCA), yields the joint eigenstructure across groups.

CPCA extends traditional PCA by the basic assumption that the space spanned by the eigenvectors is identical across several groups, whereas the variances associated with the components are allowed to vary.

More formally, the hypothesis of common principal components can be stated in the following way (Flury; 1988): \begin{equation*}
H_{CPC}:\Sigma _{i}=\Gamma \Lambda _{i}\Gamma^{\top},\qquad i=1,...,k
\end{equation*} where $\Sigma_{i}$ is a positive definite $p \times p$ population covariance matrix for every $i$, $\Gamma=(\gamma_1, ..., \gamma_p)$ is an orthogonal $p \times p$ transformation matrix and $\Lambda_{i}=\text{diag}\left( \lambda _{i1},...,\lambda _{ip}\right)$ is the matrix of eigenvalues. Moreover, assume that all $\lambda_{i}$ are distinct.
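The structure imposed by $H_{CPC}$ can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the dimensions, the random orthogonal matrix and the eigenvalues are made-up illustrative values, not taken from the data used later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
p, k = 4, 3  # illustrative dimension and number of groups (assumed values)

# Common orthogonal matrix Gamma: Q factor of a random matrix (illustrative choice).
Gamma, _ = np.linalg.qr(rng.standard_normal((p, p)))

# Group-specific positive eigenvalues, sorted in decreasing order within each group.
Lambdas = [np.diag(np.sort(rng.uniform(0.5, 5.0, size=p))[::-1]) for _ in range(k)]

# Sigma_i = Gamma Lambda_i Gamma^T for every group i.
Sigmas = [Gamma @ L @ Gamma.T for L in Lambdas]

# Under H_CPC the same Gamma diagonalizes every Sigma_i,
# while the diagonal entries (the variances) differ between groups.
for i, Sigma in enumerate(Sigmas, start=1):
    D = Gamma.T @ Sigma @ Gamma
    off_diag = np.abs(D - np.diag(np.diag(D))).max()
    print(f"group {i}: eigenvalues {np.round(np.diag(D), 3)}, "
          f"max off-diagonal {off_diag:.1e}")
```

Each group prints its own eigenvalues, while the off-diagonal entries of $\Gamma^{\top}\Sigma_i\Gamma$ vanish up to rounding error for all groups.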

Let ${\data {S}}$ be the (unbiased) sample covariance matrix of an underlying $p$-variate normal distribution $N_{p}(\mu,\Sigma)$ with sample size $n$. Then $n{\data{S}}$ follows a Wishart distribution with $n-1$ degrees of freedom (Muirhead; 1982, p. 86): \begin{equation*}
nS\sim \mathcal{W}_{p}(\Sigma,n-1).
\end{equation*} The density is given in (5.16). Hence, for a given Wishart matrix ${\data{S}}_{i}$ with sample size $n_i$, the likelihood function can be written as

\begin{displaymath}
L\left( \Sigma _{1},...,\Sigma _{k}\right) = C\prod_{i=1}^{k}\exp \left\{ \text{tr}\left( -\frac{1}{2}(n_{i}-1)\Sigma _{i}^{-1}{\data{S}}_{i}\right) \right\}
\left\vert \Sigma _{i}\right\vert ^{-\frac{1}{2}(n_{i}-1)}
\end{displaymath} (9.41)

where $C$ is a constant independent of the parameters $\Sigma_i$. Maximizing the likelihood is equivalent to minimizing the function \begin{equation*}
g(\Sigma _{1},...,\Sigma _{k})=\sum_{i=1}^{k}(n_{i}-1)\Bigl\{\ln \vert \Sigma _{i}\vert+\text{tr} (\Sigma _{i}^{-1}{\data{S}}_{i})\Bigr\}.
\end{equation*}
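The objective function $g$ is straightforward to evaluate numerically. The sketch below assumes NumPy; the function name `g`, the simulated data and the perturbation are illustrative choices. It also checks the standard fact that, without any restriction on the $\Sigma_i$, the minimum is attained at $\Sigma_i={\data{S}}_i$.

```python
import numpy as np

def g(Sigmas, S_list, n_list):
    """Sum over groups of (n_i - 1) * { ln|Sigma_i| + tr(Sigma_i^{-1} S_i) }."""
    total = 0.0
    for Sigma, S, n in zip(Sigmas, S_list, n_list):
        sign, logdet = np.linalg.slogdet(Sigma)
        if sign <= 0:
            raise ValueError("each Sigma_i must have a positive determinant")
        # solve(Sigma, S) computes Sigma^{-1} S without forming the inverse.
        total += (n - 1) * (logdet + np.trace(np.linalg.solve(Sigma, S)))
    return total

# Toy data: two groups of simulated 3-variate observations (assumed values).
rng = np.random.default_rng(1)
n_list = [50, 80]
X_list = [scale * rng.standard_normal((n, 3)) for n, scale in zip(n_list, (1.0, 2.0))]
S_list = [np.cov(X, rowvar=False) for X in X_list]  # unbiased sample covariances

g_at_S = g(S_list, S_list, n_list)
g_perturbed = g([S + 0.3 * np.eye(3) for S in S_list], S_list, n_list)
print(g_at_S < g_perturbed)
```

The CPC model restricts the $\Sigma_i$ to a common eigenvector matrix, so its minimum of $g$ can never fall below the unrestricted value `g_at_S`.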

Assuming that $H_{CPC}$ holds, i.e., replacing $\Sigma_i$ by $\Gamma \Lambda _{i}\Gamma^{\top}$, one obtains after some manipulations \begin{equation*}
g(\Gamma ,\Lambda _{1},...,\Lambda _{k})=\sum_{i=1}^{k}(n_{i}-1)\sum_{j=1}^{p}\left( \ln \lambda _{ij}+\frac{\gamma _{j}^{\top}{\data{S}}_{i}\gamma _{j}}{\lambda _{ij}}\right) .
\end{equation*}

As we know from Section 2.2, the vectors $\gamma_j$ in $\Gamma$ have to be orthogonal. Orthogonality of the vectors $\gamma_j$ is achieved using the Lagrange method, i.e., we impose the $p$ constraints $\gamma _{j}^{\top}\gamma _{j}=1$ using the Lagrange multipliers $\mu _{j}$, and the remaining $p(p-1)/2$ constraints $\gamma _{h}^{\top}\gamma _{j}=0$ for $h\neq j$ using the multiplier $2\mu _{hj}$ (Flury; 1988). This yields \begin{equation*}
g^{\ast }(\Gamma ,\Lambda _{1},...,\Lambda _{k}) =g(\cdot )-\sum_{j=1}^{p}\mu _{j}(\gamma _{j}^{\top}\gamma _{j}-1)-2\sum_{h=1}^p\sum_{j=h+1}^{p}\mu_{hj}\gamma_{h}^{\top}\gamma_{j}.
\end{equation*} Taking partial derivatives with respect to all $\lambda _{im}$ and $\gamma _{m}$, it can be shown that the solution of the CPC model is given by the generalized system of characteristic equations

\begin{displaymath}
\gamma _{m}^{\top}\left( \sum_{i=1}^{k}(n_{i}-1)\frac{\lambda _{im}-\lambda _{ij}}{\lambda _{im}\lambda _{ij}}{\data{S}}_{i}\right) \gamma _{j}=0,\qquad m,j=1,...,p,\quad m\neq j.
\end{displaymath} (9.42)

This system can be solved using \begin{equation*}
\lambda _{im}=\gamma _{m}^{\top}{\data{S}}_{i}\gamma _{m},\qquad i=1,...,k,\quad m=1,...,p
\end{equation*} under the constraints \begin{equation*}
\gamma _{m}^{\top}\gamma _{j}=\begin{cases}0& \qquad m \neq j\\
1& \qquad m = j \end{cases}.
\end{equation*} Flury (1988) proves existence and uniqueness of the maximum of the likelihood function, and Flury and Gautschi (1986) provide a numerical algorithm.
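The characteristic equations can be checked numerically for a candidate $\Gamma$. The following sketch assumes NumPy and is not the algorithm of Flury and Gautschi (1986): it only computes $\lambda_{im}=\gamma_m^{\top}{\data{S}}_i\gamma_m$ with the group-specific ${\data{S}}_i$ and evaluates the left-hand side of (9.42). The function names and the toy matrices are hypothetical. The toy ${\data{S}}_i$ are constructed to share their eigenvectors exactly, so the eigenvectors of their sum serve as the candidate.

```python
import numpy as np

def cpc_eigenvalues(Gamma, S_list):
    """lambda_{im} = gamma_m^T S_i gamma_m, returned as an array of shape (k, p)."""
    return np.array([np.diag(Gamma.T @ S @ Gamma) for S in S_list])

def cpc_residuals(Gamma, S_list, n_list):
    """Left-hand side of (9.42) for every pair m != j (diagonal left at zero)."""
    lam = cpc_eigenvalues(Gamma, S_list)
    p = Gamma.shape[1]
    R = np.zeros((p, p))
    for m in range(p):
        for j in range(p):
            if m != j:
                # One weight per group: (lambda_im - lambda_ij) / (lambda_im * lambda_ij).
                w = (lam[:, m] - lam[:, j]) / (lam[:, m] * lam[:, j])
                A = sum((n - 1) * w_i * S for n, w_i, S in zip(n_list, w, S_list))
                R[m, j] = Gamma[:, m] @ A @ Gamma[:, j]
    return R

# Toy setting in which the CPC model holds exactly (made-up eigenvalues).
rng = np.random.default_rng(2)
Gamma_true, _ = np.linalg.qr(rng.standard_normal((3, 3)))
S_list = [Gamma_true @ np.diag(d) @ Gamma_true.T
          for d in ([5.0, 3.0, 1.0], [4.0, 2.0, 0.5])]
n_list = [100, 120]

# Candidate Gamma: eigenvectors of the summed covariance matrices.
_, Gamma_hat = np.linalg.eigh(sum(S_list))
lam_hat = cpc_eigenvalues(Gamma_hat, S_list)
res_common = cpc_residuals(Gamma_hat, S_list, n_list)
res_identity = cpc_residuals(np.eye(3), S_list, n_list)  # a poor candidate, for contrast

print(np.abs(res_common).max(), np.abs(res_identity).max())
```

With real data the ${\data{S}}_i$ do not commute exactly, and an iterative scheme such as the one of Flury and Gautschi (1986) is needed to drive these residuals to zero.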

EXAMPLE 9.7   As an example we provide the data sets XFGvolsurf01, XFGvolsurf02 and XFGvolsurf03 that were used in Fengler et al. (2001) to estimate common principal components for the implied volatility surfaces of the DAX 1999. The data were generated by smoothing an implied volatility surface day by day. Next, the estimated grid points were grouped into maturities of $\tau=1$, $\tau=2$ and $\tau=3$ months and transformed into a vector of time series of the ``smile'', i.e., each element of the vector belongs to a distinct moneyness ranging from 0.85 to 1.10.

Figure 9.9 shows the first three eigenvectors in a parallel coordinate plot. The basic structure of the first three eigenvectors is not altered across groups: we find a shift, a slope and a twist structure. This structure is common to all maturity groups, i.e., when exploiting PCA as a dimension-reducing tool, the same transformation applies to each group. However, comparing the size of the eigenvalues among groups shows that variability decreases as we move from short-term to long-term contracts.

Figure 9.9: Factor loadings of the first (thick), the second (medium), and the third (thin) PC. MVAcpcaiv.xpl
\includegraphics[width=1\defpicwidth]{MVAcpcaiv.ps}

Before drawing conclusions we should convince ourselves that the CPC model is truly a good description of the data. This can be done by using a likelihood ratio test. The likelihood ratio statistic for comparing a restricted (the CPC) model against the unrestricted model (the model where all covariances are treated separately) is given by

\begin{displaymath}T_{{(n_1, n_2, ..., n_k)}}=-2\ln\frac{L(\widehat{\Sigma}_1,
...,\widehat{\Sigma}_k)}{L({{\data{S}}}_1, ...,{{\data{S}}}_k)}.
\end{displaymath}

Inserting the likelihood function and using $\text{tr}(\widehat{\Sigma}_i^{-1}{\data{S}}_i)=p$ at the maximum likelihood estimates, we find that this is equivalent to

\begin{displaymath}
T_{(n_1, n_2, ..., n_k)} = \sum_{i=1}^k (n_i-1)
\ln \frac{\text{det}\ (\widehat{\Sigma}_i)}{\text{det}\ ({\data{S}}_i)},
\end{displaymath}

which is asymptotically $\chi^2$ distributed as $\min(n_i)$ tends to infinity, with

\begin{displaymath}
k\Bigl\{\frac{1}{2}p(p-1)+p\Bigr\}-\Bigl\{\frac{1}{2}p(p-1)+kp\Bigr\}=\frac{1}{2}(k-1)p(p-1)
\end{displaymath}

degrees of freedom. This test is included in the quantlet MVAcpcaiv.xpl.

The calculations yield $T_{(n_1, n_2, ..., n_k)} = 31.836$, which corresponds to a $p$-value of $0.37512$ for the $\chi^2(30)$ distribution. Hence we cannot reject the CPC model against the unrestricted model, where PCA is applied to each maturity separately.
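The test can be sketched in a few lines. The code below is an illustration, not the quantlet MVAcpcaiv.xpl: it assumes NumPy, the function names are hypothetical, and the statistic is computed as $-2$ times the log likelihood ratio, which reduces to a weighted sum of log determinant ratios. The $\chi^2$ tail probability uses a closed form that is valid for even degrees of freedom only. The value $p=6$ in the usage line is an assumption inferred from the reported 30 degrees of freedom with $k=3$ groups.

```python
import math
import numpy as np

def cpc_lr_statistic(Sigma_hats, S_list, n_list):
    """T = sum_i (n_i - 1) * ln( det(Sigma_hat_i) / det(S_i) )."""
    T = 0.0
    for Sigma_hat, S, n in zip(Sigma_hats, S_list, n_list):
        # slogdet returns (sign, ln|det|); the log determinant avoids overflow.
        T += (n - 1) * (np.linalg.slogdet(Sigma_hat)[1] - np.linalg.slogdet(S)[1])
    return T

def cpc_df(k, p):
    """Degrees of freedom (k - 1) p (p - 1) / 2; p (p - 1) is always even."""
    return (k - 1) * p * (p - 1) // 2

def chi2_sf_even(x, df):
    """P(chi2_df > x) for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0  # j = 0 term
    for j in range(1, df // 2):
        term *= half / j    # builds (x/2)^j / j! recursively
        total += term
    return math.exp(-half) * total

# Reported statistic from the text; k = 3 maturity groups, p = 6 assumed.
df = cpc_df(3, 6)
p_value = chi2_sf_even(31.836, df)
print(df, round(p_value, 3))
```

For odd degrees of freedom a general routine for the $\chi^2$ distribution function would be needed in place of `chi2_sf_even`.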

Using the methods in Section 9.3, we can estimate the amount of variability, $\zeta_l$, explained by the first $l$ principal components: only a few factors, three at the most, are needed to capture a large amount of the total variability present in the data. Since the model now captures the variability in both the strike and maturity dimensions, it is a suitable starting point for a simplified VaR calculation for delta-gamma neutral option portfolios using Monte Carlo methods, and hence a valuable tool in risk management.