13.1 Introduction

In a contingency table, the data are classified according to each of two characteristics. The attributes of each characteristic are represented by the row and column categories. We denote by $ n_{ij}$ the number of individuals with the $ i$-th row and $ j$-th column attributes. The contingency table itself is the $ (I\times J)$ matrix containing the elements $ n_{ij}$.
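As a small sketch of these definitions in Python (the counts below are purely illustrative, not real data), the marginal totals $ n_{i\bullet}$, $ n_{\bullet j}$ and the grand total $ n$ can be read off the count matrix directly:

```python
import numpy as np

# Hypothetical 3x2 contingency table (made-up counts): entry n_ij is the
# number of individuals with the i-th row and j-th column attribute.
N = np.array([[10, 20],
              [30, 40],
              [50, 60]])

I, J = N.shape           # I row categories, J column categories
n = N.sum()              # n: total number of individuals
row_tot = N.sum(axis=1)  # n_{i.}: row marginal totals
col_tot = N.sum(axis=0)  # n_{.j}: column marginal totals
```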


13.1.1 Singular Value Decomposition

Total variation in the contingency table is measured by the departure from independence, more precisely by the $ \chi^2$ statistic

$\displaystyle \chi^2=\sum_{i=1}^I\sum_{j=1}^J (n_{ij}-E_{ij})^2/E_{ij},$

where $ n_{ij}$, $ i=1,\dots,I$, $ j=1,\dots,J$, are the observed frequencies and $ E_{ij}$ is the expected frequency in cell $ (i,j)$, estimated under the assumption of independence:

$\displaystyle E_{ij}=\frac{n_{i\bullet}n_{\bullet j}}{n}.$
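For a concrete table, both the expected counts and the $ \chi^2$ statistic are one line each in Python. A minimal sketch, using the same made-up counts as assumed above:

```python
import numpy as np

# Hypothetical 3x2 contingency table (illustrative counts only)
N = np.array([[10, 20],
              [30, 40],
              [50, 60]], dtype=float)
n = N.sum()

# E_ij = n_{i.} n_{.j} / n : expected counts under independence
E = np.outer(N.sum(axis=1), N.sum(axis=0)) / n

# Pearson chi-square statistic: sum over cells of (n_ij - E_ij)^2 / E_ij
chi2 = ((N - E) ** 2 / E).sum()
```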

We define

$\displaystyle M = (n_{ij} -n_{i \bullet} n_{\bullet j} /n).$

The matrix $ M$ contains the differences between the observed frequencies and the frequencies estimated under the assumption of independence.

The $ \chi^2$ statistic, which measures the departure from independence, can be rewritten as

$\displaystyle n tr(M^T RMC),$

where $ R = diag(1/n_{i \bullet})$ and $ C = diag(1/n_{\bullet j})$.
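This trace identity is easy to verify numerically. The following sketch (again with the assumed illustrative table) compares $ n\,tr(M^T RMC)$ with the direct cell-by-cell $ \chi^2$ sum:

```python
import numpy as np

N = np.array([[10, 20],
              [30, 40],
              [50, 60]], dtype=float)   # illustrative counts
n = N.sum()
E = np.outer(N.sum(axis=1), N.sum(axis=0)) / n

M = N - E                           # M = (n_ij - n_i. n_.j / n)
R = np.diag(1.0 / N.sum(axis=1))    # R = diag(1/n_i.)
C = np.diag(1.0 / N.sum(axis=0))    # C = diag(1/n_.j)

chi2_trace  = n * np.trace(M.T @ R @ M @ C)
chi2_direct = ((N - E) ** 2 / E).sum()
```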

Correspondence analysis (CA) itself consists of computing the singular value decomposition (SVD) of the matrix $ R^{1/2}MC^{1/2}$. In this way, we obtain approximations of the matrix $ R^{1/2}MC^{1/2}$ by matrices of lower rank:

$\displaystyle R^{1/2}MC^{1/2} = (g_{1})^{-1/2}r_{1}c_{1}^T + (g_{2})^{-1/2}r_{2}c_{2}^T + \dots + (g_{u})^{-1/2}r_{u}c_{u}^T,$

where $ (g_1)^{-1/2}r_{1}c_{1}^T$ is the matrix of rank one closest to $ R^{1/2}MC^{1/2}$ in the chi-square norm, $ (g_{1})^{-1/2}r_{1}c_{1}^T + (g_{2})^{-1/2}r_{2}c_{2}^T$ is the matrix of rank two closest to it, and so on. The $ g_{k}$'s are the eigenvalues of $ M^{T}RMC$ in decreasing order, and $ c^T_{k}c_{k} = r^T_{k}r_{k} = g_{k}$.
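A sketch of this step with NumPy, continuing the assumed illustrative table: `np.linalg.svd` returns the singular values $ s_{k}$ of $ R^{1/2}MC^{1/2}$ in decreasing order, and the $ g_{k} = s_{k}^2$ are the eigenvalues of $ M^{T}RMC$, whose sum equals $ \chi^2/n$ (the total inertia):

```python
import numpy as np

N = np.array([[10, 20],
              [30, 40],
              [50, 60]], dtype=float)   # illustrative counts
n = N.sum()
E = np.outer(N.sum(axis=1), N.sum(axis=0)) / n
M = N - E

Rh = np.diag(N.sum(axis=1) ** -0.5)    # R^{1/2}, with R = diag(1/n_i.)
Ch = np.diag(N.sum(axis=0) ** -0.5)    # C^{1/2}, with C = diag(1/n_.j)

# SVD of R^{1/2} M C^{1/2}; squared singular values g_k are the
# eigenvalues of M^T R M C, in decreasing order
U, s, Vt = np.linalg.svd(Rh @ M @ Ch, full_matrices=False)
g = s ** 2

# total inertia: the g_k sum to chi^2 / n
inertia = g.sum()
```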


13.1.2 Coordinates of Factors

The $ I\times 1$ vector $ r_{k}$ defines the coordinates of the rows corresponding to the $ k$-th factor. Similarly, the $ J\times 1$ vector $ c_{k}$ defines the coordinates of the columns corresponding to the $ k$-th factor.

A set of $ u$ coordinates for the row (resp. column) items, where $ u =
\min(I, J) - 1$, is constructed hierarchically via the singular value decomposition. The construction is thus similar to that of PCA, but with a different matrix norm that takes into account the specific frequency nature of the data.

For the sake of simplicity, the vector of the first row coordinates is called the first factor (as is the vector of the first column coordinates), and so on, up to the $ u$-th factor.
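Continuing the sketch above, one scaling convention consistent with $ c^T_{k}c_{k} = r^T_{k}r_{k} = g_{k}$ is to multiply the left and right singular vectors by the corresponding singular value; the variable names and this particular scaling are illustrative assumptions, not the only convention in use:

```python
import numpy as np

N = np.array([[10, 20],
              [30, 40],
              [50, 60]], dtype=float)   # illustrative counts
n = N.sum()
E = np.outer(N.sum(axis=1), N.sum(axis=0)) / n
M = N - E
Rh = np.diag(N.sum(axis=1) ** -0.5)    # R^{1/2}
Ch = np.diag(N.sum(axis=0) ** -0.5)    # C^{1/2}

U, s, Vt = np.linalg.svd(Rh @ M @ Ch, full_matrices=False)

u = min(N.shape) - 1     # number of non-trivial factors
# r_k = s_k * u_k (I x 1) and c_k = s_k * v_k (J x 1), so that
# r_k^T r_k = c_k^T c_k = s_k^2 = g_k
r = U[:, :u] * s[:u]     # row coordinates, column k = k-th factor
c = Vt.T[:, :u] * s[:u]  # column coordinates
```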