7.2 Linear Hypothesis

In this section, we present a very general procedure for testing a linear hypothesis, i.e., a linear restriction, either on a mean vector $\mu$ or on the coefficients $\beta$ of a linear model. The technique covers many of the practical testing problems on means or regression coefficients.

Linear hypotheses are of the form $\data{A}\mu =a$ with a known matrix $\data{A}(q\times p)$ and a known vector $a (q\times 1)$, where $q \le p$.

EXAMPLE 7.7   Let $\mu=(\mu_{1},\mu_{2})^{\top}$. The hypothesis that $\mu_{1}=\mu_{2}$ can be equivalently written as:

\begin{displaymath}\data{A}\mu = \left( \begin{array}{cc} 1 & -1 \end{array}\right)
\left( \begin{array}{c} \mu_{1}\\ \mu_{2} \end{array}\right) = 0 = a.\end{displaymath}

The general idea is to test, for a normal population, the restricted model $H_0:\; {\cal{A}}\mu=a$ against the full model $H_1$ where no restrictions are put on $\mu$. Due to the properties of the multinormal, we can easily adapt Test Problems 1 and 2 to this new situation. Indeed we know, from Theorem 5.2, that $y_i={\cal{A}}x_i \sim N_q(\mu_y,\Sigma_y)$, where $\mu_y={\cal{A}}\mu$ and $\Sigma_y={\cal{A}}\Sigma{\cal{A}}^{\top}$.

Testing the null $H_0:\; {\cal{A}}\mu=a$ is the same as testing $H_0:\; \mu_y=a$. The appropriate statistics are $\bar y$ and ${\cal{S}}_y$, which can be derived from the original statistics $\bar x$ and ${\cal{S}}$ available from ${\cal{X}}$:

\begin{displaymath}
\bar y = {\cal{A}} \bar x,\quad {\cal{S}}_y={\cal{A}}{\cal{S}}{\cal{A}}^{\top}.
\end{displaymath}

Here the difference between the sample mean and the tested value is $d={\cal{A}} \bar x -a$. We are now in a position to proceed to Test Problems 5 and 6.


TEST PROBLEM 5   Suppose $X_{1},\ldots,X_{n}$ is an i.i.d. random sample from a $N_p(\mu,\Sigma)$ population.

\begin{displaymath}H_0:\data{A}\mu =a,\ \Sigma\ \textrm{known versus}\ H_1:\
\textrm{no constraints.}\end{displaymath}



By (7.2) we have that, under $H_0$:

\begin{displaymath}n(\data{A}\bar{x}-a)^{\top}(\data{A}\Sigma \data{A}^{\top})^{-1}(\data{A}\bar{x}-a)\sim \chi^2_q,\end{displaymath}

and we reject $H_0$ if this test statistic is too large at the desired significance level.
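
To make the rejection rule concrete, here is a minimal Python sketch of Test Problem 5 (it is not part of the original text, whose computations rely on XploRe quantlets); numpy and scipy are assumed to be available, and the data below are simulated placeholders.

\begin{verbatim}
import numpy as np
from scipy import stats

def linear_hypothesis_known_sigma(X, A, a, Sigma):
    """Test H0: A mu = a when Sigma is known (Test Problem 5).

    Returns n (A xbar - a)' (A Sigma A')^{-1} (A xbar - a) and its chi^2_q p-value.
    """
    n, p = X.shape
    q = A.shape[0]
    d = A @ X.mean(axis=0) - a
    stat = n * d @ np.linalg.solve(A @ Sigma @ A.T, d)
    return stat, stats.chi2.sf(stat, df=q)

# Simulated illustration of H0: mu_1 = mu_2 for p = 2 and known Sigma = I_2.
rng = np.random.default_rng(0)
Sigma = np.eye(2)
X = rng.multivariate_normal([1.0, 1.0], Sigma, size=50)
A = np.array([[1.0, -1.0]])
stat, pval = linear_hypothesis_known_sigma(X, A, np.zeros(1), Sigma)
print(f"chi2 statistic = {stat:.3f}, p-value = {pval:.3f}")
\end{verbatim}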

EXAMPLE 7.8   We consider hypotheses on partitioned mean vectors $\mu
=\left({\mu_{1}\atop\mu_{2}}\right)$. Let us first look at

\begin{displaymath}H_0:\mu_1=\mu_2,\ \textrm{versus}\ H_1:\textrm{no constraints,}\end{displaymath}

for $N_{2p}({\mu_1 \choose \mu_2},\left({\Sigma\atop
0}{0\atop\Sigma}\right))$ with known $\Sigma$. This is equivalent to $\data{A}=(\data{I},-\data{I})$, $a=(0,\dots,0)^{\top}\in\mathbb{R}^p$ and leads to:

\begin{displaymath}-2\log\lambda = n(\overline x_1 - \overline x_2)^{\top} (2\Sigma)^{-1}
(\overline x_1 - \overline x_2) \sim \chi^2_p.\end{displaymath}

Another example is the test whether $\mu_1=0$, i.e.,

\begin{displaymath}H_0:\mu_1=0,\ \textrm{versus}\ H_1:\textrm{no constraints,}\end{displaymath}

for $N_{2p}({\mu_1 \choose \mu_2},\left({\Sigma\atop
0}{0\atop\Sigma}\right))$ with known $\Sigma$. This is equivalent to $\data{A}\mu =a$ with $\data{A}=(\data{I},0)$, and $a=(0,\dots,0)^{\top}\in\mathbb{R}^p$. Hence:

\begin{displaymath}-2\log\lambda =n\overline x_1^{\top}\Sigma^{-1}\overline x_1
\sim\chi^2_p.\end{displaymath}


TEST PROBLEM 6   Suppose ${X}_{1}, \ldots, {X}_{n}$ is an i.i.d. random sample from a $N_p(\mu,\Sigma)$ population.

\begin{displaymath}H_0:\data{A}\mu =a, \ \Sigma\ \textrm{unknown versus}\
H_1:\ \ \textrm{no constraints.}\end{displaymath}



From Corollary 5.4 it follows immediately that, under $H_0$,

\begin{displaymath}
(n-1)(\data{A}\overline x-a)^{\top}
(\data{A}\data{S}\data{A}^{\top})^{-1}(\data{A}\overline x-a)
\sim T^2(q,n-1)
\end{displaymath} (7.9)

since indeed under $H_0$,

\begin{displaymath}\data{A}\overline x\sim N_q(a,n^{-1}\data{A}\Sigma \data{A}^{\top})\end{displaymath}

is independent of

\begin{displaymath}n\data{ASA}^{\top}\sim W_q(\data{A}\Sigma \data{A}^{\top},n-1).\end{displaymath}

EXAMPLE 7.9   Let's come back again to the bank data set and suppose that we want to test if $\mu_4 =\mu_5$, i.e., the hypothesis that the lower border mean equals the upper border mean for the forged bills. In this case:

\begin{eqnarray*}
\data{A} &=&(0\ 0\ 0\ 1\ -1\ 0)\\
a&=&0.
\end{eqnarray*}



The test statistic is:

\begin{displaymath}99 (\data{A}\bar{x})^{\top}(\data{A}S_f\data{A}^{\top})^{-1}(\data{A}\bar{x}) \sim T^2(1,99)=F_{1,99}.\end{displaymath}

The observed value is $13.638$ which is significant at the $5\%$ level.
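
As a complement, the following Python sketch mimics this kind of computation for Test Problem 6 (it is not the book's XploRe code, and the six-dimensional data are simulated stand-ins rather than the bank notes). It forms $\data{S}$ with divisor $n$, builds the $T^2$ statistic of (7.9) and converts it to an $F$ value.

\begin{verbatim}
import numpy as np
from scipy import stats

def linear_hypothesis_unknown_sigma(X, A, a):
    """Test H0: A mu = a with Sigma unknown (Test Problem 6)."""
    n, p = X.shape
    q = A.shape[0]
    S = np.cov(X, rowvar=False, bias=True)   # divisor n, as in the text
    d = A @ X.mean(axis=0) - a
    t2 = (n - 1) * d @ np.linalg.solve(A @ S @ A.T, d)   # ~ T^2(q, n-1)
    f_stat = t2 * (n - q) / (q * (n - 1))    # T^2(q,n-1) = q(n-1)/(n-q) F_{q,n-q}
    return f_stat, stats.f.sf(f_stat, q, n - q)

# Simulated analogue of Example 7.9: test mu_4 = mu_5 for p = 6, n = 100.
rng = np.random.default_rng(1)
X = rng.multivariate_normal([215, 130, 130, 10, 11, 139.5], np.eye(6), size=100)
A = np.array([[0, 0, 0, 1, -1, 0]], dtype=float)
f_stat, pval = linear_hypothesis_unknown_sigma(X, A, np.array([0.0]))
print(f"F(1,99) statistic = {f_stat:.3f}, p-value = {pval:.4f}")
\end{verbatim}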

Repeated Measurements

In many situations, $n$ independent sampling units are observed at $p$ different times or under $p$ different experimental conditions (different treatments, ...). So here we repeat $p$ one-dimensional measurements on $n$ different subjects. For instance, we observe the results from $n$ students taking $p$ different exams. We end up with an $(n\times p)$ data matrix. We can thus consider the situation where we have ${X}_1,\ldots ,{X}_n$ i.i.d. from a normal distribution $N_p(\mu,\Sigma)$ when there are $p$ repeated measurements. The hypothesis of interest in this case is that there are no treatment effects, $H_0:\mu_1=\mu_2=\ldots =\mu_p$. This hypothesis is a direct application of Test Problem 6. Indeed, introducing an appropriate matrix transform of $\mu$ we have:

\begin{displaymath}
H_0:\ {\cal{C}}\mu=0\ \textrm{where}\ {\cal{C}}((p-1)\times p)=
\left(\begin{array}{ccccc}
1 & -1 & 0 & \cdots & 0\\
0 & 1 & -1 & \cdots & 0\\
\vdots & \vdots & \ddots & \ddots &\vdots \\
0 & \cdots & 0 &1 &-1
\end{array}\right).
\end{displaymath} (7.10)

Note that in many cases one of the experimental conditions is the ``control'' (a placebo, standard drug or reference condition). Suppose it is the first component. In that case one is interested in studying the differences from the control variable. The matrix ${\cal{C}}$ then has a different form

\begin{displaymath}{\cal{C}} ((p-1)\times p) =
\left(\begin{array}{ccccc}
1 &-1 & 0 & \cdots & 0\\
1 & 0 &-1 & \cdots & 0\\
\vdots &\vdots & \vdots & \ddots & \vdots \\
1 & 0 & 0 & \cdots &-1
\end{array}\right). \end{displaymath}

By (7.9) the null hypothesis will be rejected if:

\begin{displaymath}\frac{(n-p+1)}{p-1}\bar{x}^{\top}{\cal{C}}^{\top}({\cal{C}}S{\cal{C}}^{\top})^{-1}{\cal{C}}\bar{x}>F_{1-\alpha ;p-1,n-p+1}.\end{displaymath}

As a matter of fact, ${\cal{C}}\mu$ is the mean of the random variable $y_i={\cal{C}}x_i$

\begin{displaymath}y_i\sim N_{p-1}({\cal{C}}\mu, {\cal{C}}\Sigma {\cal{C}}^{\top}).\end{displaymath}

Simultaneous confidence intervals for linear combinations of the mean of $y_i$ have been derived above in (7.7). For all $a\in \mathbb{R}^{p-1}$, with probability $(1-\alpha)$ we have:

\begin{displaymath}a^{\top}{\cal{C}}\mu \in a^{\top}{\cal{C}}\bar{x} \pm\sqrt{\frac{(p-1)}{n-p+1}F_{1-\alpha ;p-1,n-p+1}\,a^{\top}{\cal{C}}S{\cal{C}}^{\top}a}.\end{displaymath}

Due to the nature of the problem here, the row sums of the elements in ${\cal{C}}$ are zero: ${\cal{C}}1_p=0$. Therefore $a^{\top}{\cal{C}}$ is a vector whose elements sum to zero; such a vector is called a contrast. Let $b={\cal{C}}^{\top}a$. We have $b^{\top}1_p=\sum\limits_{j=1}^{p}b_j=0$. The result above thus provides, for all contrasts $b^{\top}\mu$ of $\mu$, simultaneous confidence intervals at level $(1-\alpha)$:

\begin{displaymath}b^{\top}\mu \in b^{\top}\bar{x} \pm\sqrt{\frac{(p-1)}{n-p+1}F_{1-\alpha ;p-1,n-p+1}b^{\top}{\cal{S}}b}.\end{displaymath}

Examples of contrasts for $p=4$ are $b^{\top}=(1\ -1\ 0\ 0)$ or $(1\ 0\ 0\ -1)$ or even $(1\ -\frac{1}{3}\ -\frac{1}{3}\ -\frac{1}{3})$ when the control is to be compared with the mean of 3 different treatments.
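
The repeated measurements test and the simultaneous contrast intervals translate directly into code. The sketch below (again a Python illustration with simulated scores, not part of the original text) builds the successive-difference matrix ${\cal{C}}$ of (7.10), applies the rejection rule and computes the interval for one contrast.

\begin{verbatim}
import numpy as np
from scipy import stats

def repeated_measures_test(X, alpha=0.05):
    """Test H0: mu_1 = ... = mu_p from an (n x p) matrix of repeated measurements."""
    n, p = X.shape
    C = np.eye(p - 1, p) - np.eye(p - 1, p, k=1)   # successive differences
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)         # divisor n, as in the text
    Cx = C @ xbar
    f_stat = (n - p + 1) / (p - 1) * Cx @ np.linalg.solve(C @ S @ C.T, Cx)
    return f_stat, stats.f.ppf(1 - alpha, p - 1, n - p + 1)

def contrast_interval(X, b, alpha=0.05):
    """Simultaneous (1-alpha) interval for the contrast b'mu (elements of b sum to 0)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)
    half = np.sqrt((p - 1) / (n - p + 1)
                   * stats.f.ppf(1 - alpha, p - 1, n - p + 1) * b @ S @ b)
    return b @ X.mean(axis=0) - half, b @ X.mean(axis=0) + half

# Hypothetical scores of n = 40 subjects measured on p = 4 occasions.
rng = np.random.default_rng(2)
X = rng.multivariate_normal([1, 2.5, 2.9, 3.4], 0.5 * np.eye(4) + 0.5, size=40)
print(repeated_measures_test(X))
print(contrast_interval(X, np.array([1.0, -1.0, 0.0, 0.0])))
\end{verbatim}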

EXAMPLE 7.10   Bock (1975) considers the evolution of the vocabulary of children from the eighth through eleventh grade. The data set contains the scores of a vocabulary test of 40 randomly chosen children that are observed from grades 8 to 11. This is a repeated measurement situation, $(n=40, p=4)$, since the same children were observed from grades 8 to 11. The statistics of interest are:

\begin{eqnarray*}
\bar{x} &=&(1.086, 2.544, 2.851, 3.420)^{\top}\\
\data{S} &=&\left(\begin{array}{cccc}
2.902 & 2.438 & 2.963 & 2.183\\
2.438 & 3.049 & 2.775 & 2.319\\
2.963 & 2.775 & 4.281 & 2.939\\
2.183 & 2.319 & 2.939 & 3.162
\end{array}\right).
\end{eqnarray*}



Suppose we are interested in the yearly evolution of the children. Then the matrix ${\cal{C}}$ providing successive differences of $\mu_j$ is:

\begin{eqnarray*}
{\cal{C}} &=&\left(\begin{array}{rrrr}
1 & -1 & 0 & 0\\
0 & 1 & -1 & 0\\
0 & 0 & 1 & -1
\end{array}\right).
\end{eqnarray*}



The value of the test statistic is $F_{\textrm{obs}}=53.134$, which is highly significant for the $F_{3,37}$ distribution. There are significant differences between the successive means. However, the analysis of the contrasts shows the following simultaneous $95\%$ confidence intervals

\begin{displaymath}
\begin{array}{lcccr}
-1.958 & \le & \mu_1- \mu_2 & \le & -0.959\\
-0.949 & \le & \mu_2- \mu_3 & \le & 0.335\\
-1.171 & \le & \mu_3- \mu_4 & \le & 0.036.
\end{array}\end{displaymath}

Thus, the rejection of $H_0$ is mainly due to the difference between the children's performances in the first and second year. The confidence intervals for the following contrasts may also be of interest:

\begin{displaymath}
\begin{array}{lcccl}
-2.283 & \le & \mu_1-\frac{1}{3}(\mu_2+\mu_3+\mu_4) & \le &-1.422\\
-1.777 & \le & \frac{1}{3}(\mu_1+\mu_2+\mu_3)-\mu_4 & \le &-0.742\\
-1.479 & \le & \mu_2-\mu_4 & \le &-0.272.
\end{array}\end{displaymath}

They show that $\mu_1$ is different from the average of the 3 other years (the same being true for $\mu_4$) and $\mu_4$ turns out to be higher than $\mu_2$ (and of course higher than $\mu_1$).

Test Problem 7 illustrates how the likelihood ratio can be applied when testing a linear restriction on the coefficient $\beta$ of a linear model. It is also shown how a transformation of the test statistic leads to an exact $F$ test as presented in Chapter 3.


TEST PROBLEM 7   Suppose $Y_{1}, \ldots, Y_{n}$, are independent with $Y_{i} \sim
N_{1}(\beta^{\top}x_{i},\sigma^2)$, and $ x_{i}~\in~\mathbb{R}^p$.

\begin{displaymath}H_0:\data{A}\beta =a,\ \sigma^2\ \textrm{unknown versus}\
H_1:\ \textrm{no constraints.}\end{displaymath}



The constrained maximum likelihood estimators under $H_{0}$ are (Exercise 3.24):

\begin{displaymath}\tilde{\beta} = \hat{\beta} - (\data{X}^{\top}\data{X})^{-1}\data{A}^{\top}
\{\data{A}(\data{X}^{\top}\data{X})^{-1}\data{A}^{\top}\}^{-1}
(\data{A}\hat{\beta}-a)\end{displaymath}

for $\beta$ and $\tilde{\sigma}^2 =
\frac{1}{n} (y-\data{X}\tilde{\beta})^{\top}(y-\data{X}\tilde{\beta})$. The estimate $\hat{\beta}$ denotes the unconstrained MLE as before. Hence, the LR statistic is

\begin{eqnarray*}
-2 \log \lambda & = & 2(\ell_{1}^\ast - \ell_{0}^\ast) \\
& = & n \log\left( \frac{\tilde{\sigma}^2}{\hat{\sigma}^2} \right) \\
&\stackrel{\cal L}{\longrightarrow} &\chi^2_{q}
\end{eqnarray*}



where $q$ is the number of elements of $a$. This problem also has an exact $F$-test since

\begin{displaymath}\frac{n-p}{q} \left( \frac{\vert\vert y - \data{X}\tilde{\beta}\vert\vert^2 - \vert\vert y - \data{X}\hat{\beta}\vert\vert^2}
{(y-\data{X}\hat{\beta})^{\top}(y-\data{X}\hat{\beta})}\right)
\sim F_{q,n-p}. \end{displaymath}
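
The following Python sketch illustrates Test Problem 7 (it is not the MVAlrtest.xpl quantlet used later in the text; the design and responses are simulated). It computes the restricted estimator $\tilde{\beta}$, the LR statistic and the exact $F$ statistic.

\begin{verbatim}
import numpy as np
from scipy import stats

def linear_restriction_test(X, y, A, a):
    """LR and exact F test of H0: A beta = a in the model y = X beta + eps."""
    n, p = X.shape
    q = A.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    beta_tilde = beta_hat - XtX_inv @ A.T @ np.linalg.solve(
        A @ XtX_inv @ A.T, A @ beta_hat - a)          # restricted estimator
    rss_hat = np.sum((y - X @ beta_hat) ** 2)
    rss_tilde = np.sum((y - X @ beta_tilde) ** 2)
    lr = n * np.log(rss_tilde / rss_hat)               # -2 log(lambda)
    f_stat = (n - p) / q * (rss_tilde - rss_hat) / rss_hat
    return lr, f_stat, stats.f.sf(f_stat, q, n - p)

# Hypothetical usage: test beta_1 = -(1/2) beta_2, i.e. A = (0 1 1/2 0), a = 0,
# in a regression with an intercept and three regressors (compare Example 7.12).
rng = np.random.default_rng(3)
Z = rng.normal(size=(10, 3))
X = np.column_stack([np.ones(10), Z])
y = X @ np.array([65.0, -0.2, 0.5, 0.8]) + rng.normal(size=10)
A = np.array([[0.0, 1.0, 0.5, 0.0]])
print(linear_restriction_test(X, y, A, np.array([0.0])))
\end{verbatim}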

EXAMPLE 7.11   Let us continue with the ``classic blue'' pullovers. We can once more test if $\beta=0$ in the regression of sales on prices. It holds that

\begin{displaymath}\beta=0 \ \ \textrm{ iff } \ \ (0\ 1){\alpha \choose \beta}
=0. \end{displaymath}

The LR statistic here is

\begin{displaymath}-2\log \lambda = 0.284 \end{displaymath}

which is not significant for the $\chi_{1}^2$ distribution. The $F$-test statistic

\begin{displaymath}F = 0.231 \end{displaymath}

is also not significant. Hence, we can assume independence of sales and prices (alone). Recall that this conclusion has to be revised if we consider the prices together with advertisement costs and hours of sales managers.

Recall the different conclusion that was made in Example 7.6 when we rejected $H_0: \alpha= 211$ and $\beta=0$. The rejection there came from the fact that the pair of values was rejected. Indeed, if $\beta=0$ the estimator of $\alpha$ would be $\bar{y}= 172.70$ and this is too far from $211$.

EXAMPLE 7.12   Let us now consider the multivariate regression in the ``classic blue'' pullovers example. From Example 3.15 we know that the estimated parameters in the model

\begin{displaymath}X_{1} = \alpha + \beta_{1}X_{2} + \beta_{2}X_{3} + \beta_{3}X_{4}
+ \varepsilon \end{displaymath}

are

\begin{displaymath}\hat{\alpha} = 65.670, \ \ \hat{\beta}_{1} = -0.216,\ \
\hat{\beta}_{2} = 0.485, \ \ \hat{\beta}_{3} = 0.844. \end{displaymath}

Hence, we could postulate the approximate relation:

\begin{displaymath}\beta_{1} \approx -\frac{1}{2} \beta_{2}, \end{displaymath}

which means in practice that augmenting the price by 20 EUR requires the advertisement costs to increase by 10 EUR in order to keep the number of pullovers sold constant. Vice versa, reducing the price by 20 EUR yields the same result as before if we reduced the advertisement costs by 10 EUR. Let us now test whether the hypothesis

\begin{displaymath}H_0:\ \ \beta_{1} = -\frac{1}{2} \beta_{2} \end{displaymath}

is valid. This is equivalent to

\begin{displaymath}\left( 0\ \ 1\ \ \frac{1}{2}\ \ 0 \right) \left( \begin{array}{c} \alpha\\
\beta_{1} \\ \beta_{2} \\ \beta_{3} \end{array} \right) =0. \end{displaymath}

The LR statistic in this case is equal to ( MVAlrtest.xpl )

\begin{displaymath}-2\log \lambda = 0.012, \end{displaymath}

the $F$ statistic is

\begin{displaymath}F = 0.007. \end{displaymath}

Hence, in both cases we will not reject the null hypothesis.

Comparison of Two Mean Vectors

In many situations, we want to compare two groups of individuals for whom a set of $p$ characteristics has been observed. We have two random samples $\{x_{i1}\}_{i=1}^{n_1}$ and $\{x_{j2}\}_{j=1}^{n_2}$ from two distinct $p$-variate normal populations. Several testing issues can be addressed in this framework. In Test Problem 8 we will first test the hypothesis of equal mean vectors in the two groups under the assumption of equality of the two covariance matrices. This task can be solved by adapting Test Problem 2.

In Test Problem 9 a procedure for testing the equality of the two covariance matrices is presented. If the covariance matrices differ, the procedure of Test Problem 8 is no longer valid. If the equality of the covariance matrices is rejected, an easy rule for comparing two means with no restrictions on the covariance matrices is provided in Test Problem 10.




TEST PROBLEM 8   Assume that $X_{i1} \sim N_{p}(\mu_{1},\Sigma)$, with $ i=1,\cdots,n_{1}$ and
$X_{j2} \sim N_p(\mu_{2},\Sigma)$, with $ j=1,\cdots,n_{2}$, where all the variables are independent.

\begin{displaymath}H_0:\mu_{1} =\mu_{2},\ \textrm{versus}\
H_1:\ \textrm{no constraints.}\end{displaymath}



Both samples provide the statistics $\bar{x}_{k}$ and $\data{S}_{k}$, $k=1,2$. Let $\delta=\mu_1-\mu_2$. We have

\begin{displaymath}
(\bar{x}_{1}-\bar{x}_{2})\sim N_{p}\left(\delta, \frac{n_1+n_2}{n_{1}n_{2}}\Sigma\right)
\end{displaymath} (7.11)


\begin{displaymath}
n_{1}S_{1}+n_{2}S_{2}\sim W_{p}(\Sigma, n_{1}+n_{2}-2).
\end{displaymath} (7.12)

Let $\data{S}$= $(n_{1}+n_{2})^{-1}(n_{1}S_{1}+n_{2}S_{2})$ be the weighted mean of $\data{S}_{1}$ and $\data{S}_{2}$. Since the two samples are independent and since $\data{S}_{k}$ is independent of $\bar{x}_{k}$ (for $k=1,2$) it follows that $\data{S}$ is independent of $(\bar{x}_{1}-\bar{x}_{2}).$ Hence, Theorem 5.8 applies and leads to a $T^2$-distribution:
\begin{displaymath}
\frac{n_{1}n_{2}(n_{1}+n_{2}-2)}{(n_{1}+n_{2})^2}
\left\{ ( \bar{x}_{1}-\bar{x}_{2} ) -\delta\right\}^{\top}
\data{S}^{-1}
\left\{ ( \bar{x}_{1}-\bar{x}_{2} ) -\delta\right\}
\sim T^2(p, n_{1}+n_{2}-2)
\end{displaymath} (7.13)

or

\begin{displaymath}\left\{\left(\bar{x}_{1}
-\bar{x}_{2}\right)-\delta\right\}^{\top}
\data{S}^{-1}
\left\{\left(\bar{x}_{1}-\bar{x}_{2}\right)-\delta\right\}
\sim \frac{p(n_{1}+n_{2})^2}{(n_{1}+n_{2}-p-1)n_{1}n_{2}} F_{p,n_{1}+n_{2}-p-1}.\end{displaymath}

This result, as in Test Problem 2, can be used to test $H_0$: $\delta$=0 or to construct a confidence region for $\delta \in\mathbb{R}^{p}$. The rejection region is given by:
\begin{displaymath}
\frac{n_{1}n_{2}(n_{1}+n_{2}-p-1)}{p(n_{1}+n_{2})^2}
\left(\bar{x}_{1}-\bar{x}_{2}\right)^{\top}
\data{S}^{-1}
\left(\bar{x}_{1}-\bar{x}_{2}\right) \ge F_{1-\alpha ;p,n_1+n_2-p-1}.
\end{displaymath} (7.14)

A $(1-\alpha)$ confidence region for $\delta$ is given by the ellipsoid centered at $(\bar{x}_1-\bar{x}_2)$

\begin{displaymath}
\left\{\delta-\left(\bar{x}_{1}
-\bar{x}_{2}\right)\right\}^{\top}
\data{S}^{-1}
\left\{\delta-\left(\bar{x}_{1}-\bar{x}_{2}\right)\right\}
\le \frac{p(n_{1}+n_{2})^2}{(n_{1}+n_{2}-p-1)(n_{1}n_{2})} F_{1-\alpha ;p,n_{1}+n_{2}-p-1},
\end{displaymath}

and the simultaneous confidence intervals for all linear combinations $a^{\top}\delta$ of the elements of $\delta$ are given by

\begin{displaymath}
a^{\top}\delta \in a^{\top}(\bar{x}_1-\bar{x}_2) \pm \sqrt{\frac{p(n_{1}+n_{2})^2}{(n_{1}+n_{2}-p-1)(n_{1}n_{2})} F_{1-\alpha ;p,n_{1}+n_{2}-p-1}\, a^{\top}\data{S}a}.
\end{displaymath}

In particular we have at the $(1-\alpha)$ level, for $j=1,\ldots , p$,
\begin{displaymath}
\delta_j \in(\bar{x}_{1j}-\bar{x}_{2j}) \pm \sqrt{\frac{p(n_{1}+n_{2})^2}{(n_{1}+n_{2}-p-1)(n_{1}n_{2})} F_{1-\alpha ;p,n_{1}+n_{2}-p-1}\, s_{jj}}.
\end{displaymath} (7.15)
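
A minimal Python sketch of this two-sample procedure (simulated data, not the original XploRe computations) evaluates the $F$ statistic (7.14) and the simultaneous intervals (7.15) from the pooled covariance matrix.

\begin{verbatim}
import numpy as np
from scipy import stats

def two_sample_mean_test(X1, X2, alpha=0.05):
    """Test H0: mu_1 = mu_2 with a common covariance matrix, as in (7.14)-(7.15)."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    d = X1.mean(axis=0) - X2.mean(axis=0)
    S1 = np.cov(X1, rowvar=False, bias=True)      # divisor n_k, as in the text
    S2 = np.cov(X2, rowvar=False, bias=True)
    S = (n1 * S1 + n2 * S2) / (n1 + n2)           # pooled (weighted) covariance
    k = n1 + n2 - p - 1
    f_stat = n1 * n2 * k / (p * (n1 + n2) ** 2) * d @ np.linalg.solve(S, d)
    crit = stats.f.ppf(1 - alpha, p, k)
    half = np.sqrt(p * (n1 + n2) ** 2 / (k * n1 * n2) * crit * np.diag(S))
    return f_stat, crit, np.column_stack([d - half, d + half])

# Hypothetical groups of sizes 15 and 10 with p = 2 characteristics.
rng = np.random.default_rng(4)
X1 = rng.multivariate_normal([4000, 2600], [[1.6e7, 1.2e7], [1.2e7, 1.4e7]], size=15)
X2 = rng.multivariate_normal([4300, 4900], [[1.2e7, 1.1e7], [1.1e7, 1.5e7]], size=10)
print(two_sample_mean_test(X1, X2))
\end{verbatim}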

EXAMPLE 7.13   Let us come back to the questions raised in Example 7.5. We compare the means of assets ($X_1$) and of sales ($X_2$) for two sectors, energy (group 1) and manufacturing (group 2). With $n_1=15$, $n_2=10$, and $p=2$ we obtain the statistics:

\begin{displaymath}
\bar{x}_1= \left( \begin{array}{c}
4084 \\ 2580.5 \end{array} \right),\qquad
\bar{x}_2=
\left( \begin{array}{c}
4307.2 \\ 4925.2 \end{array} \right)
\end{displaymath}

and

\begin{displaymath}
\data{S}_1= 10^7\left(\begin{array}{cc}
1.6635 & 1.2410\\
1.2410 & 1.3747
\end{array}\right),\qquad
\data{S}_2= 10^7\left(\begin{array}{cc}
1.2248 & 1.1425\\
1.1425 & 1.5112
\end{array}\right),
\end{displaymath}

so that

\begin{displaymath}
\data{S}=10^7\left(\begin{array}{cc}
1.4880 & 1.2016\\
1.2016 & 1.4293
\end{array}\right).
\end{displaymath}

The observed value of the test statistic (7.14) is $F=2.7036$. Since $F_{0.95;2,22}=3.4434$ the hypothesis of equal means of the two groups is not rejected, although it would be rejected at a less severe level ( $F>F_{0.90;2,22}=2.5613$). The 95% simultaneous confidence intervals for the differences ( MVAsimcidif.xpl ) are given by

\begin{displaymath}
\begin{array}{lcccc}
-4628.6 &\le & \mu_{1a}-\mu_{2a} &\le & 4182.2\\
-6662.4 &\le & \mu_{1s}-\mu_{2s} &\le & 1973.0.
\end{array}\end{displaymath}

EXAMPLE 7.14   In order to illustrate the presented test procedures it is interesting to analyze some simulated data. This simulation will point out the importance of the covariances in testing means. We created two independent normal samples in $\mathbb{R}^4$ of sizes $n_1=30$ and $n_2=20$ with:

\begin{displaymath}\mu_1=(8, 6, 10, 10)^{\top}\end{displaymath}


\begin{displaymath}\mu_2=(6, 6, 10, 13)^{\top}.\end{displaymath}

One may consider this as an example of ${X}=({X}_1,\ldots ,{X}_n)^{\top}$ being the students' scores from 4 tests, where the 2 groups of students were subjected to two different methods of teaching. First we simulate the two samples with $\Sigma={\data I}_4$ and obtain the statistics:

\begin{eqnarray*}
\bar{x}_1 &=&(7.607, 5.945, 10.213, 9.635)^{\top}\\
\bar{x}_2...
...9 & -0.130\\
0.306 & 0.021 & -0.130 &0.683
\end{array}\right).
\end{eqnarray*}



The test statistic (7.14) takes the value $F=60.65$ which is highly significant: the small variance allows the difference to be detected even with these relatively moderate sample sizes. We conclude (at the 95% level) that:

\begin{displaymath}\begin{array}{rcccr}
0.6213 &\le &\delta_1 &\le & 2.2691\\
-...
...e & 1.6830\\
-4.2614 &\le &\delta_4 &\le & -2.5494
\end{array}\end{displaymath}

which confirms that the means for ${X}_1$ and ${X}_4$ are different.

Consider now a different simulation scenario where the standard deviations are 4 times larger: $\Sigma=16 {\cal{I}}_4$. Here we obtain:

\begin{eqnarray*}
\bar{x}_1 &=&(7.312, 6.304, 10.840, 10.902)^{\top}\\
\bar{x}_...
... &-0.323\\
-6.507 & -2.551 & -0.323 &10.311
\end{array}\right).
\end{eqnarray*}



The test statistic now takes the value 1.54, which is no longer significant ( $F_{0.95;4,45} =2.58$). We can no longer reject the null hypothesis (which we know to be false!) since the increase in the variances prohibits the detection of differences of such magnitude.

The following situation illustrates once more the role of the covariances between covariates. Suppose that $\Sigma=16 {\cal{I}}_4$ as above but with $\sigma_{14}=\sigma_{41}=-3.999$ (this corresponds to a negative correlation $r_{41}=-0.9997$). We have:

\begin{eqnarray*}
\bar{x}_1 &=&(8.484, 5.908, 9.024, 10.459)^{\top}\\
\bar{x}_2...
...1 &-1.301\\
-1.601 & -2.954 & -1.301 &9.593
\end{array}\right).
\end{eqnarray*}



The value of $F$ is $3.853$, which is significant at the 5% level (p-value = 0.0089). So the null value $\delta=\mu_1-\mu_2=0$ lies outside the 95% confidence ellipsoid. However, the simultaneous confidence intervals, which do not take the covariances into account, are given by:

\begin{displaymath}\begin{array}{rcccr}
-0.1837 &\le &\delta_1 &\le & 7.2343\\
...
...e & 2.9438\\
-7.2336 &\le &\delta_4 &\le & 0.5450.
\end{array}\end{displaymath}

They contain the null value (see Remark 7.1 above) although they are very asymmetric for $\delta_1$ and $\delta_4$.

EXAMPLE 7.15   Let us compare the vectors of means of the forged and the genuine bank notes. The matrices $\data{S}_f$ and $\data{S}_g$ were given in Example 3.1 and since here $n_f=n_g=100$, $\data{S}$ is the simple average of $\data{S}_f$ and $\data{S}_g: \data{S}=\frac{1}{2}\left(\data{S}_f+\data{S}_g\right)$.

\begin{eqnarray*}
\bar{x}_g &=&(214.97, 129.94, 129.72, 8.305, 10.168, 141.52)^{\top}\\
\bar{x}_f &=&(214.82, 130.3, 130.19, 10.53, 11.133, 139.45)^{\top}.
\end{eqnarray*}



The test statistic is given by (7.14) and turns out to be $F=391.92$ which is highly significant for $F_{6,193}$. The 95% simultaneous confidence intervals for the differences $\delta_j=\mu_{gj}-\mu_{fj},\ j=1,\ldots ,p$ are:

\begin{displaymath}
\begin{array}{rcccr}
-0.0443 & \le &\delta_1& \le & 0.3363\\...
...0.6348\\
1.8072 & \le &\delta_6& \le & 2.3268.\\
\end{array}\end{displaymath}

All of the components (except for the first one) show significant differences in the means. The largest differences come from the lower border $(X_4)$ and the diagonal $(X_6)$.

The preceding test implicitly uses the fact that the two samples are extracted from two different populations with a common covariance matrix $\Sigma$. In this case, the test statistic (7.14) measures the distance between the two centers of gravity of the two groups w.r.t. the common metric given by the pooled variance matrix $\data{S}$. If $\Sigma_1\not= \Sigma_2$, no such natural metric exists. There is no satisfactory test procedure for the equality of covariance matrices which is robust against departures from the normality assumption on the populations. The following test extends Bartlett's test for the equality of variances in the univariate case, but it is known to be very sensitive to departures from normality.


TEST PROBLEM 9   (Comparison of Covariance Matrices)
Let $X_{ih} \sim N_{p}(\mu_{h},\Sigma_{h})$, $i=1,\dots,n_{h}$, $h=1,\dots,k$ be independent random variables,

\begin{displaymath}H_0:\Sigma_1 =\Sigma_2 =\cdots=\Sigma_k\ \textrm{versus}\
H_1:\ \ \textrm{no constraints.}\end{displaymath}



Each subsample provides $\data{S}_{h}$, an estimator of $\Sigma_h$, with

\begin{displaymath}n_{h}S_{h}\sim W_{p}(\Sigma_{h},n_{h}-1).\end{displaymath}

Under $H_0$, $\sum_{h=1}^{k}n_{h}\data{S}_{h}\sim W_{p}(\Sigma,n-k)$ (Section 5.2), where $\Sigma$ is the common covariance matrix and $n=\sum_{h=1}^{k}n_{h}$. Let $\data{S}= \frac{n_{1}\data{S}_{1}+\cdots+n_{k}\data{S}_{k}}{n}$ be the weighted average of the $\data{S}_{h}$ (this is in fact the MLE of $\Sigma$ when $H_0$ is true). The likelihood ratio test leads to the statistic
\begin{displaymath}
-2\log\lambda = n\log\mid S\mid-\sum_{h=1}^{k}n_{h}\log\mid S_{h}\mid
\end{displaymath} (7.16)

which under $H_0$ is approximately distributed as a $\chi_{m}^2$ where $m=\frac{1}{2}(k-1)p(p+1)$.
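
The statistic (7.16) is straightforward to compute; the Python sketch below (an illustration under the same assumptions, not the book's code) accepts a list of samples and returns the statistic, its degrees of freedom and the approximate $\chi^2$ p-value.

\begin{verbatim}
import numpy as np
from scipy import stats

def covariance_equality_test(samples):
    """Likelihood ratio test (7.16) of H0: Sigma_1 = ... = Sigma_k."""
    k = len(samples)
    p = samples[0].shape[1]
    n_h = np.array([X.shape[0] for X in samples])
    n = n_h.sum()
    S_h = [np.cov(X, rowvar=False, bias=True) for X in samples]   # divisor n_h
    S = sum(nh * Sh for nh, Sh in zip(n_h, S_h)) / n              # pooled MLE
    stat = n * np.log(np.linalg.det(S)) \
        - sum(nh * np.log(np.linalg.det(Sh)) for nh, Sh in zip(n_h, S_h))
    m = (k - 1) * p * (p + 1) // 2
    return stat, m, stats.chi2.sf(stat, m)

# Hypothetical usage with two simulated groups of sizes 15 and 10 in R^2.
rng = np.random.default_rng(5)
X1 = rng.multivariate_normal([0, 0], np.eye(2), size=15)
X2 = rng.multivariate_normal([1, 1], np.eye(2), size=10)
print(covariance_equality_test([X1, X2]))
\end{verbatim}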

EXAMPLE 7.16   Let's come back to Example 7.13, where the means of assets and sales were compared for companies from the energy and manufacturing sectors assuming that $\Sigma_1 =\Sigma_2$. The test of $\Sigma_1 =\Sigma_2$ leads to the value of the test statistic
\begin{displaymath}
-2\log\lambda=0.9076
\end{displaymath} (7.17)

which is not significant (the p-value for a $\chi^2_3$ is $0.82$). We cannot reject $H_0$ and the comparison of the means performed above is valid.

EXAMPLE 7.17   Let us compare the covariance matrices of the forged and the genuine bank notes (the matrices $S_f$ and $S_g$ are shown in Example 3.1). A first look seems to suggest that $\Sigma_1 \neq \Sigma_2$. The pooled variance $S$ is given by $\data{S}=\frac{1}{2}\left(\data{S}_f+\data{S}_g\right)$ since here $n_f=n_g$. The test statistic here is $-2\log\lambda =127.21$, which is highly significant for a $\chi^2$ with 21 degrees of freedom. As expected, we reject the hypothesis of equal covariance matrices, and as a result the procedure for comparing the two means in Example 7.15 is not valid.

What can we do with unequal covariance matrices? When both $n_1$ and $n_2$ are large, we have a simple solution:


TEST PROBLEM 10   (Comparison of two means, unequal covariance matrices, large samples)
Assume that ${X}_{i1} \sim N_{p}(\mu_{1},\Sigma_1)$, with $ i=1,\cdots,n_{1}$ and ${X}_{j2} \sim N_p(\mu_{2},\Sigma_2)$, with $ j=1,\cdots,n_{2}$ are independent random variables.

\begin{displaymath}H_0:\mu_{1} =\mu_{2}\ \textrm{versus}\
H_1:\ \textrm{no constraints.}\end{displaymath}



Letting $\delta=\mu_1-\mu_2$, we have

\begin{displaymath}(\bar{x}_1-\bar{x}_2)\sim N_p\left(\delta,\frac{\Sigma_1}{n_1}+\frac{\Sigma_2}{n_2}\right).\end{displaymath}

Therefore, by (5.4)

\begin{displaymath}(\bar{x}_1-\bar{x}_2)^{\top}\left(\frac{\Sigma_1}{n_1}+\frac{\Sigma_2}{n_2}\right)^{-1}
(\bar{x}_1-\bar{x}_2)\sim \chi^2_p.\end{displaymath}

Since $\data{S}_i$ is a consistent estimator of $\Sigma_i$ for $i=1,2$, we have
\begin{displaymath}
(\bar{x}_1-\bar{x}_2)^{\top}\left(\frac{\data{S}_1}{n_1}+\frac{\data{S}_2}{n_2}\right)^{-1}
(\bar{x}_1-\bar{x}_2) \stackrel{{\mathcal{L}}}{\to} \chi^2_p.
\end{displaymath} (7.18)

This can be used in place of (7.13) for testing $H_0$, defining a confidence region for $\delta$ or constructing simultaneous confidence intervals for $\delta_j, j=1,\ldots ,p$.

For instance, the rejection region at the level $\alpha$ will be

\begin{displaymath}
(\bar{x}_1-\bar{x}_2)^{\top}\left(\frac{\data{S}_1}{n_1}+\frac{\data{S}_2}{n_2}\right)^{-1}
(\bar{x}_1-\bar{x}_2)> \chi^2_{1-\alpha ;p}
\end{displaymath} (7.19)

and the $(1-\alpha)$ simultaneous confidence intervals for $\delta_j$, $j=1,\ldots , p$ are:
\begin{displaymath}
\delta_j \in (\bar{x}_{1j}-\bar{x}_{2j}) \pm \sqrt{\chi^2_{1-\alpha ;p}
\left(\frac{s^{(1)}_{jj}}{n_1}+\frac{s^{(2)}_{jj}}{n_2}\right)}
\end{displaymath} (7.20)

where $s^{(i)}_{jj}$ is the $(j,j)$ element of the matrix $\data{S}_i$. This may be compared to (7.15) where the pooled variance was used.
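
For completeness, here is a Python sketch of the large-sample procedure (7.19)-(7.20) under the same hedged assumptions as the earlier snippets (simulated data, numpy and scipy assumed).

\begin{verbatim}
import numpy as np
from scipy import stats

def two_sample_large_n_test(X1, X2, alpha=0.05):
    """Large-sample test (7.19) of H0: mu_1 = mu_2 with unequal covariances,
    together with the simultaneous intervals (7.20) for the delta_j."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    d = X1.mean(axis=0) - X2.mean(axis=0)
    V = (np.cov(X1, rowvar=False, bias=True) / n1
         + np.cov(X2, rowvar=False, bias=True) / n2)
    stat = d @ np.linalg.solve(V, d)
    crit = stats.chi2.ppf(1 - alpha, p)
    half = np.sqrt(crit * np.diag(V))
    return stat, crit, np.column_stack([d - half, d + half])

# Hypothetical usage: two large groups with different covariance structures.
rng = np.random.default_rng(6)
X1 = rng.multivariate_normal(np.zeros(3), np.diag([1.0, 2.0, 3.0]), size=100)
X2 = rng.multivariate_normal(np.zeros(3), np.diag([3.0, 2.0, 1.0]), size=100)
print(two_sample_large_n_test(X1, X2))
\end{verbatim}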

REMARK 7.2   We see, by comparing the statistics (7.19) with (7.14), that we measure here the distance between $\bar{x}_1$ and $\bar{x}_2$ using the metric $\left(\frac{\data{S}_1}{n_1}+
\frac{\data{S}_2}{n_2}\right)$. It should be noticed that when $n_1 = n_2$, the two methods are essentially the same since then $\data{S}=\frac{1}{2}\left(\data{S}_1+\data{S}_2\right)$. If the covariances are different but have the same eigenvectors (different eigenvalues), one can apply the common principal component (CPC) technique, see Chapter 9.

EXAMPLE 7.18   Let us use the last test to compare the forged and the genuine bank notes again ($n_1$ and $n_2$ are both large). The test statistic (7.19) turns out to be 2436.8 which is again highly significant. The 95% simultaneous confidence intervals are:

\begin{displaymath}\begin{array}{rcr}
-0.0389 & \le \delta_1 \le & 0.3309\\
-0....
...le & -0.6442\\
1.8146 & \le \delta_6 \le & 2.3194
\end{array}\end{displaymath}

showing that all the components except the first are different from zero, the larger differences coming from $X_6$ (length of the diagonal) and $X_4$ (lower border). The results are very similar to those obtained in Example 7.15. This is due to the fact that here $n_1 = n_2$, as we already mentioned in the remark above.


Profile Analysis

Another useful application of Test Problem 6 is the repeated measurements problem applied to two independent groups. This problem arises in practice when we observe repeated measurements of characteristics (or measures of the same type under different experimental conditions) on the different groups which have to be compared. It is important that the $p$ measures (the ``profile'') are comparable and in particular are reported in the same units. For instance, they may be measures of blood pressure at $p$ different points in time, one group being the control group and the other the group receiving a new treatment. The observations may also be the scores obtained from $p$ different tests of two different experimental groups. One is then interested in comparing the profiles of each group: the profile being simply the vector of the means of the $p$ responses (the comparison may be visualized in a two-dimensional graph using the parallel coordinate plot introduced in Section 1.7).

We are thus in the same statistical situation as for the comparison of two means:

\begin{displaymath}{X}_{i1}\sim N_p\left(\mu_1,\Sigma\right) \quad i=1,\ldots, n_1\end{displaymath}


\begin{displaymath}{X}_{i2}\sim N_p\left(\mu_2,\Sigma\right) \quad i=1,\ldots, n_2\end{displaymath}

where all variables are independent. Suppose the two population profiles look like Figure 7.1.

Figure 7.1: Example of population profiles ( MVAprofil.xpl )
\includegraphics[width=1\defpicwidth]{profilneu.ps}

The following questions are of interest:

  1. Are the profiles similar in the sense of being parallel (which means no interaction between the treatments and the groups)?
  2. If the profiles are parallel, are they at the same level?
  3. If the profiles are parallel, is there any treatment effect, i.e., are the profiles horizontal?
The above questions are easily translated into linear constraints on the means and a test statistic can be obtained accordingly.


Parallel Profiles

Let ${\cal{C}}$ be a $(p-1)\times p$ matrix defined as ${\cal{C}}=\left(\begin{array}{crrcr}
1 &-1 & 0 & \cdots & 0\\
0 & 1 &-1 & \cdots & 0\\
0 &\cdots & 0 &1 &-1
\end{array}\right).$
The hypothesis to be tested is

\begin{displaymath}H_0^{(1)}: \, {\cal{C}}(\mu_1-\mu_2)=0.\end{displaymath}

From (7.11), (7.12) and Corollary 5.4 we know that under $H_0$:
\begin{displaymath}
\frac{n_1n_2}{(n_1+n_2)^2}(n_1+n_2-2)\left\{{\cal{C}}(\bar{x}_1-\bar{x}_2)\right\}^{\top}
({\cal{C}}\data{S}{\cal{C}}^{\top})^{-1}{\cal{C}}
(\bar{x}_1-\bar{x}_2) \sim T^2(p-1,n_1+n_2-2)
\end{displaymath} (7.21)

where $\data{S}$ is the pooled covariance matrix. The hypothesis is rejected if

\begin{displaymath}
\frac{n_1n_2(n_1+n_2-p)}{(n_1+n_2)^2(p-1)}\left\{{\cal{C}}(\bar{x}_1-\bar{x}_2)\right\}^{\top}
\left({\cal{C}}\data{S}{\cal{C}}^{\top}\right)^{-1}{\cal{C}}(\bar{x}_1-\bar{x}_2)
>F_{1-\alpha ;p-1,n_1+n_2-p}.
\end{displaymath}

Equality of Two Levels

The question of equality of the two levels is meaningful only if the two profiles are parallel. In the case of interactions (rejection of $H_0^{(1)}$), the two populations react differently to the treatments and the question of the level has no meaning.
The equality of the two levels can be formalized as

\begin{displaymath}
H_0^{(2)}: 1_p^{\top}(\mu_1-\mu_2) = 0
\end{displaymath}

since

\begin{displaymath}1_p^{\top}(\bar{x}_1-\bar{x}_2) \sim N_1\left(1_p^{\top}(\mu_1-\mu_2),\frac{n_1+n_2}{n_1n_2}
1_p^{\top}\Sigma 1_p\right)\end{displaymath}

and

\begin{displaymath}
(n_1+n_2)1_p^{\top}\data{S}1_p \sim W_1(1_p^{\top}\Sigma 1_p,n_1+n_2-2).
\end{displaymath}

Using Corollary 5.4 we have that:
\begin{displaymath}
\frac{n_1n_2}{(n_1+n_2)^2}(n_1+n_2-2)
\frac{\left\{1_p^{\top}(\bar{x}_1-\bar{x}_2)\right\}^2}
{1_p^{\top}\data{S}1_p} \sim T^2(1,n_1+n_2-2) = F_{1,n_1+n_2-2}.
\end{displaymath} (7.22)

The rejection region is

\begin{displaymath}\frac{n_1n_2(n_1+n_2-2)}{(n_1+n_2)^2}\frac{\left\{1_p^{\top}(\bar{x}_1-\bar{x}_2)\right\}^2}
{1_p^{\top}\data{S}1_p} > F_{1-\alpha ; 1,n_1+n_2-2}.
\end{displaymath}

Treatment Effect

If it is rejected that the profiles are parallel, then two separate analyses should be carried out on the two groups using the repeated measurements approach. But if parallelism is accepted, then we can exploit the information contained in both groups (possibly at different levels) to test for a treatment effect, i.e., to test whether the two profiles are horizontal. This may be written as:

\begin{displaymath}H_0^{(3)}: {\cal{C}}(\mu_1+\mu_2)=0.\end{displaymath}

Consider the average profile $\bar{x}$:

\begin{displaymath}\bar{x}=\frac{n_1\bar{x}_1+n_2\bar{x}_2}{n_1+n_2}.\end{displaymath}

Clearly,

\begin{displaymath}\bar{x}\sim N_p\left(\frac{n_1\mu_1+n_2\mu_2}{n_1+n_2}, \frac{1}{n_1+n_2}\Sigma\right).\end{displaymath}

Now it is not hard to prove that $H_0^{(3)}$ with $H_0^{(1)}$ implies that

\begin{displaymath}{\cal{C}}\left(\frac{n_1\mu_1+n_2\mu_2}{n_1+n_2}\right)=0.\end{displaymath}

So under parallel, horizontal profiles we have

\begin{displaymath}\sqrt{n_1+n_2}{\cal{C}}\bar{x}\sim N_p(0, {\cal{C}}\Sigma {\cal{C}}^{\top}).\end{displaymath}

From Corollary 5.4 we again obtain
\begin{displaymath}
(n_1+n_2-2)({\cal{C}}\bar{x})^{\top}({\cal{C}}\data{S}{\cal{C}}^{\top})^{-1}{\cal{C}}\bar{x}\sim T^2(p-1, n_1+n_2-2).
\end{displaymath} (7.23)

This leads to the rejection region of $H_0^{(3)}$, namely

\begin{displaymath}\frac{n_1+n_2-p}{p-1}({\cal{C}}\bar{x})^{\top}({\cal{C}}\data...
...l{C}}^{\top})^{-1}{\cal{C}}\bar{x}>F_{1-\alpha; p-1,n_1+n_2-p}.\end{displaymath}
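
The three profile hypotheses can be checked with a single routine. The Python sketch below (simulated data loosely resembling the setting of the next example, not Morrison's actual observations) computes the three $F$ statistics and their p-values.

\begin{verbatim}
import numpy as np
from scipy import stats

def profile_analysis(X1, X2):
    """F tests for (1) parallel, (2) equal-level and (3) horizontal profiles."""
    n1, p = X1.shape
    n2 = X2.shape[0]
    C = np.eye(p - 1, p) - np.eye(p - 1, p, k=1)   # successive differences
    one = np.ones(p)
    S = (n1 * np.cov(X1, rowvar=False, bias=True)
         + n2 * np.cov(X2, rowvar=False, bias=True)) / (n1 + n2)   # pooled
    d = X1.mean(axis=0) - X2.mean(axis=0)
    xbar = (n1 * X1.mean(axis=0) + n2 * X2.mean(axis=0)) / (n1 + n2)

    Cd = C @ d                                     # (1) H0: C(mu_1 - mu_2) = 0
    f1 = (n1 * n2 * (n1 + n2 - p) / ((n1 + n2) ** 2 * (p - 1))
          * Cd @ np.linalg.solve(C @ S @ C.T, Cd))
    f2 = (n1 * n2 * (n1 + n2 - 2) / (n1 + n2) ** 2  # (2) H0: 1'(mu_1 - mu_2) = 0
          * (one @ d) ** 2 / (one @ S @ one))
    Cx = C @ xbar                                  # (3) H0: C(mu_1 + mu_2) = 0
    f3 = ((n1 + n2 - p) / (p - 1)
          * Cx @ np.linalg.solve(C @ S @ C.T, Cx))
    return ((f1, stats.f.sf(f1, p - 1, n1 + n2 - p)),
            (f2, stats.f.sf(f2, 1, n1 + n2 - 2)),
            (f3, stats.f.sf(f3, p - 1, n1 + n2 - p)))

# Hypothetical usage: two groups of sizes 37 and 12 with p = 4 scores each.
rng = np.random.default_rng(7)
X1 = rng.multivariate_normal([12.6, 9.6, 11.5, 8.0], 10 * np.eye(4) + 2, size=37)
X2 = rng.multivariate_normal([8.8, 5.3, 8.5, 4.8], 10 * np.eye(4) + 2, size=12)
print(profile_analysis(X1, X2))
\end{verbatim}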

EXAMPLE 7.19   Morrison (1990) proposed a test in which the results of 4 sub-tests of the Wechsler Adult Intelligence Scale (WAIS) are compared for 2 categories of people: group 1 contains $n_1=37$ people who do not have a senile factor and group 2 contains $n_2=12$ people who have a senile factor. The four WAIS sub-tests are $X_1$ (information), $X_2$ (similarities), $X_3$ (arithmetic) and $X_4$ (picture completion). The relevant statistics are

\begin{eqnarray*}
\bar{x}_1 &=& (12.57, 9.57, 11.49, 7.97)^{\top}\\
\bar{x}_2 &...
...83 &4.875\\
7.021 & 8.167 & 4.875 & 11.688
\end{array}\right).
\end{eqnarray*}



The test statistic for testing if the two profiles are parallel is $F=0.4634$, which is not significant ($p$-value $=0.71$). Thus it is accepted that the two profiles are parallel. The second test statistic (testing the equality of the levels of the 2 profiles) is $F =17.21$, which is highly significant ($p$-value $\approx 10^{-4}$): the global level of the non-senile group is significantly higher than that of the senile group. The final test (testing the horizontality of the average profile) has the test statistic $F= 53.32$, which is also highly significant ($p$-value $ \approx 10^{-14}$). This implies that there are substantial differences among the means of the different subtests.

Summary
$\ast$
Hypotheses about $\mu$ can often be written as $\data{A}\mu =a$, with matrix $\data{A}$, and vector $a$.
$\ast$
The hypothesis $H_{0} : \data{A}\mu=a$ for $X\sim N_{p}(\mu,\Sigma)$ with $\Sigma$ known leads to $-2\log \lambda =n(\data{A}\overline x-a)^{\top}(\data{A}\Sigma
\data{A}^{\top})^{-1}(\data{A}\overline x-a)
\sim\chi^2_{q}$, where $q$ is the number of elements in $a$.
$\ast$
The hypothesis $H_{0} : \data{A}\mu=a$ for $X\sim N_{p}(\mu,\Sigma)$ with $\Sigma$ unknown leads to $-2\log \lambda =n\log\{1+(\data{A}\overline x-a)^{\top}(\data{A}\data{S}\data{A}^{\top})^{-1}(\data{A}\overline x-a)\}\stackrel{\cal L}{\longrightarrow}\chi^2_{q}$, where $q$ is the number of elements in $a$, and we have the exact test $(n-1)(\data{A}\bar{x}-a)^{\top}(\data{A}\data{S}\data{A}^{\top})^{-1}(\data{A}\bar{x}-a)\sim T^2(q,n-1).$
$\ast$
The hypothesis $H_{0} : \data{A}\beta=a$ for $Y_{i} \sim
N_{1}(\beta^{\top}x_{i},\sigma^2)$ with $\sigma^2$ unknown leads to $-2\log \lambda = n \log\left( \frac{\vert\vert y - \data{X}\tilde{\beta}\vert\vert^2}{\vert\vert y - \data{X}\hat{\beta}\vert\vert^2}\right) \stackrel{\cal L}{\longrightarrow}\chi^2_{q}$, with $q$ being the length of $a$ and with

\begin{displaymath}\frac{n-p}{q}\frac{\left(\data{A}\hat{\beta}-a\right)^{\top}\left\{\data{A}(\data{X}^{\top}\data{X})^{-1}\data{A}^{\top}\right\}^{-1}\left(\data{A}\hat{\beta}-a\right)}{\left(y-\data{X}\hat\beta\right)^{\top}\left(y-\data{X}\hat\beta\right)}\sim F_{q,n-p}.\end{displaymath}