2.2 Classical Assumptions of the MLRM

In order to carry out the estimation and inference in the MLRM, the specification of this model includes a set of assumptions referring to the way of generating the data, that is to say, referring to the underlying Data Generating Process (DGP). These assumptions can be grouped into two categories:

2.2.1 The Systematic Component Assumptions

Assumption 1: Strict exogeneity of the regressors.

In Economics, it is very difficult to obtain experimental data (data generated by a controlled experiment), so it seems reasonable to assume that the variables included in the model are random variables. Following Engle, Hendry, and Richard (1983), we say that the regressors are $\textsl{strictly exogenous}$ if $x_{j}$ is independent of $u$, $\forall j$. This means that, given the DGP, for each observed sample (realization) of the variables included in $X$, there are infinitely many possible realizations of $u$ and $y$; this fact leads us to work with the distribution of $u\vert X$. This assumption allows us to factorize the joint distribution function of $X$ and $u$ as:

$\displaystyle F(u,X)=F(u)\cdot F(X)$ (2.4)

or alternatively:

$\displaystyle F(u\vert X)=F(u)$ (2.5)
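The step from (2.4) to (2.5) is just the definition of a conditional distribution: dividing the factorized joint distribution by $F(X)$,

$\displaystyle F(u\vert X)=\frac{F(u,X)}{F(X)}=\frac{F(u)\cdot F(X)}{F(X)}=F(u)$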

Nevertheless, in this chapter we adopt a more restrictive assumption, considering the variables in $X$ as non-stochastic; that is to say, the elements of $X$ are fixed in repeated samples. Obviously, this hypothesis preserves result (2.5). Thus, we can conclude that the randomness of $y$ is due only to the disturbance term.

Additionally, the $X$ matrix satisfies the following condition:

$\displaystyle \lim_{n\rightarrow\infty}\frac{X^{\top }X}{n}=Q$ (2.6)

where $Q$ is a nonsingular, positive definite matrix with finite elements.

Assumption 2: Matrix $X$ has full rank.

Analytically, this assumption is written:

$\displaystyle r(X)=k$ (2.7)

and it means that the columns of $X$ are linearly independent or, in other words, that no exact linear relations exist among the $X$ variables. This assumption is usually denoted $\textsl{non-perfect multicollinearity}$. A direct consequence of (2.7) is that $n \geq k$.

Assumption 3: Stability of the parameter vector $\beta$.

This assumption means that the coefficients of the model do not vary across sample observations; that is to say, we assume that the same model holds for the whole sample.
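Conditions (2.6) and (2.7) can be checked numerically. The following sketch (using numpy, with hypothetical data not taken from the text) shows that a regressor matrix containing an exact linear relation among its columns violates $r(X)=k$, while a full-rank $X$ yields a nonsingular moment matrix $X^{\top}X/n$:

```python
import numpy as np

# Hypothetical regressor matrices: n = 50 observations, k = 3 columns.
n = 50
rng = np.random.default_rng(1)
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

X_ok = np.column_stack([np.ones(n), x1, x2])       # linearly independent columns
X_bad = np.column_stack([np.ones(n), x1, 2 * x1])  # third column = 2 * second: exact linear relation

print(np.linalg.matrix_rank(X_ok))   # 3: full rank, as assumption (2.7) requires
print(np.linalg.matrix_rank(X_bad))  # 2: perfect multicollinearity, r(X) < k

# Under (2.7), the moment matrix X'X / n of condition (2.6) is nonsingular:
print(np.linalg.det(X_ok.T @ X_ok / n))  # nonzero determinant
```

With perfect multicollinearity, $X^{\top}X$ is singular and the Least Squares estimator of the next sections cannot be computed, which is why (2.7) is imposed from the start.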

2.2.2 The Random Component Assumptions

Assumption 4: Zero mean of the disturbance.

Analytically, we write this as:

\begin{displaymath}\begin{array}{cc} \textrm{E}(u_{i}\vert X)=\textrm{E}(u_{i})=0 & \forall i \end{array}\end{displaymath} (2.8)

or in matrix notation:

$\displaystyle \textrm{E}(u)=0_{n}$ (2.9)

Since $u$ is usually considered as the sum of many individual factors whose signs are unknown, we assume that, on average, these effects are null.

Assumption 5: Spherical errors.

This assumption states that the disturbances have constant variance and are uncorrelated:

\begin{displaymath}\begin{array}{cc} var(u_{i})=\textrm{E}(u_{i}^{2})=\sigma^{2} & \forall i \end{array}\end{displaymath} (2.10)

\begin{displaymath}\begin{array}{cc} cov(u_{i},u_{i'})=\textrm{E}(u_{i}u_{i'})=0 & \forall i\neq i' \end{array}\end{displaymath} (2.11)

Condition (2.10) is known as $\textsl{homoskedasticity}$, and it states that all $u_{i}$ have the same dispersion around their mean, whatever the values of the regressors. Condition (2.11), concerning the covariances of the disturbances, is called $\textsl{non-autocorrelation}$, and it means that knowing the $i^{th}$ disturbance tells us nothing about the $i'^{th}$ disturbance, for $i\neq i'$. Both hypotheses can be summarized in matrix form through the variance-covariance matrix $V(u)$:

$\displaystyle V(u)=\textrm{E}[(u-\textrm{E}u)(u-\textrm{E} u)^{\top }]=\textrm{E}(uu^{\top })=\sigma^{2}I_{n}$ (2.12)

Assumption 6: The disturbance vector is normally distributed.

This hypothesis, together with (2.9) and (2.12) allows us to summarize the assumptions of the disturbance term as follows:

$\displaystyle u \sim N[0_{n},\sigma^{2}I_{n}]$ (2.13)

From (2.13) we derive that all the components of $u$ are independent: under joint normality, the zero covariances in (2.12) imply independence.
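Assumptions 4 to 6 can be illustrated with a minimal Monte Carlo sketch (not part of the original text; numpy, with arbitrary values of $n$ and $\sigma^{2}$): averaging many simulated realizations of $u$ recovers the zero mean of (2.9) and the scalar covariance matrix $\sigma^{2}I_{n}$ of (2.12).

```python
import numpy as np

# Monte Carlo sketch with assumed values n = 4, sigma^2 = 2.5.
# Each row of U is one realization of the disturbance vector u ~ N(0, sigma^2 I_n).
rng = np.random.default_rng(2)
n, sigma2, reps = 4, 2.5, 200_000

U = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))

mean_hat = U.mean(axis=0)   # estimate of E(u): close to the zero vector, as in (2.9)
V_hat = U.T @ U / reps      # estimate of E(u u'): close to sigma^2 * I_n, as in (2.12)

print(np.round(mean_hat, 2))
print(np.round(V_hat, 2))   # approx 2.5 on the diagonal, approx 0 off the diagonal
```

The near-zero off-diagonal entries of the estimated matrix correspond to the non-autocorrelation condition (2.11), and the common diagonal value to homoskedasticity (2.10).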

We can find some textbooks (Baltagi (1999); Davidson (2000); Hayashi (2000); Intriligator, Bodkin, and Hsiao (1996); Judge, Carter, Griffiths, Lutkepohl and Lee (1988)) which do not initially include this last assumption in their set of classical hypotheses, introducing it only later. This can be justified because it is possible to obtain the Least Squares estimates of the parameters of the model, and to derive some of their properties, without using the normality assumption.

From (2.13), the joint density function of the $ n$ disturbances is given by:

$\displaystyle f(u)=f(u_{1},u_{2},\ldots,u_{n})=\frac{1}{(2\pi\sigma^{2})^{n/2}}\exp\left\{-\frac{1}{2\sigma^{2}}u^{\top }u\right\}$ (2.14)
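As a quick numerical check (a sketch with an arbitrary $\sigma^{2}$ and disturbance vector, not taken from the text), formula (2.14) evaluated directly coincides with the product of the $n$ univariate $N(0,\sigma^{2})$ densities, which is exactly the independence of the $u_{i}$ noted above:

```python
import numpy as np

# Evaluate the joint density (2.14) directly and as a product of univariate
# N(0, sigma^2) densities; equality reflects the independence of the u_i.
sigma2 = 1.5                      # assumed value, for illustration only
u = np.array([0.3, -1.2, 0.8])    # an arbitrary disturbance vector, n = 3
n = u.size

f_joint = (2 * np.pi * sigma2) ** (-n / 2) * np.exp(-(u @ u) / (2 * sigma2))
f_product = np.prod((2 * np.pi * sigma2) ** (-0.5) * np.exp(-u**2 / (2 * sigma2)))

print(f_joint, f_product)  # the two numbers coincide
```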

The set of classical assumptions described above allows us to obtain the probability distribution of the endogenous variable ($ y$) as a multivariate normal distribution, with the following moments:

$\displaystyle \textrm{E}(y)=\textrm{E}(X\beta+u)=X\beta+\textrm{E}(u)=X\beta$ (2.15)

$\displaystyle V(y)=\textrm{E}[(y-\textrm{E}y)(y-\textrm{E} y)^{\top }]=\textrm{E}(uu^{\top })=\sigma^{2}I_{n}$ (2.16)

These results imply that the expectation of every component of $y$ depends on the corresponding row of $X$, while all components have the same variance and are independent.
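The moments (2.15) and (2.16) can be illustrated with a small simulation (numpy, with assumed values for $\beta$, $\sigma^{2}$ and a fixed $X$, none of them from the text): holding $X$ fixed across repeated samples and redrawing only $u$, the sample average of $y$ approaches $X\beta$.

```python
import numpy as np

# Monte Carlo sketch with assumed values: beta = (1, 2)', sigma^2 = 0.5,
# and X fixed for all repeated samples, as the classical assumptions require.
rng = np.random.default_rng(3)
n, reps = 5, 100_000
sigma2 = 0.5
beta = np.array([1.0, 2.0])

X = np.column_stack([np.ones(n), np.arange(1, n + 1)])  # fixed regressor matrix
U = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))    # each row is one draw of u
Y = X @ beta + U                                        # each row is one sample of y

print(np.round(Y.mean(axis=0), 2))  # approx X @ beta = [3, 5, 7, 9, 11], as in (2.15)
```

Because $X$ is fixed, all the sampling variability of $y$ comes from $u$, which is the sense in which the randomness of $y$ is due only to the disturbance term.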

Obviously, we cannot be sure that the set of classical assumptions described above always holds. In practice, there are many situations in which the theoretical model that establishes the relationship among the set of variables does not satisfy the classical hypotheses mentioned above. A later section of this chapter and the following chapters of this book study the consequences when some of the "ideal conditions" fail, and describe how to proceed in these cases. Specifically, at the end of this chapter we deal with the non-stability of the coefficient vector $\beta$.