In econometrics, the relationships between economic variables proposed by economic theory are usually studied within the framework of linear regression models (see chapters 1 and 2). The data for many economic and business variables are collected in the form of time series. In this section we deal with the problems that may appear when estimating regression models with time series data.
It can be shown that many of the results on the properties of LS estimators and inference rely on the assumption that the explanatory variables are stationary. For instance, the standard proof of consistency of the LS estimator depends on the assumption plim $(X'X/T) = Q$, where $X$ is the data matrix and $Q$ is a fixed matrix. This assumption implies that the sample moments converge to the population values as the sample size $T$ increases. But the explanatory variables must be stationary for the matrix $Q$ to have fixed values.
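The role of stationarity in this assumption can be illustrated with a small simulation. The sketch below is not from the text: it uses a single regressor, so that $X'X/T$ reduces to the sample second moment, and the function names, the AR(1) design and the number of replications are all illustrative assumptions.

```python
import random


def simulate_ar1(T, phi, rng):
    """Simulate x_t = phi * x_{t-1} + e_t, e_t ~ N(0, 1); phi = 1 is a random walk."""
    x, path = 0.0, []
    for _ in range(T):
        x = phi * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def second_moment(series):
    """Sample second moment (1/T) * sum of x_t^2, the scalar analogue of X'X/T."""
    return sum(x * x for x in series) / len(series)


def average_second_moment(T, phi, reps=200, seed=0):
    """Average the sample second moment over `reps` simulated series of length T."""
    rng = random.Random(seed)
    total = sum(second_moment(simulate_ar1(T, phi, rng)) for _ in range(reps))
    return total / reps


if __name__ == "__main__":
    for T in (100, 400, 1600):
        stationary = average_second_moment(T, phi=0.5)
        unit_root = average_second_moment(T, phi=1.0)
        print(f"T={T:5d}  stationary: {stationary:6.2f}  random walk: {unit_root:8.2f}")
```

In the stationary case the average should settle near the population value $1/(1-0.5^2) = 4/3$, whereas for the random walk the expected value is $(T+1)/2$, which grows with the sample size, so no fixed limit exists.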
As discussed in section 4.3.2, many macroeconomic, financial and monetary variables are nonstationary, showing trending behaviour in most cases. From an econometric point of view, the presence of a deterministic trend (linear or not) in the explanatory variables does not raise any problem. But many economic and business time series remain nonstationary even after deterministic trends are removed, because they contain unit roots, that is, they are generated by integrated processes. When nonstationary time series are used in a regression model, one may obtain apparently significant relationships between unrelated variables. This phenomenon is called spurious regression.
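Granger and Newbold (1974) estimated regression models of the type:
$$y_t = \beta_0 + \beta_1 x_t + u_t \qquad (4.54)$$
where $y_t$ and $x_t$ are independent random walks:
$$\Delta y_t = \varepsilon_{1t}, \qquad \Delta x_t = \varepsilon_{2t}$$
Although the two series are unrelated, they found that the null hypothesis $\beta_1 = 0$ was rejected far more often than the nominal significance level implies, typically together with a low Durbin-Watson (D-W) statistic.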
These results found by Granger and Newbold (1974) were explained analytically by Phillips (1986). He shows that the t-ratios in model (4.54) do not follow a Student's t distribution and that they diverge as the sample size $T$ increases. This implies that, for any critical value, the rejection rate of the null hypothesis $\beta_1 = 0$ increases with $T$. Phillips (1986) also showed that the D-W statistic converges to zero as $T$ goes to infinity, while it converges to a value different from zero when the variables are related. The value of the D-W statistic may therefore help us to distinguish between genuine and spurious regressions. In summary, spurious regression results are due to the nonstationarity of the variables, and the problem is not solved by increasing the sample size $T$; it even gets worse.
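The behaviour of the t-ratio and the D-W statistic described above can be checked with a small Monte Carlo experiment. The following is a minimal sketch under stated assumptions: two independent Gaussian random walks, a regression with a constant, and the conventional critical value 1.96. The function names, seeds and number of replications are illustrative and not taken from the cited papers.

```python
import math
import random


def random_walk(T, rng):
    """Cumulated N(0, 1) innovations starting from zero."""
    level, path = 0.0, []
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def ols_simple(y, x):
    """OLS of y on a constant and x; returns (b0, b1, t-ratio of b1, D-W statistic)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    rss = sum(e * e for e in resid)
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss
    return b0, b1, b1 / se_b1, dw


def spurious_experiment(T, reps=300, seed=1):
    """Share of |t| > 1.96 and mean D-W when regressing unrelated random walks."""
    rng = random.Random(seed)
    rejections, dw_total = 0, 0.0
    for _ in range(reps):
        y = random_walk(T, rng)
        x = random_walk(T, rng)
        _, _, t_ratio, dw = ols_simple(y, x)
        if abs(t_ratio) > 1.96:
            rejections += 1
        dw_total += dw
    return rejections / reps, dw_total / reps


if __name__ == "__main__":
    for T in (50, 200, 800):
        rate, mean_dw = spurious_experiment(T)
        print(f"T={T:4d}  rejection rate: {rate:.2f}  mean D-W: {mean_dw:.3f}")
```

With a nominal 5% test one would expect about 5% rejections if standard inference were valid; in this design the rejection frequency should be far higher, and the mean D-W statistic should shrink as $T$ grows.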
Because of the problems raised by regressing nonstationary variables, econometricians have looked for solutions. One classical approach has been to detrend the series by fitting a deterministic trend, or to include a deterministic function of time directly in the regression model (4.54), in order to account for the nonstationary behaviour of the series. However, Phillips (1986) shows that this does not solve the problem if the series are integrated. The t-ratios in the regression model with a deterministic trend do not follow a Student's t distribution, and therefore standard inference results could be misleading. Furthermore, spurious correlation still appears between detrended random walks, that is, spurious regression persists. A second approach to working with nonstationary series is to look for relationships between the stationary differenced series. However, it has to be taken into account that the information about the long-run relationship is lost, and that the economic relationship may be different between levels and between increments.
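The two approaches can be compared in a simulation. The sketch below is an illustration under assumptions, not a reproduction of any published experiment: it generates pairs of independent random walks, applies either linear detrending or first differencing to both series, and records how often the slope t-ratio exceeds 1.96 in absolute value. All names and settings are hypothetical.

```python
import math
import random


def random_walk(T, rng):
    """Cumulated N(0, 1) innovations starting from zero."""
    level, path = 0.0, []
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def slope_t_ratio(y, x):
    """t-ratio of the slope in an OLS regression of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - my - b1 * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    return b1 / math.sqrt(rss / (n - 2) / sxx)


def detrend(series):
    """Residuals from an OLS regression on a constant and a linear time trend."""
    n = len(series)
    mean_t, mean_s = (n - 1) / 2, sum(series) / n
    stt = sum((t - mean_t) ** 2 for t in range(n))
    slope = sum((t - mean_t) * (s - mean_s) for t, s in enumerate(series)) / stt
    return [s - mean_s - slope * (t - mean_t) for t, s in enumerate(series)]


def difference(series):
    """First differences of the series."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def rejection_rate(transform, T=100, reps=300, seed=2):
    """Share of |t| > 1.96 after applying `transform` to two unrelated random walks.

    For detrended data the t-ratio ignores the one degree of freedom used by
    the trend, which is negligible at this sample size.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y = transform(random_walk(T, rng))
        x = transform(random_walk(T, rng))
        if abs(slope_t_ratio(y, x)) > 1.96:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("detrended levels :", rejection_rate(detrend))
    print("first differences:", rejection_rate(difference))
```

In this design the detrended regressions should still reject much more often than 5%, while the regressions in first differences should reject at roughly the nominal rate, at the cost of discarding the information in the levels.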
When estimating regression models with time series data, it is necessary to know whether the variables are stationary (either around a level or around a deterministic linear trend) in order to avoid spurious regression problems. This analysis can be performed using the unit root and stationarity tests presented in section 4.3.3.
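As a rough illustration of such a check, the sketch below computes a basic Dickey-Fuller statistic, that is, the t-ratio of $\rho$ in $\Delta y_t = c + \rho\, y_{t-1} + e_t$, and compares it with the asymptotic 5% critical value of $-2.86$ for the case with a constant. This is a simplified assumption: the tests in section 4.3.3 may include augmentation lags or other deterministic terms, and the function names here are illustrative.

```python
import math
import random

DF_CRITICAL_5PCT = -2.86  # asymptotic 5% value, regression with a constant


def simulate_ar1(T, phi, rng):
    """Simulate y_t = phi * y_{t-1} + e_t, e_t ~ N(0, 1); phi = 1 is a unit root."""
    y, path = 0.0, []
    for _ in range(T):
        y = phi * y + rng.gauss(0.0, 1.0)
        path.append(y)
    return path


def slope_t_ratio(y, x):
    """t-ratio of the slope in an OLS regression of y on a constant and x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    rss = sum((yi - my - b1 * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    return b1 / math.sqrt(rss / (n - 2) / sxx)


def dickey_fuller_stat(series):
    """t-ratio of rho in the regression of the first difference on the lagged level."""
    dy = [series[t] - series[t - 1] for t in range(1, len(series))]
    return slope_t_ratio(dy, series[:-1])


def rejection_rate(phi, T=200, reps=300, seed=5):
    """Share of simulated series for which the unit root null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        if dickey_fuller_stat(simulate_ar1(T, phi, rng)) < DF_CRITICAL_5PCT:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("stationary AR(1), phi = 0.5:", rejection_rate(0.5))
    print("random walk, phi = 1.0     :", rejection_rate(1.0))
```

The unit root null should be rejected for nearly all stationary series and for only about 5% of the random walks; a variable for which the null is not rejected should be treated as possibly integrated before it enters a regression in levels.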
It is well known that if two series are integrated of different orders, linear combinations of them will be integrated of the higher of the two orders. Thus, for instance, if two economic variables $(y_t, x_t)$ are $I(1)$, a linear combination of them, $u_t = y_t - \beta_0 - \beta_1 x_t$, will generally be $I(1)$. But it is possible that certain combinations of those nonstationary series are stationary. In that case the pair $(y_t, x_t)$ is said to be cointegrated.
The notion of cointegration is important for the analysis of long-run relationships between economic time series. Some examples are disposable income and consumption, government spending and tax revenues, or interest rates on assets of different maturities. Economic theory suggests that such economic time series should move jointly, that is, they should be characterized by a long-run equilibrium relationship. Cointegration implies that these pairs of variables have similar stochastic trends. Besides, the dynamics of the economic variables suggest that they may deviate from this equilibrium in the short term, and when a pair of variables $(y_t, x_t)$ is cointegrated the term $u_t = y_t - \beta_0 - \beta_1 x_t$ is stationary.
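A simulated example may help to fix ideas. The sketch below is an illustration under assumptions, with hypothetical parameter values $\beta_0 = 1$ and $\beta_1 = 2$: $x_t$ is a random walk and $y_t$ equals $\beta_0 + \beta_1 x_t$ plus a stationary AR(1) deviation, so the pair is cointegrated by construction.

```python
import random


def simulate_cointegrated_pair(T, beta0=1.0, beta1=2.0, phi=0.5, seed=4):
    """x_t is a random walk; y_t = beta0 + beta1 * x_t + u_t with AR(1) u_t."""
    rng = random.Random(seed)
    x_level, u, xs, ys = 0.0, 0.0, [], []
    for _ in range(T):
        x_level += rng.gauss(0.0, 1.0)        # stochastic trend shared by y and x
        u = phi * u + rng.gauss(0.0, 1.0)     # stationary deviation from equilibrium
        xs.append(x_level)
        ys.append(beta0 + beta1 * x_level + u)
    return ys, xs


def sample_variance(series):
    """Sample variance around the sample mean."""
    mean = sum(series) / len(series)
    return sum((s - mean) ** 2 for s in series) / len(series)


def combination(ys, xs, b0, b1):
    """Linear combination y_t - b0 - b1 * x_t."""
    return [y - b0 - b1 * x for y, x in zip(ys, xs)]


if __name__ == "__main__":
    ys, xs = simulate_cointegrated_pair(1000)
    print("variance of y_t                :", round(sample_variance(ys), 2))
    print("variance of y_t - 1 - 2 x_t    :", round(sample_variance(combination(ys, xs, 1.0, 2.0)), 2))
    print("variance of y_t - 1 - 1 x_t    :", round(sample_variance(combination(ys, xs, 1.0, 1.0)), 2))
```

Only the combination that uses the true slope removes the common stochastic trend: its sample variance stays close to the variance of the stationary deviation, while $y_t$ itself and the combination with the wrong slope still wander like random walks.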
The definition of cointegration can be generalized to a set of $k$ variables (Engle and Granger; 1987): the components of the vector $Y_t = (y_{1t}, \ldots, y_{kt})'$ are said to be co-integrated of order $d, b$, denoted $Y_t \sim CI(d,b)$, if (i) all components of $Y_t$ are $I(d)$; (ii) there exists a vector $\alpha \neq 0$ so that $z_t = \alpha' Y_t \sim I(d-b)$, $b > 0$. The vector $\alpha$ is called the co-integrating vector.
The relationship $\alpha' Y_t = 0$ captures the long-run equilibrium. The term $z_t = \alpha' Y_t$ represents the deviation from the long-run equilibrium, so it is called the equilibrium error. In general, more than one cointegrating relationship may exist between $k$ variables, with a maximum of $k-1$. For the case of two $I(1)$ variables, the long-run equilibrium can be written as $y_t = \beta_0 + \beta_1 x_t$ and the cointegrating vector is $(1, -\beta_1)'$, with the constant $\beta_0$ absorbed into the mean of the equilibrium error. Clearly the cointegrating vector is not unique, since by multiplying both sides of $\alpha' Y_t = 0$ by a nonzero scalar the equality remains valid.
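As a worked illustration of this non-uniqueness, consider two variables collected in $Y_t = (y_t, x_t)'$. The numerical values below ($\beta_0 = 0$, $\beta_1 = 2$ and the scaling constant $c = -1/2$) are hypothetical and chosen only for the example.

```latex
% Hypothetical values: beta_0 = 0, beta_1 = 2, scaling constant c = -1/2.
\begin{align*}
\alpha    &= (1,\,-2)',              & z_t    &= \alpha' Y_t = y_t - 2\,x_t \sim I(0), \\
c\,\alpha &= (-\tfrac{1}{2},\,1)',   & c\,z_t &= x_t - \tfrac{1}{2}\,y_t \sim I(0).
\end{align*}
```

Both vectors describe the same long-run relationship, since a stationary series multiplied by a nonzero constant remains stationary. Fixing one element of the vector at 1, as in $(1, -\beta_1)'$, is a common way to pick one representative.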
As mentioned above, a classical approach to building regression models for nonstationary variables is to difference the series in order to achieve stationarity and to analyze the relationship between the stationary variables. The information about the long-run relationship is then lost. But the presence of cointegration between the regressors and the dependent variable implies that the levels of these variables are related in the long run. So, although the variables are nonstationary, it seems more appropriate in this case to estimate the relationship between levels, without differencing the data, that is, to estimate the cointegrating relationship. On the other hand, it could also be interesting to formulate a model that combines both the long-run and the short-run behaviour of the variables. This approach is based on the estimation of error correction models (ECM), which relate the change in one variable to the deviations from the long-run equilibrium in the previous period. For example, an ECM for two $I(1)$ variables can be written as:
$$\Delta y_t = \gamma_0 + \gamma_1 (y_{t-1} - \beta_0 - \beta_1 x_{t-1}) + \gamma_2 \Delta x_t + v_t$$
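This model can be generalized as follows (Engle and Granger; 1987): a vector of time series $Y_t$ has an error correction representation if it can be expressed as:
$$A(L)(1-L)\,Y_t = -\gamma z_{t-1} + v_t$$
where $v_t$ is a stationary multivariate disturbance, $A(L)$ is a matrix polynomial in the lag operator $L$ with $A(0) = I$ and all elements of $A(1)$ finite, $z_t = \alpha' Y_t$ is the equilibrium error and $\gamma \neq 0$.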
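In the previous example, this means that the following equation is estimated by least squares:
$$\Delta y_t = \gamma_0 + \gamma_1 \hat{u}_{t-1} + \gamma_2 \Delta x_t + v_t$$
where $\hat{u}_{t-1} = y_{t-1} - \hat{\beta}_0 - \hat{\beta}_1 x_{t-1}$ is the lagged residual from the least squares estimate of the long-run relationship $y_t = \beta_0 + \beta_1 x_t$.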
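A two-step least squares estimation of an error correction model for two variables can be sketched as follows. This is a minimal illustration under assumptions: the data are simulated from an ECM of the form $\Delta y_t = \gamma_1 (y_{t-1} - \beta_0 - \beta_1 x_{t-1}) + \gamma_2 \Delta x_t + v_t$ with hypothetical parameter values, and the helper names (`ols`, `simulate_ecm`, `two_step_ecm`) are illustrative. The first step estimates the long-run relationship in levels; the second regresses $\Delta y_t$ on a constant, the lagged first-step residual and $\Delta x_t$.

```python
import random


def ols(y, columns):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k = len(columns)
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def simulate_ecm(T, beta0=1.0, beta1=1.0, gamma1=-0.5, gamma2=0.5, seed=3):
    """Simulate (y_t, x_t) from the assumed ECM; x_t is a random walk."""
    rng = random.Random(seed)
    x, y = [0.0], [beta0]
    for _ in range(T - 1):
        dx = rng.gauss(0.0, 1.0)
        error = y[-1] - beta0 - beta1 * x[-1]   # last period's equilibrium error
        dy = gamma1 * error + gamma2 * dx + rng.gauss(0.0, 1.0)
        x.append(x[-1] + dx)
        y.append(y[-1] + dy)
    return y, x


def two_step_ecm(y, x):
    """Step 1: levels regression. Step 2: ECM with the lagged step-1 residual."""
    n = len(y)
    ones = [1.0] * n
    b0, b1 = ols(y, [ones, x])
    u_hat = [yi - b0 - b1 * xi for yi, xi in zip(y, x)]
    dy = [y[t] - y[t - 1] for t in range(1, n)]
    dx = [x[t] - x[t - 1] for t in range(1, n)]
    g0, g1, g2 = ols(dy, [ones[1:], u_hat[:-1], dx])
    return (b0, b1), (g0, g1, g2)


if __name__ == "__main__":
    y, x = simulate_ecm(1000)
    long_run, short_run = two_step_ecm(y, x)
    print("long-run  (beta0, beta1)          :", [round(v, 3) for v in long_run])
    print("short-run (gamma0, gamma1, gamma2):", [round(v, 3) for v in short_run])
```

A negative estimate of $\gamma_1$ indicates that part of last period's deviation from the long-run equilibrium is corrected in the current period. In applied work one would first test the first-step residuals for a unit root before interpreting the second step.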