11.6 Unit Root Tests

As discussed in Example 11.1, the AR(1) process is

$\displaystyle X_t = c + \alpha X_{t-1} + \varepsilon_t.$ (11.43)

Given $ \vert\alpha\vert < 1$, the process is stationary provided that $ E[X_0] = \frac{c}{1-\alpha}$, or asymptotically stationary once the initial transient has decayed. The case $ \alpha=1$ corresponds to a random walk, which is non-stationary. A stationary AR(1) process with $ \alpha$ close to one behaves so similarly to a random walk that one frequently tests whether $ \alpha=1$ or $ \alpha<1$ holds. For this purpose the so-called unit root tests have been developed.
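A minimal simulation sketch contrasts the two cases; the parameter choices (such as $ \alpha=0.95$) are illustrative assumptions only:

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, alpha = 1000, 0.0, 0.95
eps = rng.standard_normal(n)

# stationary AR(1): X_t = c + alpha * X_{t-1} + eps_t with |alpha| < 1
x_ar = np.empty(n)
x_ar[0] = c / (1 - alpha)                  # start at the stationary mean
for t in range(1, n):
    x_ar[t] = c + alpha * x_ar[t - 1] + eps[t]

# boundary case alpha = 1: a random walk
x_rw = np.cumsum(eps)

# the AR(1) path fluctuates around its mean; the random walk wanders off
print(x_ar.std(), x_rw.std())
```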

11.6.1 Dickey-Fuller Tests

The unit root test developed by Dickey and Fuller tests the null hypothesis that the characteristic equation (11.6) of the AR(1) process has a unit root, i.e., a root at $ z=1$, against the alternative hypothesis that the process has no unit roots. The test is based on the regression

$\displaystyle \Delta X_t = (\alpha-1)X_{t-1} + \varepsilon_t,$ (11.44)

which is obtained by rearranging (11.43) with $ c=0$. If $ X_t$ is a random walk, then the coefficient of $ X_{t-1}$ is equal to zero. If, on the other hand, $ X_t$ is a stationary AR(1) process, then the coefficient is negative. The standard $ t$-statistic is

$\displaystyle \hat{t}_n = \frac{\hat{\alpha}-1}{\sqrt{\hat{\sigma}^2(\sum_{t=2}^n X_{t-1}^2)^{-1}}},$ (11.45)

where $ \hat{\alpha}$ and $ \hat{\sigma}^2$ are the least squares estimators for $ \alpha$ and the variance $ \sigma^2$ of $ \varepsilon_t$. As $ n$ increases, the statistic (11.45) converges in distribution not to a standard normal distribution but to that of a functional of a Wiener process,

$\displaystyle \hat{t}_n \stackrel{\mathcal{L}}{\longrightarrow} \frac{W^2(1)-1}{2\left\{\int_0^1 W^2(u)\,du\right\}^{1/2}},$

where $ W$ is a standard Wiener process. The critical values of this distribution at the 1%, 5% and 10% significance levels are $-2.58$, $-1.95$ and $-1.62$, respectively.
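As a sketch of how the statistic (11.45) can be computed in practice, the following Python function runs the regression (11.44) by least squares and forms the $ t$-ratio; the function name and the simulated input are illustrative only:

```python
import numpy as np

def dickey_fuller_t(x):
    """t-statistic for Delta X_t = (alpha - 1) X_{t-1} + eps_t, cf. (11.44)/(11.45)."""
    dx = np.diff(x)                          # Delta X_t for t = 2, ..., n
    xl = x[:-1]                              # lagged level X_{t-1}
    phi = (xl @ dx) / (xl @ xl)              # LS estimate of (alpha - 1)
    resid = dx - phi * xl
    sigma2 = resid @ resid / (len(dx) - 1)   # estimate of Var(eps_t)
    return phi / np.sqrt(sigma2 / (xl @ xl))

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(1000))     # random walk, i.e. alpha = 1
print(dickey_fuller_t(x))   # typically above -1.95: unit root not rejected at 5%
```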

A problem with this test is that the nominal significance level (for example 5%) is not reliable when the error terms $ \varepsilon_t$ in (11.44) are autocorrelated. The stronger the autocorrelation of $ \varepsilon_t$, the larger, in general, the distortion of the test's actual significance level. Ignoring the autocorrelation can thus lead to rejecting the null hypothesis of a unit root at a nominal level of 5% when the actual level is, for example, 30%. To counteract these effects, Dickey and Fuller suggested another regression that contains lagged differences. The regression of this augmented Dickey-Fuller test (ADF) is

$\displaystyle \Delta X_t = c + (\alpha-1)X_{t-1} + \sum_{i=1}^{p} \alpha_i \Delta X_{t-i} + \varepsilon_t$ (11.46)

where, as with the simple Dickey-Fuller test, the null hypothesis of a unit root is rejected when the test statistic (11.45) is smaller than the critical value (which is tabulated). The choice of $ p$ is naturally problematic. In general, the size of the test improves as $ p$ increases, but at the cost of power. This is illustrated with a simulated process whose errors $ \varepsilon_t$ are correlated through the relationship

$\displaystyle \varepsilon_t = \beta \xi_{t-1} + \xi_t,$

where $ \xi_t$ are i.i.d. $ (0,\sigma^2)$. In the next chapter such processes will be referred to as moving average processes of order 1, MA(1). It holds that $ \mathop{\text{\rm Var}}(\varepsilon_t) = \sigma^2(1+\beta^2)$, $ \gamma_1(\varepsilon_t) = \mathop{\text{\rm Cov}}(\varepsilon_t,\varepsilon_{t-1}) = \beta \sigma^2$, and $ \gamma_{\tau}(\varepsilon_t)=0$ for $ \tau \ge 2$. For the ACF of $ \varepsilon_t$ we then get

$\displaystyle \rho_\tau(\varepsilon_t) = \left\{ \begin{array}{ll} \frac{\beta}{1+\beta^2} & \text{if } \:\: \tau = 1 \\ 0 & \text{if } \:\: \tau \ge 2. \end{array} \right.$ (11.47)
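A quick numerical check of (11.47), under the assumption of standard normal $ \xi_t$, compares the sample first-order autocorrelation of a simulated MA(1) series with $ \beta/(1+\beta^2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
beta, n = 0.5, 100_000
xi = rng.standard_normal(n + 1)
eps = xi[1:] + beta * xi[:-1]              # MA(1) errors as defined above

rho1_theory = beta / (1 + beta**2)         # eq. (11.47) for tau = 1
rho1_sample = np.corrcoef(eps[1:], eps[:-1])[0, 1]
print(rho1_theory, rho1_sample)            # the two values should nearly agree
```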

For the process

$\displaystyle X_t = \alpha X_{t-1} + \beta \xi_{t-1} + \xi_t$ (11.48)

simulations of the ADF test were run; the results are summarized in abbreviated form in Table 11.3.


Table 11.3: ADF test: simulated rejection probabilities for the process (11.48) at a nominal significance level of 5% (according to Friedmann (1992)).

                            $\beta$
 $\alpha$    $p$    -0.99    -0.9     0       0.9
 1            3     0.995    0.722   0.045   0.034
             11     0.365    0.095   0.041   0.039
 0.9          3     1.000    0.996   0.227   0.121
             11     0.667    0.377   0.105   0.086


As one can see, the nominal significance level of 5% under the null hypothesis ($ \alpha=1$) is maintained better when $ p$ is larger. However, the power of the test then decreases, i.e., the test becomes less capable of distinguishing between a process with a unit root and a stationary process with $ \alpha =0.9$. The choice of $ p$ thus involves a trade-off between the size and the power of the test.
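An experiment in the spirit of Table 11.3 can be sketched with statsmodels; the sample size, the number of replications, and the innovation variance below are assumptions, since the table does not state them, so the resulting frequencies will only qualitatively resemble the table:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def simulate_x(n, alpha, beta, rng):
    """X_t = alpha X_{t-1} + beta xi_{t-1} + xi_t, cf. (11.48)."""
    xi = rng.standard_normal(n + 1)
    eps = xi[1:] + beta * xi[:-1]          # MA(1) errors
    x = np.empty(n)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = alpha * x[t - 1] + eps[t]
    return x

rng = np.random.default_rng(1)
n, reps = 250, 500                         # assumed design parameters
for alpha in (1.0, 0.9):
    for p in (3, 11):
        for beta in (-0.9, 0.0, 0.9):
            rej = sum(
                adfuller(simulate_x(n, alpha, beta, rng),
                         maxlag=p, regression="c", autolag=None)[1] < 0.05
                for _ in range(reps)
            )
            print(f"alpha={alpha}, p={p}, beta={beta}: {rej / reps:.3f}")
```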

If $ X_t$ is a trend-stationary process as in (11.41), the ADF test likewise does not reject the (false) null hypothesis of a unit root often enough; asymptotically, the probability of rejecting goes to zero. The ADF regression (11.46) can, however, be extended by a linear time trend, i.e., one runs the regression

$\displaystyle \Delta X_t = c + \mu t + (\alpha-1)X_{t-1} + \sum_{i=1}^{p} \alpha_i \Delta X_{t-i} + \varepsilon_t$ (11.49)

and tests the significance of $ \alpha$. The critical values are tabulated. The ADF test with a time trend (11.49) has power against a trend-stationary process. On the other hand, it loses power compared to the simple ADF test (11.46) when the true process is, for example, a stationary AR(1) process.

As an empirical example, consider the daily stock prices of the 20 largest German stock companies from January 2, 1974 to December 30, 1996. Table 11.4 displays the ADF test statistics for the logged stock prices for $ p=0$ and $ p=4$. The tests were run with and without a linear time trend; in every regression a constant was included.


Table 11.4: Unit root tests: ADF test (null hypothesis: unit root) and KPSS test (null hypothesis: stationarity). The augmented portion of the ADF regression has order $ p=0$ and $ p=4$. The KPSS statistic was calculated with the reference points $ T=8$ and $ T=12$. The asterisks indicate significance at the 10% (*) and 1% (**) levels.

                          ADF                         KPSS
                without trend   with trend   without trend    with trend
 $p$ / $T$        0      4      0      4       8       12       8      12
 ALLIANZ       -0.68  -0.62   2.44   2.59   24.52** 16.62**  2.36**  1.61**
 BASF           0.14   0.34   2.94   3.13   23.71** 16.09**  1.39**  0.95**
 BAYER         -0.11   0.08   2.96   3.26   24.04** 16.30**  1.46**  1.00**
 BMW           -0.71  -0.66   2.74   2.72   23.92** 16.22**  2.01**  1.37**
 COMMERZBANK   -0.80  -0.67   1.76   1.76   22.04** 14.96**  1.43**  0.98**
 DAIMLER       -1.37  -1.29   2.12   2.13   22.03** 14.94**  3.34**  2.27**
 DEUTSCHE BANK -1.39  -1.27   2.05   1.91   23.62** 16.01**  1.70**  1.16**
 DEGUSSA       -0.45  -0.36   1.94   1.88   23.11** 15.68**  1.79**  1.22**
 DRESDNER      -0.98  -0.94   1.90   1.77   22.40** 15.20**  1.79**  1.22**
 HOECHST        0.36   0.50   3.24   3.37   23.80** 16.15**  1.42**  0.97**
 KARSTADT      -1.18  -1.17   1.15   1.15   20.40** 13.84**  3.33**  2.26**
 LINDE         -1.69  -1.44   2.74   2.70   24.40** 16.54**  3.14**  2.15**
 MAN           -1.78  -1.58   1.66   1.61   21.97** 14.91**  1.59**  1.08**
 MANNESMANN    -0.91  -0.80   2.73   2.55   21.97** 14.93**  1.89**  1.29**
 PREUSSAG      -1.40  -1.38   2.21   2.03   23.18** 15.72**  1.53**  1.04**
 RWE           -0.09  -0.04   2.95   2.84   24.37** 16.52**  1.66**  1.14**
 SCHERING       0.11   0.04   2.37   2.12   24.20** 16.40**  2.35**  1.60**
 SIEMENS       -1.35  -1.20   2.13   1.84   23.24** 15.76**  1.69**  1.15**
 THYSSEN       -1.45  -1.34   1.92   1.90   21.97** 14.90**  1.98**  1.35**
 VOLKSWAGEN    -0.94  -0.81   1.89   1.73   21.95** 14.89**  1.11**  0.76**

SFEAdfKpss.xpl
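The workflow behind Table 11.4 can be sketched along the following lines; since the original stock price data are not included here, the example uses a simulated log price path as a stand-in (an assumption), with lag and reference-point choices mirroring those of the table:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

# stand-in for a series of logged daily stock prices
rng = np.random.default_rng(2)
log_price = np.cumsum(0.0003 + 0.01 * rng.standard_normal(5000))

for label, reg in (("without trend", "c"), ("with trend", "ct")):
    adf_stat = adfuller(log_price, maxlag=4, regression=reg, autolag=None)[0]
    kpss_stat = kpss(log_price, regression=reg, nlags=12)[0]
    print(f"{label}: ADF = {adf_stat:.2f}, KPSS = {kpss_stat:.2f}")
```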


Only for RWE with a linear time trend does the ADF test reject the null hypothesis of a unit root at the 10% significance level. Since in all other cases the unit root is not rejected, it appears that taking differences of stock prices is a necessary operation in order to obtain a stationary process, i.e., to get log returns that can be investigated further. These results will be called into question in the next section using another test.

11.6.2 The KPSS Test of Stationarity

The KPSS test of Kwiatkowski et al. (1992) tests for stationarity, i.e., the null hypothesis is stationarity and the alternative is a unit root. The hypotheses are thus exchanged relative to those of the ADF test. As with the ADF test, one distinguishes two cases, depending on whether the model is estimated with or without a linear time trend. The regression model with a time trend has the form

$\displaystyle X_t = c + \mu t + k \sum_{i=1}^t \xi_i + \eta_t,$ (11.50)

with stationary $ \eta_t$ and i.i.d. $ \xi_t$ with an expected value 0 and variance 1. Obviously for $ k \ne 0$ the process is integrated and for $ k=0$ trend-stationary. The null hypothesis is $ H_0: k=0$, and the alternative hypothesis is $ H_1: k\ne 0$.

Under $ H_0$ the regression (11.50) is estimated by least squares, yielding the residuals $ \hat{\eta}_t$. From these residuals the partial sum

$\displaystyle S_t = \sum_{i=1}^t \hat{\eta}_i$

is formed, which under $ H_0$ is integrated of order 1, i.e., the variance of $ S_t$ increases linearly with $ t$. The KPSS test statistic is then

$\displaystyle KPSS_T = \frac{\sum_{t=1}^n S_t^2}{n^2 \hat{\omega}_T^2},$ (11.51)

where

$\displaystyle \hat{\omega}_T^2 = \hat{\sigma}^2_{\eta} + 2 \sum_{\tau=1}^T \left(1-\frac{\tau}{T+1}\right) \hat{\gamma}_\tau$

is an estimator of the spectral density at frequency zero, where $ \hat{\sigma}^2_{\eta}$ is the variance estimator of $ \eta_t$ and $ \hat{\gamma}_\tau=\frac{1}{n}\sum_{t=\tau+1}^n \hat{\eta}_t \hat{\eta}_{t-\tau}$ is the covariance estimator. The problem again is the choice of the reference point $ T$: if $ T$ is too small, the test is biased in the presence of autocorrelation; if $ T$ is too large, it loses power.
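A minimal sketch of the statistic (11.51), assuming the Bartlett weights given above and a trend regression as in (11.50), might look as follows (the function name is illustrative):

```python
import numpy as np

def kpss_statistic(x, T, trend=True):
    """KPSS statistic (11.51) with Bartlett-weighted long-run variance."""
    n = len(x)
    t = np.arange(1.0, n + 1)
    # regression (11.50) under H0: X_t = c + mu * t + eta_t
    Z = np.column_stack([np.ones(n), t]) if trend else np.ones((n, 1))
    eta = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # residuals eta_t
    S = np.cumsum(eta)                                   # partial sums S_t
    omega2 = eta @ eta / n                               # sigma^2_eta term
    for tau in range(1, T + 1):
        gamma = eta[tau:] @ eta[:-tau] / n               # covariance estimator
        omega2 += 2 * (1 - tau / (T + 1)) * gamma        # Bartlett weight
    return (S @ S) / (n**2 * omega2)
```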

The results of the KPSS tests in Table 11.4 clearly indicate that the investigated stock prices are neither stationary nor trend-stationary, since in every case the null hypothesis was rejected at the 1% significance level. Even for RWE, for which the ADF test was significant at the 10% level, the KPSS test favors the unit root hypothesis at an even lower significance level.

11.6.3 Variance Ratio Tests

If one wants to test whether a time series follows a random walk, one can exploit the fact that the variance of a random walk increases linearly with time, see (11.4). Considering the log prices of a financial time series, $ \ln S_t$, the null hypothesis would be

$\displaystyle H_0: r_t = \mu + \varepsilon_t, \quad \varepsilon_t \sim {\text{\rm N}}(0,\sigma^2),$

with log returns $ r_t = \ln S_t - \ln S_{t-1}$, a constant $ \mu$ and white noise $ \varepsilon_t$. An alternative hypothesis is, for example, that $ r_t$ is stationary and autocorrelated. One forms the sum of $ q$ consecutive returns,

$\displaystyle r_t(q) = r_t+r_{t-1}+\ldots+r_{t-q+1},$

and determines the variance of $ r_t(q)$. For $ q=2$, for example, it holds that
$\displaystyle \mathop{\text{\rm Var}}\{r_t(2)\} = \mathop{\text{\rm Var}}(r_t)+\mathop{\text{\rm Var}}(r_{t-1})+2\mathop{\text{\rm Cov}}(r_t,r_{t-1}) = 2 \mathop{\text{\rm Var}}(r_t) + 2 \gamma_1 = 2 \mathop{\text{\rm Var}}(r_t)(1+\rho_1).$

Taking advantage of the stationarity of $ r_t$, one obtains in general

$\displaystyle \mathop{\text{\rm Var}}\{r_t(q)\}=q\mathop{\text{\rm Var}}(r_t)\left(1+2\sum_{\tau=1}^{q-1} (1-\frac{\tau}{q})\rho_\tau\right).$ (11.52)

Under $ H_0$ it holds that $ \rho_{\tau}=0$ for all $ \tau>0$, so that under $ H_0$

$\displaystyle \frac{\mathop{\text{\rm Var}}\{r_t(q)\}}{q\mathop{\text{\rm Var}}(r_t)} =1.$

A test statistic can now be constructed by substituting into (11.52) the consistent estimators

$\displaystyle \hat{\mu} = \frac{1}{n}(\ln S_n - \ln S_0)$

for $ \mu$,

$\displaystyle \hat{\gamma}_0 = \frac{1}{n-1}\sum_{t=2}^{n}(\ln S_t - \ln S_{t-1} - \hat{\mu})^2$

for $ \mathop{\text{\rm Var}}(r_t)$ and

$\displaystyle \hat{\gamma}_0(q) = \frac{n}{q(n-q)(n-q+1)}\sum_{t=q+1}^{n}(\ln S_t - \ln S_{t-q} - q\hat{\mu})^2$

for $ \frac{1}{q} \mathop{\text{\rm Var}}\{r_t(q)\}$. The test statistic is then

$\displaystyle VQ(q) = \frac{\hat{\gamma}_0(q)}{\hat{\gamma}_0}-1.$

It can be shown that the asymptotic distribution is

$\displaystyle \sqrt{n}\,VQ(q) \stackrel{{\cal L}}{\longrightarrow} {\text{\rm N}}\left(0,\frac{2(2q-1)(q-1)}{3q}\right).$
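Putting the estimators and the limiting distribution together, a minimal implementation of the variance ratio test could look as follows (scipy is used only for the normal tail probability; the indexing convention for the prices $S_0,\ldots,S_n$ is an assumption):

```python
import numpy as np
from scipy.stats import norm

def variance_ratio_test(log_s, q):
    """VQ(q) with its asymptotic N(0, 2(2q-1)(q-1)/(3q)) null distribution."""
    n = len(log_s) - 1                       # number of returns r_1, ..., r_n
    mu = (log_s[-1] - log_s[0]) / n          # estimator of mu
    r = np.diff(log_s)
    gamma0 = np.sum((r[1:] - mu) ** 2) / (n - 1)        # estimates Var(r_t)
    diff_q = log_s[q + 1:] - log_s[1:-q]     # ln S_t - ln S_{t-q}, t = q+1, ..., n
    gamma0_q = (n / (q * (n - q) * (n - q + 1))
                * np.sum((diff_q - q * mu) ** 2))
    vq = gamma0_q / gamma0 - 1
    z = np.sqrt(n) * vq / np.sqrt(2 * (2 * q - 1) * (q - 1) / (3 * q))
    return vq, 2 * norm.sf(abs(z))           # statistic and two-sided p-value

rng = np.random.default_rng(3)
log_s = np.cumsum(0.0005 + 0.01 * rng.standard_normal(2000))  # H0 holds here
print(variance_ratio_test(log_s, q=5))       # VQ near 0, large p-value
```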

The asymptotic variance can be derived through the following approximation: assume that $ \hat{\mu}=0$ and $ n \gg q$. Then $ \ln S_t - \ln S_{t-q}=\sum_{j=0}^{q-1}r_{t-j}$ and
$\displaystyle VQ(q) \approx \frac{1}{qn}\sum_{t=q+1}^n\left\{\left(\sum_{j=0}^{q-1}r_{t-j}\right)^2-q\hat{\gamma}_0\right\}\Big/\hat{\gamma}_0$
$\displaystyle = \frac{1}{qn}\sum_{t=q+1}^n \frac{1}{\hat{\gamma}_0}\left(\sum_{j=0}^{q-1}r_{t-j}^2 + 2\sum_{j=0}^{q-2}r_{t-j}r_{t-j-1}+\ldots+2r_tr_{t-q+1} - q\hat{\gamma}_0\right)$
$\displaystyle \approx \frac{1}{q}\left(q\hat{\gamma}_0+2(q-1)\hat{\gamma}_1+\ldots+2\hat{\gamma}_{q-1}-q\hat{\gamma}_0\right)\Big/\hat{\gamma}_0$
$\displaystyle = 2 \sum_{j=1}^{q-1} \frac{q-j}{q}\hat{\rho}_j.$

Since under $ H_0$ the estimated autocorrelations $ \hat{\rho}_j$, scaled with $ \sqrt{n}$, are asymptotically independent and standard normally distributed, see Section 11.5, the asymptotic variance is
$\displaystyle {\mathop{\text{\rm Var}}}_{as}\{\sqrt{n}\,VQ(q)\} = {\mathop{\text{\rm Var}}}_{as}\left(2 \sum_{j=1}^{q-1}\frac{q-j}{q}\sqrt{n}\,\hat{\rho}_j\right)$
$\displaystyle = 4 \sum_{j=1}^{q-1} \frac{(q-j)^2}{q^2}{\mathop{\text{\rm Var}}}_{as}(\sqrt{n}\,\hat{\rho}_j)$
$\displaystyle = 4 \sum_{j=1}^{q-1} \frac{(q-j)^2}{q^2}$
$\displaystyle = 4(q-1)-\frac{8}{q}\sum_{j=1}^{q-1}j+\frac{4}{q^2}\sum_{j=1}^{q-1}j^2.$

With the summation formulas

$\displaystyle \sum_{j=1}^{q-1}j=(q-1)q/2$

and

$\displaystyle \sum_{j=1}^{q-1}j^2=q(q-1)(2q-1)/6$

we finally obtain

$\displaystyle {\mathop{\text{\rm Var}}}_{as}\{\sqrt{n}\,VQ(q)\}=\frac{2(2q-1)(q-1)}{3q}.$
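As a quick sanity check of this last step, one can compare the sum $4\sum_{j=1}^{q-1}(q-j)^2/q^2$ with the closed form for a specific $q$:

```python
# verify 4 * sum_{j=1}^{q-1} (q-j)^2 / q^2 == 2(2q-1)(q-1)/(3q) for q = 10
q = 10
direct = 4 * sum((q - j) ** 2 for j in range(1, q)) / q**2
closed = 2 * (2 * q - 1) * (q - 1) / (3 * q)
print(direct, closed)   # both print 11.4
```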