12.5 Unit Root Tests for Panel Data


output = panunit (z, m, p, d{, T})
computes various panel unit root statistics for the m-th variable in the data set z with p lagged differences and the deterministic terms indicated by d.

In the previous sections we implicitly assumed that real exchange rates $ s_{it}$ are difference stationary variables and the real interest rates are stationary in levels. This assumption is also made by MacDonald and Nagayasu (1999), for example. In the recent literature on dynamic panel data, tests have been suggested for such hypotheses. Following Dickey and Fuller (1979), the unit root hypothesis can be tested by performing the regression

$\displaystyle \Delta y_{it} = \mu_i + \beta_i t + \varrho_i y_{i,t-1} + \alpha_{i1} \Delta y_{i,t-1} + \cdots + \alpha_{ip} \Delta y_{i,t-p} + u_{it}$ (12.9)

and testing $ \varrho_i =0$ for all $ i$. The test procedure of Breitung and Meyer (1994) assumes that $ \beta_1 = \cdots = \beta_N$ and $ \alpha_{1j} = \cdots = \alpha_{Nj}$ for $ j=1,\ldots,p$. Thus, as in traditional panel data models, heterogeneity is represented solely by an individual specific intercept. Under these simplifying assumptions a pooled autoregression can be estimated and the lag order can be chosen as the highest significant lag. We therefore propose to select the lag order of the model by testing the last lag in the autoregressive specification. This procedure is equivalent to selecting the lag order by the highest significant partial autocorrelation. The deterministic terms can be selected similarly.

A simple test for the unit root hypothesis is obtained by running the regression

$\displaystyle \Delta y_{it} = \beta t + \varrho (y_{i,t-1}-y_{i1}) + \alpha_1 \Delta y_{i,t-1} + \cdots + \alpha_p \Delta y_{i,t-p} + u_{it}^*$ (12.10)

The $ t$-statistic for $ H_0:$ $ \varrho=0$ is asymptotically standard normal as $ N\to \infty$. Since this procedure is also valid for small $ T$ (e.g. $ T=10$), this test, called BM in the panunit output, is recommended if only a small number of time periods is available.
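To make regression (12.10) concrete, the following is a minimal Python sketch of the pooled $ t$-statistic for a balanced panel. It is not the panunit implementation: the function name `bm_tstat` is hypothetical, the trend term $ \beta t$ is omitted, and conventional OLS standard errors are used in place of the robust standard errors that panunit reports.

```python
import numpy as np

def bm_tstat(y, p=0):
    """Pooled t-statistic for rho = 0 in (12.10), without the trend term.

    y : array of shape (N, T), one row per cross-section unit (balanced panel).
    p : number of lagged differences included in the regression.
    Assumption: plain OLS standard errors, not robust ones.
    """
    y = np.asarray(y, dtype=float)
    N, T = y.shape
    dy = np.diff(y, axis=1)                 # dy[:, k] = y[:, k+1] - y[:, k]
    rows_x, rows_d = [], []
    for i in range(N):
        for k in range(p, T - 1):           # dy[i, k] is the dependent variable
            level = y[i, k] - y[i, 0]       # y_{i,t-1} - y_{i1}
            lags = [dy[i, k - j] for j in range(1, p + 1)]
            rows_x.append([level] + lags)
            rows_d.append(dy[i, k])
    X = np.array(rows_x)
    d = np.array(rows_d)
    coef, *_ = np.linalg.lstsq(X, d, rcond=None)
    resid = d - X @ coef
    sigma2 = resid @ resid / (len(d) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[0] / np.sqrt(cov[0, 0])     # t-statistic of the level term
```

Subtracting the first observation $ y_{i1}$ removes the individual intercept without estimating it, which is why no unit-specific constant appears among the regressors.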

Levin and Lin (1993) extend the test procedure to individual specific time trends and short run dynamics. At the first stage the individual specific parameters are ``partialled out'' by forming the residuals $ e_{it}$ and $ v_{i,t-1}$ from a regression of $ \Delta y_{it}$ and $ y_{i,t-1}$ on the deterministics and the lagged differences. To account for heteroskedasticity the residuals are adjusted for their standard deviations yielding $ \tilde e_{it}$ and $ \tilde v_{i,t-1}$. The final regression is of the form

$\displaystyle \tilde e_{it} = \varrho \tilde v_{i,t-1} + \nu_{it}$

If there are no deterministics in the first-stage regressions, the resulting $ t$-statistic for the hypothesis $ \varrho=0$ is asymptotically standard normal as $ T\to \infty$ and $ N\to \infty$. However, if there is a constant or a time trend in the model, then the second-stage $ t$-statistic diverges as $ T\to \infty$, even if the null hypothesis is true. Levin and Lin (1993) suggest a correction of the $ t$-statistic to remove the bias and to obtain an asymptotic standard normal distribution for the test statistic. Monte Carlo simulations indicate that the test may perform poorly for $ p > 1$ and, thus, the test should only be used for $ p=0$ or $ p = 1$ (see Im, Pesaran, and Shin (1997) and Breitung (1999)).
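The two-stage procedure can be sketched in Python for the simplest case without deterministic terms, where no bias correction is needed. This is an illustration under stated assumptions, not the panunit code: the function name `ll_tstat_no_deterministics` is hypothetical, and the scaling factor for each unit is taken to be the residual standard deviation from regressing $ e_{it}$ on $ v_{i,t-1}$, which is one common reading of "adjusted for their standard deviations".

```python
import numpy as np

def ll_tstat_no_deterministics(y, p=1):
    """Second-stage t-statistic for rho = 0 with no deterministic terms.

    y : array of shape (N, T), balanced panel.
    p : number of lagged differences partialled out at the first stage.
    Assumption: each unit is scaled by the residual standard deviation
    of its own regression of e_it on v_{i,t-1}.
    """
    y = np.asarray(y, dtype=float)
    N, T = y.shape
    e_all, v_all = [], []
    for i in range(N):
        dy = np.diff(y[i])
        k = np.arange(p, T - 1)
        d = dy[k]                            # Delta y_{it}
        lvl = y[i, k]                        # y_{i,t-1}
        if p > 0:
            # First stage: partial out the lagged differences.
            Z = np.column_stack([dy[k - j] for j in range(1, p + 1)])
            e = d - Z @ np.linalg.lstsq(Z, d, rcond=None)[0]
            v = lvl - Z @ np.linalg.lstsq(Z, lvl, rcond=None)[0]
        else:
            e, v = d, lvl
        rho_i = (v @ e) / (v @ v)
        s_i = np.sqrt(np.mean((e - rho_i * v) ** 2))
        e_all.append(e / s_i)                # tilde e_{it}
        v_all.append(v / s_i)                # tilde v_{i,t-1}
    e = np.concatenate(e_all)
    v = np.concatenate(v_all)
    # Second stage: pooled regression of tilde e on tilde v.
    rho = (v @ e) / (v @ v)
    resid = e - rho * v
    se = np.sqrt(resid @ resid / (len(e) - 1) / (v @ v))
    return rho / se
```

Because every unit is divided by its own scale estimate, multiplying the series of one cross-section unit by a constant leaves the pooled statistic unchanged, which is the purpose of the heteroskedasticity adjustment.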

Another way to deal with the bias problem of the $ t$-statistic is to adopt a different adjustment for the constant and the time trend. The resulting test statistic is called the unbiased Levin-Lin statistic (mod. L/L in the panunit output). In the model with a constant term only, the constant can be removed by using $ (y_{i,t-1}-y_{i1})$ instead of $ y_{i,t-1}$. The first-stage regression uses only the lagged differences as regressors. At the second stage, the regression is

$\displaystyle \tilde e_{it} = \varrho (\tilde v_{i,t-1}-\tilde v_{i1}) + \nu_{it}$

and the resulting $ t$-statistic for $ \varrho=0$ is asymptotically standard normal as $ T\to \infty$ and $ N\to \infty$. If there is a linear trend in the model, the nuisance parameters are removed by estimating the current trend value from past values of the process only. The series are therefore adjusted according to
$\displaystyle \tilde e_{it}^* = \tilde e_{it} - \tilde e_{i1}$
$\displaystyle \tilde v_{i,t-1}^* = \tilde v_{i,t-1} - \frac{t-1}{T} \tilde v_{iT}$

Note that $ T^{-1} \tilde v_{iT}=T^{-1} \sum_{t=1}^T \Delta \tilde v_{it}$ is an estimate of the drift parameter. Again, the resulting modification yields a $ t$-statistic with a standard normal limiting distribution (see Breitung (1999)).
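The trend adjustment for a single cross-section unit can be written out as a short Python sketch. The function name `trend_adjust` and the array layout (position 0 holds the observation for $ t=1$) are assumptions made for this illustration only.

```python
import numpy as np

def trend_adjust(e, v):
    """Apply the trend adjustment to one unit's series.

    e, v : arrays of length T holding e_{i1..iT} and v_{i1..iT}.
    Returns (e_star, v_star) for t = 2, ..., T, where
      e_star[t] = e_t - e_1
      v_star[t] = v_{t-1} - (t - 1) / T * v_T
    """
    e = np.asarray(e, dtype=float)
    v = np.asarray(v, dtype=float)
    T = len(v)
    t = np.arange(2, T + 1)                    # t = 2, ..., T
    e_star = e[t - 1] - e[0]                   # e_t - e_1
    v_star = v[t - 2] - (t - 1) / T * v[-1]    # v_{t-1} - (t-1)/T * v_T
    return e_star, v_star
```

For a series that is an exact linear trend through the origin, $ v_t = c\,t$, the adjusted values $ v_{t-1}^*$ are all zero, which illustrates how the adjustment removes the drift using $ T^{-1} v_T$ as its estimate.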

Im, Pesaran, and Shin (1997) further extended the test procedure by allowing for different values of $ \varrho_i$ under the alternative. Accordingly, all parameters are estimated separately for each cross-section unit. Let $ \tau_i$ $ (i=1,\ldots,N)$ denote the individual $ t$-statistic for the hypothesis $ \varrho_i =0$. As $ T\to \infty$ and $ N\to \infty$ we have

$\displaystyle N^{-1/2} \sum_{i=1}^N (\tau_i -\mu_\tau)/s_\tau \ {\buildrel d \over \longrightarrow} \ {\cal N} (0,1),$

where $ \mu_\tau$ and $ s_\tau$ are the expectation and standard deviation of the statistic $ \tau_i$. Im, Pesaran, and Shin (1997) present the means and variances for a wide range of $ T$ and $ p$. The quantlet panunit uses estimated values for $ \mu_\tau$ and $ s_\tau$ that are obtained from regressions on $ \sqrt T$, $ T$, $ T^2$, and $ p$.
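The standardization itself is a one-line computation. The Python sketch below takes the individual $ t$-statistics together with given values of $ \mu_\tau$ and $ s_\tau$; the function name `ips_statistic` is hypothetical, and the sketch does not reproduce the regressions that panunit uses to obtain $ \mu_\tau$ and $ s_\tau$.

```python
import numpy as np

def ips_statistic(tau, mu_tau, s_tau):
    """Standardized average of the individual t-statistics tau_i.

    tau    : sequence of the N individual t-statistics.
    mu_tau : expectation of tau_i under the null (supplied by the caller).
    s_tau  : standard deviation of tau_i under the null (supplied by the caller).
    """
    tau = np.asarray(tau, dtype=float)
    return np.sum((tau - mu_tau) / s_tau) / np.sqrt(len(tau))
```

If every $ \tau_i$ lies exactly one standard deviation below $ \mu_\tau$ and $ N=4$, the statistic equals $ -4/\sqrt 4 = -2$, so evidence against the unit root accumulates at the rate $ \sqrt N$.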

Generally, the quantlet computing all these unit root statistics is called as follows:

  output = panunit(z, m, p, d{, T})
The parameters necessary for computing the statistics are as follows. The parameter m indicates the column number of the variable to be tested for a unit root. The parameter p indicates the number of lagged differences in the model. The parameter d indicates the kind of deterministics used in the regressions. A value of d=0 implies that there is no deterministic term in the model. If d=1, a constant term is included, and for d=2 a linear time trend is included. Finally, if a balanced panel data set is used, the common number of time periods T can be given as the optional last argument.

In our application, we test for unit roots in the interest rate differential. The unit root tests for the long-term interest spread (second variable) including a constant and a single lagged difference are obtained using the command

  panunit(z, 2, 1, 1)
The results can be found in the output table:

  [ 1,] "====================================================="
  [ 2,] "Pooled Dickey-Fuller Regression:  2'th variable      "
  [ 3,] "====================================================="
  [ 4,] "PARAMETERS        Estimate     robust SE      t-value"
  [ 5,] "====================================================="
  [ 6,] "Lag[1]=            -0.2696       0.0296        -9.117"
  [ 7,] "Delta[ 1]=         -0.0863       0.0551        -1.566"
  [ 8,] "const=              0.1109       0.1137         0.976"
  [ 9,] "====================================================="
  [10,] "N*T=    378      N=   16           With constant     "
  [11,] "Unit root statistics:                                "
  [12,] "STATISTIC      Value  crit. Value (5%)  mean variance"
  [13,] "====================================================="
  [14,] "B/M (1994)     -2.453     -1.65       0.000     1.000"
  [15,] "L/L (1993)     -3.563     -1.65      -0.560     0.856"
  [16,] "mod. L/L       -3.711     -1.65       0.000     1.000"
  [17,] "I/P/S (1997)   -5.313     -1.65      -1.493     0.756"
  [18,] "====================================================="
All four unit root tests clearly reject the hypothesis of a unit root in the long-term interest spread. Similar results are obtained for the short-term interest differential (not reported). These results are in line with macroeconomic theory on the international term structure.
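The decision rule behind this conclusion can be spelled out in a few lines of Python. The values are copied from the output table above; the lower-tail p-values are an addition for illustration and assume that each reported value is referred to the standard normal distribution, as the common critical value of -1.65 suggests.

```python
import math

CRIT_5PCT = -1.65  # one-sided 5% critical value from the output table

# Test statistics as reported in the panunit output table.
stats = {
    "B/M (1994)": -2.453,
    "L/L (1993)": -3.563,
    "mod. L/L": -3.711,
    "I/P/S (1997)": -5.313,
}

def lower_tail_pvalue(z):
    """P(Z <= z) for a standard normal Z (assumed null distribution)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Reject the unit root hypothesis when the statistic is below the critical value.
decisions = {name: value < CRIT_5PCT for name, value in stats.items()}
pvalues = {name: lower_tail_pvalue(value) for name, value in stats.items()}

for name in stats:
    print(f"{name:14s} reject={decisions[name]}  p={pvalues[name]:.4g}")
```

The tests are one-sided because the alternative of stationarity corresponds to $ \varrho < 0$, so only large negative values count as evidence against the unit root.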