12.6 Unit Root Tests for Panel Data


output = panunit (z, m, p, d {,T})
computes various panel unit root statistics for the mth variable of the data set z, using p lagged differences and the deterministic terms indicated by d

Using panel data, powerful tests for a unit root in the autoregressive representation of the series can be constructed. Following Dickey and Fuller (1979), the unit root hypothesis can be tested by performing the regression:

$\displaystyle \Delta y_{it} = \mu_i + \beta_i t + \varrho_i y_{i,t-1} + \alpha_{i1} \Delta y_{i,t-1} + \cdots + \alpha_{ip} \Delta y_{i,t-p} + u_{it}$ (12.25)

and testing $ \varrho_i =0$ for all $ i$. For the test procedure of Breitung and Meyer (1994) it is assumed that $ \beta_1 = \cdots = \beta_N$ and $ \alpha_{1j} = \cdots = \alpha_{Nj}$ for $ j=1,\ldots,p$. Thus, as in traditional panel data models, heterogeneity is represented by an individual specific intercept. A simple test is obtained by using $ \widehat \mu_i = y_{i1}$, which is the best estimator given that the null hypothesis is true. The resulting test regression is

$\displaystyle \Delta y_{it} = \beta t + \varrho (y_{i,t-1}-y_{i1}) + \alpha_1 \Delta y_{i,t-1} + \cdots + \alpha_p \Delta y_{i,t-p} + u_{it}^*$ (12.26)

and the OLS regression yields an asymptotically normally distributed $ t$-statistic as $ N\to \infty$. Note that this test procedure is asymptotically valid for fixed $ T$.
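The pooled regression can be illustrated with a minimal Python sketch. It is not the panunit implementation: it assumes the simplest case of (12.26), with no trend and no lagged differences ($ p=0$), and the function names and the data generator are hypothetical.

```python
import math
import random

def breitung_meyer_t(panel):
    """Pooled OLS estimate and t-statistic of rho in
    dy_it = rho * (y_{i,t-1} - y_{i1}) + u_it   (assumed: no trend, p = 0).

    `panel` is a list with one series (list of floats) per cross-section unit.
    """
    x, dy = [], []
    for y in panel:
        for t in range(1, len(y)):
            x.append(y[t - 1] - y[0])   # lagged level minus the unit's first observation
            dy.append(y[t] - y[t - 1])  # first difference
    sxx = sum(a * a for a in x)
    rho = sum(a * b for a, b in zip(x, dy)) / sxx
    resid = [b - rho * a for a, b in zip(x, dy)]
    s2 = sum(r * r for r in resid) / (len(resid) - 1)  # residual variance
    return rho, rho / math.sqrt(s2 / sxx)

def simulate_panel(n_units, n_periods, phi, seed=0):
    """Hypothetical data generator: AR(1) around a unit-specific mean.

    phi = 1 gives random walks, so the unit root null holds.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        mu = rng.gauss(0.0, 5.0)
        y = [mu + rng.gauss(0.0, 1.0)]
        for _ in range(n_periods - 1):
            y.append(mu + phi * (y[-1] - mu) + rng.gauss(0.0, 1.0))
        panel.append(y)
    return panel

# Large N, small fixed T, in line with the asymptotics of the test.
rho_null, t_null = breitung_meyer_t(simulate_panel(200, 10, phi=1.0))
rho_alt, t_alt = breitung_meyer_t(simulate_panel(200, 10, phi=0.5))
print(f"unit root:  rho = {rho_null:.3f}, t = {t_null:.2f}")
print(f"stationary: rho = {rho_alt:.3f}, t = {t_alt:.2f}")
```

Because the first observation is subtracted from the lagged level, adding a unit-specific constant to a series leaves the statistic unchanged, which is how the individual intercepts drop out of the regression.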

Levin and Lin (1993) extend the test procedure to individual specific time trends and short run dynamics. At the first stage the individual specific parameters are ``partialled out'' by forming the residuals $ e_{it}$ and $ v_{i,t-1}$ from a regression of $ \Delta y_{it}$ and $ y_{i,t-1}$ on the deterministics and the lagged differences. To account for heteroscedasticity the residuals are adjusted for their standard deviations yielding $ \widetilde e_{it}$ and $ \widetilde v_{i,t-1}$. The final regression is of the form

$\displaystyle \widetilde e_{it} = \varrho \widetilde v_{i,t-1} + \nu_{it}.$

If there are no deterministics in the first-stage regressions, the resulting $ t$-statistic for the hypothesis $ \varrho=0$ is asymptotically standard normally distributed as $ T\to \infty$ and $ N\to \infty$. However, if there is a constant or a time trend in the model, then the second-stage $ t$-statistic tends to infinity as $ T\to \infty$, even if the null hypothesis is true. Levin and Lin (1993) suggest a correction of the $ t$-statistic to remove the bias and to obtain an asymptotic standard normal distribution for the test statistic.
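The two-stage procedure can be sketched in Python for the case without deterministics. This is an illustration under stated assumptions, not the panunit code: it fixes $ p=1$, scales both residual series by the standard deviation of the unit's own second-stage residuals (one common choice of normalization, assumed here), and omits the bias correction needed when a constant or trend is present. All function names are hypothetical.

```python
import math
import random

def ols_no_intercept(x, y):
    """Slope and residuals from regressing y on the single regressor x (no intercept)."""
    b = sum(a * c for a, c in zip(x, y)) / sum(a * a for a in x)
    return b, [c - b * a for a, c in zip(x, y)]

def levin_lin_t(panel):
    """Two-stage pooled estimate and t-statistic for rho = 0 (assumed: d = 0, p = 1)."""
    e_all, v_all = [], []
    for y in panel:
        dy = [y[t] - y[t - 1] for t in range(1, len(y))]
        d_now, d_lag, level_lag = dy[1:], dy[:-1], y[1:-1]
        # Stage 1: partial the lagged difference out of dy_t and y_{t-1}.
        _, e = ols_no_intercept(d_lag, d_now)
        _, v = ols_no_intercept(d_lag, level_lag)
        # Scale by the unit's residual standard deviation (assumed normalization).
        _, u = ols_no_intercept(v, e)
        sigma = math.sqrt(sum(r * r for r in u) / (len(u) - 1))
        e_all.extend(r / sigma for r in e)
        v_all.extend(r / sigma for r in v)
    # Stage 2: pooled regression of the scaled residuals.
    rho, resid = ols_no_intercept(v_all, e_all)
    s2 = sum(r * r for r in resid) / (len(resid) - 1)
    return rho, rho / math.sqrt(s2 / sum(a * a for a in v_all))

def simulate_panel(n_units, n_periods, phi, seed=0):
    """Hypothetical zero-mean AR(1) panel with unit-specific error scales."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        scale = rng.uniform(0.5, 2.0)  # heteroscedasticity across units
        y = [0.0]
        for _ in range(n_periods - 1):
            y.append(phi * y[-1] + rng.gauss(0.0, scale))
        panel.append(y)
    return panel

rho_null, t_null = levin_lin_t(simulate_panel(30, 60, phi=1.0))
rho_alt, t_alt = levin_lin_t(simulate_panel(30, 60, phi=0.5))
print(f"unit root:  rho = {rho_null:.3f}, t = {t_null:.2f}")
print(f"stationary: rho = {rho_alt:.3f}, t = {t_alt:.2f}")
```

The scaling step makes the pooled statistic invariant to the error scale of each unit, so units with large variances do not dominate the second-stage regression.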

Another way to deal with the bias problem of the $ t$-statistic is to adopt a different adjustment for the constant and the time trend. The resulting test statistics are called modified Levin-Lin statistics. In the model with a constant term only, the constant can be removed by using $ (y_{i,t-1}-y_{i1})$ instead of $ y_{i,t-1}$. The first-stage regression uses only the lagged differences as regressors. At the second stage, the regression is

$\displaystyle \widetilde e_{it} = \varrho (\widetilde v_{i,t-1}-\widetilde v_{i1}) + \nu_{it}$

and the resulting $ t$-statistic for $ \varrho=0$ is asymptotically standard normal as $ T\to \infty$ and $ N\to \infty$. If there is a linear trend in the model, the nuisance parameters are removed by estimating the current trend value from past values of the process only. Accordingly, the series are adjusted according to

$\displaystyle \widetilde e_{it}^* = \widetilde e_{it} - \widetilde e_{i1},$

$\displaystyle \widetilde v_{i,t-1}^* = \widetilde v_{i,t-1} - \widetilde v_{i1} - \frac{t+1}{t-2} \sum_{s=2}^{t-1} \Delta \widetilde v_{is}\,.$

Again, the resulting modification yields a $ t$-statistic with a standard normal limiting distribution.

Im, Pesaran, and Shin (1997) further extended the test procedure by allowing for different values of $ \varrho_i$ under the alternative. Accordingly, all parameters are estimated separately for each cross-section unit. Let $ \tau_i$ denote the individual $ t$-statistic for the hypothesis $ \varrho_i =0$. As $ T\to \infty$ and $ N\to \infty$, we have

$\displaystyle \sqrt N (\bar \tau - \mu_\tau) = N^{-1/2} \sum_{i=1}^N (\tau_i - \mu_\tau) \ {\buildrel d \over \longrightarrow} \ {\cal N} (0 , v_\tau),$

where $ \mu_\tau$ and $ v_\tau$ are the expectation and variance of the statistic $ \tau_i$. Im, Pesaran, and Shin (1997) present the means and variances for a wide range of $ T$ and $ p$. These values are interpolated by regression functions of $ \sqrt T$, $ T$, $ T^2$, and $ p$.
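As a rough illustration of how the individual statistics are combined, the following Python sketch computes a Dickey-Fuller $ t$-statistic for each unit and standardizes their average with $ \mu_\tau$ and $ v_\tau$. It assumes a constant and no lagged differences ($ p=0$); the function names are hypothetical, and the moments used are placeholders chosen for the example, not the values tabulated by Im, Pesaran, and Shin (1997).

```python
import math
import random

def df_tau(y):
    """t-statistic of rho in  dy_t = mu + rho * y_{t-1} + u_t  for one unit (constant, p = 0)."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    dy_bar, lag_bar = sum(dy) / n, sum(lag) / n
    xc = [a - lag_bar for a in lag]  # demeaning absorbs the constant
    yc = [a - dy_bar for a in dy]
    sxx = sum(a * a for a in xc)
    rho = sum(a * b for a, b in zip(xc, yc)) / sxx
    s2 = sum((b - rho * a) ** 2 for a, b in zip(xc, yc)) / (n - 2)
    return rho / math.sqrt(s2 / sxx)

def standardized_tbar(taus, mu_tau, v_tau):
    """sqrt(N) * (mean of tau_i - mu_tau) / sqrt(v_tau)."""
    n = len(taus)
    return math.sqrt(n) * (sum(taus) / n - mu_tau) / math.sqrt(v_tau)

# Placeholder moments for illustration only; they are NOT the tabulated values.
MU_TAU, V_TAU = -1.5, 0.7

# Hypothetical data: 25 stationary AR(1) units with unit-specific means.
rng = random.Random(0)
panel = []
for _ in range(25):
    mu = rng.gauss(0.0, 3.0)
    y = [mu]
    for _ in range(49):
        y.append(mu + 0.5 * (y[-1] - mu) + rng.gauss(0.0, 1.0))
    panel.append(y)

taus = [df_tau(y) for y in panel]
print(f"standardized t-bar = {standardized_tbar(taus, MU_TAU, V_TAU):.2f}")
```

Since each unit has its own regression, the autoregressive parameter may differ across units under the alternative; only the $ t$-statistics are pooled. Large negative values of the standardized statistic speak against the unit root hypothesis.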

The quantlet computing all these unit root statistics is called

  output = panunit(z,m,p,d {,T})
The parameters necessary for computing the statistics are as follows. The parameter m gives the number of the variable in the data set that is to be tested for a unit root. The parameter p gives the number of lagged differences in the model. The parameter d selects the deterministic terms used in the regressions: d=0 implies that there is no deterministic term in the model, d=1 includes a constant term, and d=2 includes a linear time trend. Finally, if a balanced panel data set is used, the common number of time periods T is given. For example, assume that the second variable in a balanced data set with T=32 is to be tested, including a constant and one lagged difference. Then the respective command is
  output = panunit(z,2,1,1,32)
The string output first gives an output table of a pooled Dickey-Fuller regression. In a second table, the four unit root statistics are presented.