14.2 Model-Independent Tests for $I(0)$ against $I(d)$

A stochastic process is $I(d)$ if it needs to be differenced $d$ times in order to become $I(0)$. We shall test for $I(0)$ against fractional alternatives by using a more formal definition.

In a first approach, we define a stochastic process $\{Y_t\}$ as $I(0)$ if its normalized partial sums follow a particular distribution. All we require is the existence of a consistent estimator of the variance for normalizing the partial sums. The tests presented here make use of the Newey and West (1987) heteroskedasticity and autocorrelation consistent (HAC) estimator of the variance, defined as

$\displaystyle \hat{\sigma}^2_T(q) = \hat{\gamma}_0 + 2 \sum_{j=1}^q \left( 1 - \frac{j}{1+q}\right) \hat{\gamma}_j, \quad q < T,$ (14.3)

where $\hat{\gamma}_0$ is the variance of the process, and the sequence $\{\hat{\gamma}_j\}_{j=1}^q$ denotes the autocovariances of the process up to order $q$. This spectral-based HAC variance estimator depends on the user-chosen truncation lag $q$. Andrews (1991) has proposed a selection rule for the order $q$.
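For reference, here is a minimal NumPy sketch of equation (14.3); it illustrates the formula only and is not the source of the XploRe quantlet introduced below, and demeaning the series before computing the autocovariances is an assumption of this sketch.

  import numpy as np

  def newey_west(y, q):
      # HAC variance estimator of eq. (14.3) with Bartlett weights 1 - j/(q+1)
      e = np.asarray(y, dtype=float) - np.mean(y)
      T = len(e)
      # sample autocovariances gamma_0, ..., gamma_q
      gamma = np.array([e[j:] @ e[:T - j] / T for j in range(q + 1)])
      w = 1.0 - np.arange(1, q + 1) / (q + 1.0)
      return gamma[0] + 2.0 * w @ gamma[1:]

Evaluating newey_west(y, q) for q in (5, 10, 25, 50) mimics the default orders of the quantlet described next.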

The quantlet neweywest computes the Newey and West (1987) estimator of the variance of a unidimensional process. Its syntax is:

  sigma = neweywest(y{, q})
where the input parameters are:
y
the series of observations
q
optional parameter, which can be either a vector of truncation lags or a single scalar
The HAC estimator is calculated for all the orders included in the parameter q. If no optional parameter is provided, the HAC estimator is evaluated for the default orders q = 5, 10, 25, 50. The estimated HAC variances are returned in the vector sigma.

In the following example, the HAC variance of the first 2000 observations of the 20-minute spaced sample of Deutschmark-Dollar FX rates is computed.

  library("times")
  y = read("dmus58.dat")
  y = y[1:2000]
  q = 5|10|25|50
  sigma = neweywest(y,q)
  q~sigma
XAGlongmem01.xpl

As output we get
  Contents of _tmp

  [1,]        5  0.0047841 
  [2,]       10  0.008743 
  [3,]       25  0.020468 
  [4,]       50  0.039466


14.2.1 Robust Rescaled Range Statistic

The first test for long-memory was devised by the hydrologist Hurst (1951) for the design of an optimal reservoir for the Nile river, whose flow regimes were persistent. Although Mandelbrot (1975) gave a formal justification for the use of this test, Lo (1991) demonstrated that this statistic was not robust to short-range dependence, and proposed the following modified statistic:

$\displaystyle Q_T = \frac{1}{\hat{\sigma}_T(q)} \left[\max_{1\le k\le T} \sum_{j=1}^k (X_j - \overline{X}_T) - \min_{1\le k\le T} \sum_{j=1}^k (X_j - \overline{X}_T) \right]$ (14.4)

which consists of replacing the variance by the HAC variance estimator in the denominator of the statistic. If $q=0$, Lo's statistic reduces to Hurst's $R/S$ statistic. Unlike spectral analysis, which detects periodic cycles in a series, the $R/S$ analysis has been advocated by Mandelbrot for detecting nonperiodic cycles. Under the null hypothesis of no long-memory, the statistic $T^{-\frac{1}{2}} Q_T$ converges in distribution to the range of a Brownian bridge on the unit interval:

$\displaystyle \max_{0\le t \le 1} W^0(t) - \min_{0\le t \le 1} W^0(t),$

where $ W^0(t)$ is a Brownian bridge defined as $ W^0(t) = W(t) -t W(1)$, $ W(t)$ being the standard Brownian motion. The distribution function is given in Siddiqui (1976), and is tabulated in Lo (1991).
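To make the construction concrete, here is a minimal NumPy sketch of the normalized statistic $T^{-1/2} Q_T$, repeating the HAC helper of equation (14.3) so that the snippet runs on its own; it is an illustration, not the lo quantlet used below.

  import numpy as np

  def newey_west(y, q):
      # eq. (14.3), as in the earlier sketch
      e = np.asarray(y, dtype=float) - np.mean(y)
      T = len(e)
      gamma = np.array([e[j:] @ e[:T - j] / T for j in range(q + 1)])
      return gamma[0] + 2.0 * (1.0 - np.arange(1, q + 1) / (q + 1.0)) @ gamma[1:]

  def lo_stat(x, q):
      # normalized Lo statistic T^(-1/2) Q_T of eq. (14.4)
      x = np.asarray(x, dtype=float)
      T = len(x)
      s = np.cumsum(x - x.mean())   # partial sums of deviations from the mean
      return (s.max() - s.min()) / np.sqrt(T * newey_west(x, q))

The resulting value is compared with the fractiles of the range of the Brownian bridge tabulated in Lo (1991).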

This statistic is extremely sensitive to the truncation order $q$, but there is no statistical criterion for choosing $q$ in the framework of this statistic; Andrews' (1991) rule gives mixed results. If $q$ is too small, the estimator does not account for the autocorrelation of the process, while if $q$ is too large, it accounts for any form of autocorrelation and the power of the test tends to its size. Given that the power of a useful test should be greater than its size, this statistic is not very helpful on its own. For that reason, Teverovsky, Taqqu and Willinger (1999) suggest using this statistic in conjunction with other tests.

Since there is no data-driven guidance for the choice of this parameter, we consider the default values $q$ = 5, 10, 25, 50. XploRe users have the option to provide their own vector of truncation lags.

Let us consider again the series of absolute returns on the 20-minute spaced Deutschmark-Dollar FX rates.

  library("times")
  y = read("dmus58.dat")
  ar = abs(tdiff(y[1:2000]))
  lostat = lo(ar)
  lostat
XAGlongmem02.xpl

Given that we do not provide a vector of truncation lags, Lo's statistic is computed for the default truncation lags. The results are displayed in the form of a table: the first column contains the truncation orders, the second column contains the computed statistic. If the computed statistic is outside the 95% confidence interval for no long-memory, a star $\ast$ is displayed after that statistic.
  Contents of lostat

  [1,] " Order   Statistic"
  [2,] "__________________ "
  [3,] ""
  [4,] "    5     2.0012 *"
  [5,] "   10     1.8741 *"
  [6,] "   25     1.7490 "
  [7,] "   50     1.6839 "
This result illustrates the issue of the choice of the bandwidth parameter q. For q = 5 and 10, we reject the null hypothesis of no long-memory. However, for q = 25 or 50, this null hypothesis is accepted, as the power of the test is too low at these truncation orders.


14.2.2 The KPSS Statistic

Equivalently, we can test for $I(0)$ against fractional alternatives by using the KPSS test of Kwiatkowski, Phillips, Schmidt, and Shin (1992), as Lee and Schmidt (1996) have shown that this test has power equivalent to Lo's statistic against long-memory processes. The two KPSS statistics, denoted by $\eta_t$ and $\eta_\mu$, are respectively based on the residuals of two regression models: on an intercept and a trend $t$, and on a constant $\mu$. If we denote by $S_t$ the partial sums $S_t = \sum_{i=1}^t \hat{e}_i$, where $\hat{e}_t$ are the residuals of these regressions, the KPSS statistic is defined by:

$\displaystyle \eta = T^{-2} \sum_{t=1}^T S^2_t/ \hat{\sigma}^2_T(q)$ (14.5)

where $ \hat{\sigma}^2_T(q)$ is the HAC estimator of the variance of the residuals defined in equation (14.3). The statistic $ \eta_\mu$ tests for stationarity against a long-memory alternative, while the statistic $ \eta_t$ tests for trend-stationarity against a long-memory alternative.
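A minimal NumPy sketch of equation (14.5) may help fix ideas; the least-squares detrending via np.polyfit is an assumption about the regression step, and this is not the code of the kpss quantlet presented below.

  import numpy as np

  def newey_west(e, q):
      # eq. (14.3), repeated so the snippet is self-contained
      e = np.asarray(e, dtype=float) - np.mean(e)
      T = len(e)
      gamma = np.array([e[j:] @ e[:T - j] / T for j in range(q + 1)])
      return gamma[0] + 2.0 * (1.0 - np.arange(1, q + 1) / (q + 1.0)) @ gamma[1:]

  def kpss_stat(y, q, trend=False):
      # eta_mu (trend=False) or eta_t (trend=True) of eq. (14.5)
      y = np.asarray(y, dtype=float)
      T = len(y)
      if trend:
          t = np.arange(1.0, T + 1.0)
          e = y - np.polyval(np.polyfit(t, y, 1), t)  # residuals of trend regression
      else:
          e = y - y.mean()          # residuals of regression on a constant
      S = np.cumsum(e)              # partial sums S_t
      return (S @ S) / T**2 / newey_west(e, q)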

The quantlet kpss computes both statistics. The default bandwidths, denoted by $L_0$, $L_4$ and $L_{12}$, are the ones given in Kwiatkowski, Phillips, Schmidt, and Shin (1992). We evaluate both tests on the series of absolute returns ar as follows:

  library("times")
  y = read("dmus58.dat")
  ar = abs(tdiff(y[1:2000]))
  kpsstest = kpss(ar)
  kpsstest
XAGlongmem03.xpl

The quantlet kpss returns the results in the form of a table. The first column contains the truncation order, the second column contains the type of the test: const denotes the test for stationarity around a constant, while trend denotes the test for trend stationarity. The third column contains the computed statistic. If this statistic exceeds the 95% critical value, a $\ast$ symbol is displayed. The last column contains this critical value.

Thus, XploRe returns the following table:

   Contents of kpsstest

  [1,] "   Order   Test   Statistic  Crit. Value "
  [2,] "_________________________________________ "
  [3,] ""
  [4,] " L0 =  0   const    1.8259 *   0.4630"
  [5,] " L4 =  8   const    1.2637 *   0.4630"
  [6,] " L12= 25   const    1.0483 *   0.4630"
  [7,] " L0 =  0   trend    0.0882    0.1460"
  [8,] " L4 =  8   trend    0.0641    0.1460"
  [9,] " L12= 25   trend    0.0577    0.1460"


14.2.3 The Rescaled Variance $ V/S$ Statistic

Giraitis, Kokoszka and Leipus (1998) have proposed a centering of the KPSS statistic based on the partial sums of the deviations from the mean. They call it a rescaled variance test, $V/S$, as its expression, given by

$\displaystyle V/S = \frac{1}{T^2\hat{\sigma}^2_T(q)}\left[ \sum_{k=1}^T \left(\sum_{j=1}^k (Y_j - \overline{Y}_T)\right)^2 - \frac{1}{T} \left( \sum_{k=1}^T \sum_{j=1}^k (Y_j - \overline{Y}_T) \right)^2 \right]$ (14.6)

can equivalently be rewritten as

$\displaystyle V/S = T^{-1} \frac{\hat{V}(S_1,\ldots,S_T)}{\hat{\sigma}^2_T(q)},$ (14.7)

where $S_k = \sum_{j=1}^k (Y_j - \overline{Y}_T)$ are the partial sums of the observations. The $V/S$ statistic is thus, up to normalization, the sample variance of the series of partial sums $\{S_t\}_{t=1}^T$. The limiting distribution of this statistic is a functional of a Brownian bridge, which is linked to the distribution of the Kolmogorov statistic. This statistic has uniformly higher power than the KPSS statistic, and is less sensitive than Lo's statistic to the choice of the order $q$. For $2 \le q \le 10$, the $V/S$ statistic can appropriately detect the presence of long-memory in the levels series, although, like most tests and estimators, this test may wrongly detect the presence of long-memory in series with shifts in the levels. Giraitis, Kokoszka and Leipus (1998) have shown that this statistic can be used for the detection of long-memory in the volatility for the class of ARCH($\infty$) processes.
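Following equation (14.7), the statistic is the sample variance of the partial sums divided by $T\hat{\sigma}^2_T(q)$; here is a minimal NumPy sketch under the same assumptions as the earlier ones (an illustration, not the rvlm quantlet).

  import numpy as np

  def newey_west(y, q):
      # eq. (14.3), repeated so the snippet is self-contained
      e = np.asarray(y, dtype=float) - np.mean(y)
      T = len(e)
      gamma = np.array([e[j:] @ e[:T - j] / T for j in range(q + 1)])
      return gamma[0] + 2.0 * (1.0 - np.arange(1, q + 1) / (q + 1.0)) @ gamma[1:]

  def vs_stat(y, q):
      # V/S statistic of eq. (14.7)
      y = np.asarray(y, dtype=float)
      T = len(y)
      S = np.cumsum(y - y.mean())             # partial sums S_1, ..., S_T
      var_S = S @ S / T - (S.sum() / T) ** 2  # sample variance of the partial sums
      return var_S / (T * newey_west(y, q))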

We evaluate the $V/S$ statistic with the quantlet rvlm, which has the following syntax:

  vstest = rvlm(ary{, q})
where
ary
is the series
q
is a vector of truncation lags. If this optional argument is not provided, then the default vector of truncation lags is used, with q = 0, 8, 25.
This quantlet returns the results in the form of a table: the first column contains the order of truncation $q$, the second column contains the estimated $V/S$ statistic. If this statistic is outside the 95% confidence interval for no long-memory, a $\ast$ symbol is displayed. The third column displays the 95% critical value. Thus the instruction
  library("times")
  y = read("dmus58.dat")
  ar = abs(tdiff(y[1:2000]))
  vstest = rvlm(ar)
  vstest
XAGlongmem04.xpl

returns
  Contents of vstest 

  [1,] "   Order  Statistic  Crit. Value "
  [2,] "_________________________________"
  [3,] ""
  [4,] "      0    0.3305 *   0.1869"
  [5,] "      8    0.2287 *   0.1869"
  [6,] "     25    0.1897 *   0.1869"


14.2.4 Nonparametric Test for $ I(0)$

The nonparametric test of Lobato and Robinson (1998) for $I(0)$ against $I(d)$ is also based on the approximation (14.2) of the spectrum of a long-memory process. In the univariate case, the $t$ statistic is equal to:

$\displaystyle t = m^{1/2}\, \hat{C}_1/\hat{C}_0 \quad \textrm{with} \quad \hat{C}_k = \frac{1}{m} \sum_{j=1}^m \nu_j^k I(\lambda_j) \quad \textrm{and} \quad \nu_j = \ln(j) - \frac{1}{m} \sum_{i=1}^m \ln(i),$ (14.8)

where $I(\lambda) = (2\pi T)^{-1}\vert \sum_{t=1}^T y_t e^{it\lambda} \vert^2$ is the periodogram estimated on a degenerate band of Fourier frequencies $\lambda_j = 2 \pi j/T$, $j=1,\ldots,m \ll [T/2]$, where $m$ is a bandwidth parameter. Under the null hypothesis of an $I(0)$ time series, the $t$ statistic is asymptotically normally distributed. This two-sided test is of interest as it allows one to discriminate between $d>0$ and $d<0$: if the $t$ statistic falls in the lower fractile of the standard normal distribution, the series exhibits long-memory, whilst if it falls in the upper fractile, the series is antipersistent.
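Here is a minimal NumPy sketch of the $t$ statistic in equation (14.8); the brute-force periodogram is chosen for clarity over speed, and the sketch illustrates the formula rather than reproducing the lobrob quantlet.

  import numpy as np

  def lobato_robinson(y, m):
      # t statistic of eq. (14.8); values in the lower tail signal d > 0
      y = np.asarray(y, dtype=float)
      T = len(y)
      j = np.arange(1, m + 1)
      lam = 2.0 * np.pi * j / T       # Fourier frequencies lambda_j
      t = np.arange(1, T + 1)
      # periodogram I(lambda_j) = |sum_t y_t exp(i t lambda_j)|^2 / (2 pi T)
      I = np.abs(np.exp(1j * np.outer(lam, t)) @ y) ** 2 / (2.0 * np.pi * T)
      nu = np.log(j) - np.mean(np.log(j))
      C0, C1 = np.mean(I), np.mean(nu * I)
      return np.sqrt(m) * C1 / C0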

The quantlet lobrob evaluates the Lobato-Robinson test. Its syntax is as follows:

  l = lobrob(ary{, m})
where
ary
is the series,
m
is the vector of bandwidth parameters. If this optional argument is missing, the default bandwidth suggested by Lobato and Robinson is used.
The results are displayed in the form of a table: the first column contains the value of the bandwidth parameter while the second column displays the corresponding statistic. In the following example, the Lobato-Robinson statistic is evaluated by using this default bandwidth:
  library("times")
  y = read("dmus58.dat")
  ar = abs(tdiff(y[1:2000]))
  l = lobrob(ar)
  l
XAGlongmem05.xpl

which yields
  Contents of l

  [1,] "Bandwidth   Statistic "
  [2,] "_____________________ "
  [3,] ""
  [4,] "   334      -4.4571"

In the next example, we provide a vector of bandwidths m, and evaluate the statistic for all the elements of m. The sequence of instructions:

  library("times")
  y = read("dmus58.dat")
  ar = abs(tdiff(y[1:2000]))
  m = #(100,150,200)
  l = lobrob(ar,m)
  l
XAGlongmem06.xpl

returns the following table:
  Contents of l

  [1,] "Bandwidth   Statistic "
  [2,] "_____________________ "
  [3,] ""
  [4,] "   100      -1.7989"
  [5,] "   150      -2.9072"
  [6,] "   200      -3.3308"