14.2 Hurst and Rescaled Range Analysis

Hurst ($ 1880$-$ 1978$) was an English hydrologist who worked in the early $ 20$th century on the Nile River Dam project. When designing a dam, the yearly changes in water level are of particular concern, since the dam's storage capacity must be adapted to the natural environment. Studying an Egyptian $ 847$-year record of the Nile River's overflows, Hurst observed that flood occurrences could be characterized as persistent, i.e. heavier floods were followed by above-average flood occurrences, while below-average occurrences were followed by minor floods. In the course of these findings he developed the Rescaled Range (R/S) Analysis.

We observe a stochastic process $ Y_t$ at time points $ t\in{\cal I}=\{0,\ldots,N\}$. Let $ n$ be an integer that is small relative to $ N$, and let $ A$ denote the integer part of $ N/n$. Divide the `interval' $ {\cal I}$ into $ A$ consecutive `subintervals', each of length $ n$ and with overlapping endpoints. In every subinterval correct the original datum $ Y_t$ for location, using the mean slope of the process in the subinterval, obtaining $ Y_t-(t/n)\,(Y_{an}-Y_{(a-1)n})$ for all $ t$ with $ (a-1)n \leq t \leq an$ and for all $ a = 1, \ldots , A$. Over the $ a$'th subinterval $ {\cal I}_a=\{(a-1)n,(a-1)n+1,\ldots,an\}$, for $ 1\leq a\leq A$, construct the smallest box (with sides parallel to the coordinate axes) such that the box contains all the fluctuations of $ Y_t-(t/n)\,(Y_{an}-Y_{(a-1)n})$ that occur within $ {\cal I}_a$. Then, the height of the box equals

$\displaystyle R_a = \max_{(a-1)n\leq t\leq an}
\left\{Y_{t}-\frac{t}{n}(Y_{an}-Y_{(a-1)n})\right\}
- \min_{(a-1)n\leq t\leq an}
\left\{Y_{t}-\frac{t}{n}(Y_{an}-Y_{(a-1)n})\right\}.$

Figure 14.1 illustrates the procedure.

Figure 14.1: The construction of the boxes in the R/S analysis.
\includegraphics[width=1.2\defpicwidth]{RSplot.ps}
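
The box construction can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `box_heights` and the representation of the observations as a plain list `y` with `y[t]` standing for $ Y_t$, $ t=0,\ldots,N$, are assumptions made here for the example.

```python
def box_heights(y, n):
    """Return [R_1, ..., R_A] for observations y[0..N] and subinterval length n.

    Hypothetical helper: y[t] plays the role of Y_t, and A is the
    integer part of N/n, as in the text.
    """
    N = len(y) - 1                 # time points are t = 0, ..., N
    A = N // n                     # number of subintervals
    heights = []
    for a in range(1, A + 1):
        lo, hi = (a - 1) * n, a * n
        slope = (y[hi] - y[lo]) / n            # mean slope over the subinterval
        # Location-corrected values Y_t - (t/n)(Y_an - Y_(a-1)n) for t in I_a.
        corrected = [y[t] - t * slope for t in range(lo, hi + 1)]
        heights.append(max(corrected) - min(corrected))
    return heights


if __name__ == "__main__":
    # A pure linear trend is removed completely, so every box has height zero.
    print(box_heights([2 * t for t in range(9)], 4))   # [0.0, 0.0]
    # A zig-zag with no net drift keeps its full range in each subinterval.
    print(box_heights([0, 1, 0, 1, 0], 2))             # [1.0, 1.0]
```

Neighbouring subintervals share an endpoint, which is why each inner loop runs over $ n+1$ time points.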

Let $ S_a$ denote the empirical standard error of the $ n$ variables $ Y_t-Y_{t-1}$, for $ (a-1)n+1\leq t\leq an$. If the process $ Y$ is stationary then $ S_a$ varies little with $ a$; in other cases, dividing $ R_a$ by $ S_a$ corrects for the main effects of scale inhomogeneity in both spatial and temporal domains.

The total area of the boxes, corrected for scale, is proportional in $ n$ to

$\displaystyle \Big({R\over S}\Big)_{\!\!n}:=A^{-1}\,\sum_{a=1}^A\,{R_a\over S_a}\,. \qquad (2.1)$
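
As a rough sketch, the statistic in (2.1) could be computed as follows. The function name `rescaled_range` is hypothetical, and one choice here is an assumption: $ S_a$ is computed with `statistics.pstdev`, i.e. the standard deviation of the $ n$ increments with divisor $ n$; a divisor of $ n-1$ would be an equally defensible reading.

```python
import statistics


def rescaled_range(y, n):
    """Average of R_a / S_a over the A subintervals of length n (eq. 2.1).

    Hypothetical helper; y[t] plays the role of Y_t for t = 0, ..., N.
    """
    N = len(y) - 1
    A = N // n
    if A < 1:
        raise ValueError("n must not exceed N")
    total = 0.0
    for a in range(1, A + 1):
        lo, hi = (a - 1) * n, a * n
        slope = (y[hi] - y[lo]) / n
        corrected = [y[t] - t * slope for t in range(lo, hi + 1)]
        R_a = max(corrected) - min(corrected)          # height of the a-th box
        # The n increments Y_t - Y_(t-1) for (a-1)n + 1 <= t <= an.
        increments = [y[t] - y[t - 1] for t in range(lo + 1, hi + 1)]
        S_a = statistics.pstdev(increments)            # assumption: divisor n
        if S_a == 0:
            raise ValueError("increments are constant in subinterval %d" % a)
        total += R_a / S_a
    return total / A


if __name__ == "__main__":
    y = [0, 1, 0, 1, 0]
    print(rescaled_range(y, 2))                    # 1.0
    # Rescaling the data leaves the statistic unchanged.
    print(rescaled_range([3 * v for v in y], 2))   # 1.0
```

The second call illustrates the purpose of dividing by $ S_a$: multiplying the data by a constant scales $ R_a$ and $ S_a$ by the same factor.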

The slope $ {\hat H}$ of the regression of $ \log(R/S)_n$ on $ \log n$, for $ k$ values of $ n$, may be taken as an estimator of the Hurst constant $ H$ describing long-range dependence of the process $ Y$; see Beran (1994) and Peters (1994).
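
The regression step can be illustrated with a short, self-contained Python sketch. All names (`rescaled_range`, `ols_slope`, `hurst_estimate`, `random_walk`) are hypothetical. The use of `statistics.pstdev` for $ S_a$, the grid of $ n$ values, and the simulated Gaussian random walk are assumptions chosen only for the example.

```python
import math
import random
import statistics


def rescaled_range(y, n):
    """(R/S)_n: average of R_a / S_a over the subintervals of length n."""
    A = (len(y) - 1) // n
    total = 0.0
    for a in range(1, A + 1):
        lo, hi = (a - 1) * n, a * n
        slope = (y[hi] - y[lo]) / n
        corrected = [y[t] - t * slope for t in range(lo, hi + 1)]
        increments = [y[t] - y[t - 1] for t in range(lo + 1, hi + 1)]
        # Assumption: S_a is the standard deviation with divisor n.
        total += (max(corrected) - min(corrected)) / statistics.pstdev(increments)
    return total / A


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (with intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def hurst_estimate(y, ns):
    """Slope of log (R/S)_n on log n over the k values of n in ns."""
    log_n = [math.log(n) for n in ns]
    log_rs = [math.log(rescaled_range(y, n)) for n in ns]
    return ols_slope(log_n, log_rs)


def random_walk(N, seed=0):
    """Simulated Y_0, ..., Y_N with independent standard normal increments."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(N):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    return y


if __name__ == "__main__":
    y = random_walk(4096)
    print(hurst_estimate(y, [16, 32, 64, 128, 256]))
```

For a process with independent increments the estimate should lie in the neighbourhood of 0.5, although in samples of this size it can deviate noticeably from that value.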

If the process $ Y$ is stationary then correction for scale is not strictly necessary, and we may take each $ S_a$ to be the constant 1. In that case the R-S statistic $ {\hat H}$ is a version of the box-counting estimator that is widely used in physical science applications, Carter et al. (1988), Sullivan and Hunt (1988) and Hunt (1990). The box-counting estimator is related to the capacity definition of fractal dimension, Barnsley (1988) p. 172ff, and the R-S estimator may be interpreted in the same way. Statistical properties of the box-counting estimator have been discussed by Hall and Wood (1993).

A more detailed analysis, exploiting dependence among the errors in the regression of $ \log(R/S)_n$ on $ \log n$, may be undertaken in place of R-S analysis. See Kent and Wood (1997) for a version of this approach in the case where scale correction is unnecessary. However, as Kent and Wood show, the advantages of the approach tend to be asymptotic in character, and sample sizes may need to be extremely large before real improvements are obtained.

Hurst used the coefficient $ H$ as an index for the persistence of the time series considered. For $ 0.5<H<1$, the series is positively persistent and characterized by `long memory' effects, as described in the next section. A rather informal interpretation used by practitioners is that $ H$ measures the chance of successive movements having the same sign, Peters (1994). For $ H>0.5$, an upward movement is more likely to be followed by another upward movement, and a downward movement is more likely to be followed by another downward movement. For $ H<0.5$, a movement is more likely to be reversed by a movement of the opposite sign, which implies mean-reverting behavior.