The bootstrap for dependent data is a lively research area. Many ideas are around and have led to quite different proposals. In this section we do not give a detailed overview and description of the different proposals; we only sketch the main ideas. Models for dependent data may differ in principle from i.i.d. models: for dependent data the data generating process is often not fully specified, and then there exists no unique natural way of resampling. The resampling should be carried out in such a way that the dependence structure is captured. This is easily done for classical finite-dimensional ARMA models with i.i.d. residuals: in these models the resamples can be generated by fitting the parameters and by using i.i.d. residuals in the resampling. We will discuss the situation where no finite-dimensional model is assumed. For other overviews of the bootstrap in time series analysis, see [12,30,70] and the time series chapter in  and the book . In particular,  give an overview of the higher order performance of the different resampling schemes.
The most popular bootstrap methods for dependent data are the block, sieve, local, wild and Markov bootstrap, and subsampling. All of them are nonparametric procedures.
The method that works under the weakest assumptions is subsampling. It is used to approximate the distribution of an estimate of an unknown parameter. In subsampling, subsamples of consecutive observations of a fixed length are taken. These subsamples are drawn randomly from the whole time series, and for each subsample the estimate is recalculated. If it is known that the suitably normalized statistic has a limiting distribution, then under very weak conditions the conditional distribution of the correspondingly normalized subsample statistics has the same limiting distribution. Higher order considerations show that subsampling has a very poor rate of convergence, see . It does not even achieve the rate of convergence of a normal approximation. It may be argued that this poor performance is the price for its quite universal applicability. Subsampling has also been used in i.i.d. settings where the classical bootstrap does not work. For a detailed discussion of subsampling see .
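For illustration, subsampling for the distribution of a normalized sample mean could be sketched as follows. All concrete choices here (block length, number of subsamples, the square-root normalization rate, and the simulated AR(1) example data) are our illustrative assumptions, not prescriptions from the text:

```python
import numpy as np

def subsample_distribution(x, stat, block_len, n_sub=500, seed=0):
    """Approximate the law of the normalized statistic by recomputing it
    on randomly chosen blocks of consecutive observations."""
    rng = np.random.default_rng(seed)
    theta_hat = stat(x)                      # full-sample estimate
    starts = rng.integers(0, len(x) - block_len + 1, size=n_sub)
    sub_stats = np.array([stat(x[s:s + block_len]) for s in starts])
    # centre at the full-sample estimate, rescale with the subsample rate
    return np.sqrt(block_len) * (sub_stats - theta_hat)

# usage on a simulated AR(1) series (hypothetical example data)
rng = np.random.default_rng(1)
x = np.zeros(1000)
for t in range(1, 1000):
    x[t] = 0.5 * x[t - 1] + rng.normal()
vals = subsample_distribution(x, np.mean, block_len=50)
ci = np.quantile(vals, [0.025, 0.975])       # subsampling quantiles
```

The subsampling quantiles can then be used, after rescaling, to build confidence intervals for the parameter.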
The basic idea of the block bootstrap is closely related to the i.i.d. nonparametric bootstrap. Both procedures are based on drawing observations with replacement. In the block bootstrap, however, instead of single observations, blocks of consecutive observations are drawn. This is done to capture the dependence structure of neighboring observations. Different versions of this idea have been proposed in [31,13,45,54] and . It has been shown that this approach works for a large class of stationary processes. The blocks of consecutive observations are drawn with replacement from a set of blocks. In the first proposal this was done for a set of nonoverlapping blocks of fixed length. Later papers proposed to use all (also overlapping) blocks of consecutive observations of that length (moving block bootstrap). The bootstrap resample is obtained by sampling blocks randomly with replacement and concatenating them into a time series of the original length. By construction, the bootstrap time series has a nonstationary (conditional) distribution. The resample becomes stationary if the block length is random and generated from a geometric distribution. This version of the block bootstrap is called the stationary bootstrap and was introduced in . Recently, [65,67] proposed another modification that uses tapering methods to smooth the effects of boundaries between neighboring blocks. With respect to higher order properties, the moving block bootstrap outperforms the version with nonoverlapping blocks, and both achieve a higher order accuracy than the stationary bootstrap (see [34,46,47,49]).
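A minimal sketch of one moving block bootstrap resample (the block length and the example data are illustrative choices of ours):

```python
import numpy as np

def moving_block_bootstrap(x, block_len, seed=0):
    """One bootstrap resample: draw overlapping blocks of consecutive
    observations with replacement and concatenate them, truncated to the
    original series length."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_blocks = -(-n // block_len)            # ceiling division
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    return np.concatenate([x[s:s + block_len] for s in starts])[:n]

x = np.arange(100, dtype=float)              # illustrative data
x_star = moving_block_bootstrap(x, block_len=10)
```

The stationary bootstrap would differ only in drawing each block length from a geometric distribution instead of keeping it fixed.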
The block bootstrap has turned out to be a very powerful method for dependent data. It does not achieve the accuracy of the bootstrap for i.i.d. data, but it outperforms subsampling. It works reasonably well under very weak conditions on the dependence structure and has been applied in a very broad range of applications. For the block bootstrap no specific assumption is made on the structure of the data generating process.
We now describe some methods that use more specific assumptions on the dependency structure.
The i.i.d. resampling can also be applied to models of dependent data where the stochastic structure is driven by i.i.d. innovations. The distribution of the innovations can be estimated from the fitted residuals; in the resampling, i.i.d. innovations can then be generated by i.i.d. resampling from this fitted distribution. An example is the linear autoregressive model (2.3) with i.i.d. innovations.
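For a finite order autoregression this residual based scheme can be sketched as follows. The order, least squares fitting and the simulated example data are illustrative choices of ours:

```python
import numpy as np

def ar_residual_bootstrap(x, p, seed=0):
    """Residual based bootstrap for a fitted AR(p) model: fit by least
    squares, resample the centred residuals i.i.d., and rebuild a
    bootstrap series from the fitted recursion."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # lagged design matrix: column j holds x shifted by lag j+1
    X = np.column_stack([x[p - j - 1:n - j - 1] for j in range(p)])
    coef, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    resid = x[p:] - X @ coef
    resid = resid - resid.mean()          # centre the fitted residuals
    eps_star = rng.choice(resid, size=n)  # i.i.d. innovation resampling
    x_star = np.empty(n)
    x_star[:p] = x[:p]                    # initialize with observed values
    for t in range(p, n):
        x_star[t] = x_star[t - p:t][::-1] @ coef + eps_star[t]
    return x_star

rng = np.random.default_rng(6)
x = np.zeros(400)
for t in range(1, 400):
    x[t] = 0.5 * x[t - 1] + rng.normal()
x_star = ar_residual_bootstrap(x, p=2)
```

The same scheme with the fitted order growing with the sample size yields the sieve bootstrap mentioned below.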
In a series of papers this approach has been studied for the case that model (2.3) holds only approximately. This is the case if the underlying time series is a stationary linear process, i.e. has an infinite order autoregressive representation. In this sieve bootstrap, autoregressive models of finite order are fitted, with the order growing with the sample size.
Another residual based bootstrap scheme has been proposed for a nonparametric autoregression model (2.6).
 also consider two other bootstrap procedures for the model (2.6): the regression bootstrap and the wild bootstrap. In the regression bootstrap, a nonparametric regression model is generated with (conditionally) fixed design. We describe this approach for the case of a homoscedastic autoregression model (2.7).
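A sketch of the regression bootstrap for a homoscedastic autoregression of order one. The Gaussian kernel, the bandwidth and the simulated data are illustrative assumptions of ours:

```python
import numpy as np

def nadaraya_watson(x0, xs, ys, h):
    """Kernel regression estimate of the autoregression function at x0."""
    w = np.exp(-0.5 * ((x0 - xs) / h) ** 2)
    return (w @ ys) / w.sum()

def regression_bootstrap(x, h, seed=0):
    """Regression bootstrap sketch: keep the observed lagged values as a
    fixed design and add i.i.d. resampled centred residuals to the
    fitted curve."""
    rng = np.random.default_rng(seed)
    xs, ys = x[:-1], x[1:]               # fixed design / responses
    m_hat = np.array([nadaraya_watson(u, xs, ys, h) for u in xs])
    resid = ys - m_hat
    resid = resid - resid.mean()         # centre the fitted residuals
    return xs, m_hat + rng.choice(resid, size=len(ys))

rng = np.random.default_rng(5)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.6 * x[t - 1] + rng.normal()
design, y_star = regression_bootstrap(x, h=0.5)
```

In contrast to the autoregression bootstrap, the design stays fixed at the observed lagged values; only the responses are regenerated.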
Modifications of the regression bootstrap are the local bootstrap () and the wild bootstrap. The wild bootstrap also uses a regression model with (conditionally) fixed covariables, but it is designed to work also for heteroscedastic errors. It was first proposed for regression models with independent but not identically distributed error variables, see [78,1]. For nonparametric models it was first proposed in . In the nonparametric autoregression model (2.7), wild bootstrap resamples are generated as in (2.8). But now each error variable is generated as the product of a centered fitted residual and a (conditionally) i.i.d. multiplier variable with conditional mean zero and conditional variance one (given the original sample). For achieving higher order accuracy it has also been proposed to use multipliers with conditional third moment equal to one. One could argue that in this resampling scheme the distribution of each individual error variable is fitted by the conditional distribution of the corresponding bootstrap error. Then as many different distributions are fitted as there are observations. This is the reason why in  this approach was called wild bootstrap. For a more detailed discussion of the wild bootstrap, see [52,53,56,57,58]. The asymptotic analysis of the wild bootstrap and other regression type bootstrap methods in model (2.7) is much simpler than that of the autoregression bootstrap: in the bootstrap world it only requires the mathematical analysis of a nonparametric regression model. Only the discussion of uniform nonparametric confidence bands remains rather complicated, because it involves strong approximations of the bootstrap nonparametric regression estimates by Gaussian processes, see . The wild bootstrap works under quite weak model assumptions. Essentially it is only assumed that the conditional expectation of an observation given the past is a smooth function of a finite number of the most recent observations. Generality has its price: resampling schemes that use more detailed modeling may achieve a better accuracy.
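One standard multiplier with mean zero, variance one and third moment one is a two-point distribution (often attributed to Mammen); using it here is our assumed choice, not prescribed by the text:

```python
import numpy as np

def wild_bootstrap_residuals(resid, seed=0):
    """Wild bootstrap disturbances: multiply each centred residual by an
    i.i.d. two-point variable with mean 0, variance 1 and third moment 1."""
    rng = np.random.default_rng(seed)
    resid = resid - resid.mean()
    a = (1.0 - np.sqrt(5.0)) / 2.0                   # ~ -0.618
    b = (1.0 + np.sqrt(5.0)) / 2.0                   # ~  1.618
    p = (np.sqrt(5.0) + 1.0) / (2.0 * np.sqrt(5.0))  # P(V = a) ~ 0.724
    v = np.where(rng.random(len(resid)) < p, a, b)
    return resid * v

# illustrative residuals; in model (2.7) these would be the fitted residuals
resid = np.random.default_rng(2).normal(size=100_000)
eps_star = wild_bootstrap_residuals(resid)
```

Because each multiplier is attached to a single residual, the conditional variance of each bootstrap error matches the squared residual at that design point, which is what makes the scheme robust to heteroscedasticity.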
We now consider resampling under the stronger assumption that not only the conditional mean but the whole conditional distribution of an observation depends smoothly on a finite number of the most recent observations. Resampling schemes that work under this smooth Markov assumption are the Markov bootstrap schemes.
We discuss the Markov bootstrap for a Markov model of finite order. We will describe two implementations of the Markov bootstrap. For both implementations one has to assume that the conditional distribution of an observation given the past values depends smoothly on those values. The first version was introduced by . It is based on a nonparametric estimate of the transition density. Using kernel density estimates of the marginal density and of the joint density of consecutive observations, one can estimate the transition density by their ratio.
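A sketch of this smoothed Markov bootstrap of order one with a Gaussian kernel. Sampling from the kernel estimate of the transition density amounts to picking a past state with kernel weights and perturbing its observed successor with kernel noise; the bandwidth, the starting value and the example data are our illustrative assumptions:

```python
import numpy as np

def markov_bootstrap_smoothed(x, h, seed=0):
    """Smoothed Markov bootstrap of order 1: each new value is drawn from
    the kernel estimate of the transition density given the current
    bootstrap state."""
    rng = np.random.default_rng(seed)
    past, future = x[:-1], x[1:]
    out = np.empty(len(x))
    out[0] = x[0]                        # start at the first observation
    for t in range(1, len(x)):
        # kernel weights of the past states, given the current state
        w = np.exp(-0.5 * ((out[t - 1] - past) / h) ** 2)
        j = rng.choice(len(past), p=w / w.sum())
        # draw from the kernel placed at the observed successor
        out[t] = future[j] + h * rng.normal()
    return out

# usage on a simulated AR(1) series (illustrative data and bandwidth)
rng = np.random.default_rng(4)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.6 * x[t - 1] + rng.normal()
x_star = markov_bootstrap_smoothed(x, h=0.5)
```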
The second version of the Markov bootstrap can be described as a limiting version of the first when the bandwidth in the direction of the new observation tends to zero. In this limiting case the bootstrap process takes values only in the set of observed values: given the current state, the next bootstrap observation equals the observed successor of a past state, chosen with kernel weights reflecting the closeness of that past state to the current one. This resampling scheme was introduced in [66,68]. Higher order properties are not yet known. It may be expected that it has similar asymptotic properties to the smoothed version of the Markov bootstrap. The unsmoothed version has the advantage that the bootstrap time series is forced to live on the observed values of the original time series. This leads to a more stable dynamic of the bootstrap time series, in particular for smaller sample sizes. Furthermore, for higher dimensional Markov processes the unsmoothed version requires kernel smoothing only in the dimension of the state space, whereas the smoothed bootstrap requires kernel smoothing in twice that dimension, for the joint density of consecutive states. Again, one can argue that this leads to a more stable finite sample performance of the unsmoothed bootstrap. On the other hand, the smoothed Markov bootstrap takes advantage of the smoothness of the transition density with respect to the new observation. For larger data sets this may lead to improvements, in case of smooth transition densities.
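The unsmoothed scheme keeps the kernel weights on the current state but copies an observed successor directly, so the resample stays on the observed values. Again a sketch, with bandwidth and example data as our illustrative assumptions:

```python
import numpy as np

def markov_bootstrap_unsmoothed(x, h, seed=0):
    """Unsmoothed (local) Markov bootstrap of order 1: given the current
    state, pick a past observation with kernel weight on the state and
    take its observed successor."""
    rng = np.random.default_rng(seed)
    past, future = x[:-1], x[1:]
    out = np.empty(len(x))
    out[0] = x[0]
    for t in range(1, len(x)):
        w = np.exp(-0.5 * ((out[t - 1] - past) / h) ** 2)
        j = rng.choice(len(past), p=w / w.sum())
        out[t] = future[j]               # successor of the matched state
    return out

rng = np.random.default_rng(4)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.6 * x[t - 1] + rng.normal()
x_star = markov_bootstrap_unsmoothed(x, h=0.5)
```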
A further class of resampling procedures operates in the frequency domain. For the periodogram it is known that its values at distinct Fourier frequencies are asymptotically independent. For the first two moments one gets that, in the interior of the frequency range, the periodogram is asymptotically unbiased for the spectral density and has asymptotic variance equal to the squared spectral density.
In this resampling, bootstrap values of the periodogram are generated. The resampling uses two estimates of the spectral density; in some implementations these estimates can be chosen identically. The first estimate is used for fitting multiplicative residuals, obtained by dividing the periodogram values by the estimated spectral density. The bootstrap residuals are drawn with replacement from the fitted residuals, standardized by their average. The bootstrap periodogram is then obtained by multiplying the second spectral density estimate with the bootstrap residuals.
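The scheme can be sketched as follows, with the two spectral density estimates passed in as arrays over the Fourier frequencies; their construction is left open, as in the text, and the white noise usage example is our illustrative assumption:

```python
import numpy as np

def frequency_domain_bootstrap(x, f_hat, g_hat, seed=0):
    """Frequency domain bootstrap sketch: fit multiplicative periodogram
    residuals with f_hat, resample them i.i.d. after standardizing by
    their average, and multiply by g_hat."""
    rng = np.random.default_rng(seed)
    n = len(x)
    m = (n - 1) // 2
    # periodogram at the Fourier frequencies 2*pi*j/n, j = 1..m
    I = np.abs(np.fft.fft(x)[1:m + 1]) ** 2 / (2 * np.pi * n)
    eps = I / f_hat                     # fitted multiplicative residuals
    eps = eps / eps.mean()              # standardize: residuals average to one
    eps_star = rng.choice(eps, size=m)  # i.i.d. resampling with replacement
    return g_hat * eps_star             # bootstrap periodogram

# usage with a flat (white noise) spectral density as both estimates
x = np.random.default_rng(3).normal(size=200)
f_flat = np.full((200 - 1) // 2, 1.0 / (2.0 * np.pi))
I_star = frequency_domain_bootstrap(x, f_flat, f_flat)
```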
The frequency domain bootstrap can be used to estimate the distribution of statistics that are weighted averages of the periodogram. The distribution of such a statistic is then estimated by the conditional distribution of its bootstrap counterpart. Unfortunately, in general this approach does not work. This can easily be seen by comparing the asymptotic variances of the statistics: the asymptotic variance of the original statistic contains a second term that the bootstrap does not reproduce.
Although the frequency domain bootstrap does not work in general, there exist three important examples where it works. In all three examples the second term in the asymptotic expansion of the variance vanishes. This happens e.g. if the kurtosis of the innovations is equal to zero, as for Gaussian innovations.
There also exists another example where the frequency domain bootstrap works. Nonparametric smoothing estimates of the spectral density are linear statistics of the periodogram where the weights are local; for kernel smoothing, the weight of a periodogram value is given by a kernel function, rescaled by a bandwidth, applied to the distance between the corresponding Fourier frequency and the point of estimation. For such local statistics the second term in the variance expansion is of lower order. Thus both the variance of the original spectral density estimate and the variance of the bootstrap spectral density estimate agree, up to higher order terms, with the same leading quantity. The correlation between periodogram values at different frequencies (see (2.11)) only contributes to higher order terms.  first observed this relation and used it to show that the frequency domain bootstrap works for nonparametric spectral density estimation. In their approach, both spectral density estimates are nonparametric kernel smoothing estimates, and the estimate used to generate the bootstrap periodogram is computed with a bandwidth of larger order than the bandwidth of the estimate under study. Then the bootstrap consistently estimates the bias of the spectral density estimate. Similar approaches have been used in bootstrap schemes for other settings of nonparametric curve estimation, see . For the frequency domain bootstrap for parametric problems one can choose the two estimates identical, see .
We have now discussed a large class of resampling schemes for dependent data. They are designed for different assumptions on the dependence structure, ranging from quite general stationarity assumptions (subsampling), mixing conditions (block bootstrap) and linearity assumptions (sieve bootstrap, frequency domain bootstrap) to a conditional mean Markov property (wild bootstrap), Markov properties (Markov bootstrap) and autoregressive structure (autoregressive bootstrap). It may be generally conjectured that resampling schemes for more restrictive models are more accurate as long as these more restrictive assumptions really apply. This conjecture is supported by asymptotic results based on higher order Edgeworth expansions. (These results should be interpreted with care, however, because of the poor performance of higher order Edgeworth expansions for finite samples; see also the discussion in the introduction.) The situation is further complicated by the fact that in time series analysis models are typically used as approximations to the truth and are not interpreted as true models. Thus one has to study the much more difficult question of how resampling schemes perform if the underlying assumptions are only approximately fulfilled.
Resampling for dependent data has stimulated very creative ideas and discussions and has led to a large range of different approaches. In part, the resampling structure is quite different from the stochastic structure of the original time series: in the regression bootstrap, regression data are used instead of autoregression series, and in the sieve bootstrap and the frequency domain bootstrap, models are used that only approximate the original model.
For dependent data the bootstrap has broadened the field of possible statistical applications. The bootstrap offered new ways of implementing statistical procedures and made it possible to treat new types of applied problems by statistical inference.
The discussion of the bootstrap for dependent data is not yet finished. A complete understanding of how the proposed resampling schemes compare is still missing, and theoretical research is going on. Applications of time series analysis will also require new approaches. Examples are unit root tests, cointegration analysis and the modeling of financial time series.