An adaptive estimation algorithm for time series is presented in this chapter. The basic idea is the following: given a time series and a linear model, we select on-line the largest sample of the most recent observations for which the model is not rejected. Assume, for example, that the data can be well fitted by a regression, an autoregression or even by a constant on some unknown interval. The main problem is then to detect the time interval where the model approximately holds. We call such an interval an interval of time homogeneity.
This approach appears to be suitable in financial econometrics, where an on-line analysis of large data sets, as e.g. in backtesting, has to be performed. In this case, as soon as a new observation becomes available, the model is checked, the sample size is optimally adapted and a revised forecast is produced.
In the remainder of the chapter we briefly present the theoretical foundations of the proposed algorithm, which are due to Liptser and Spokoiny (1999), and we describe its implementation. Then we provide two applications to financial data. In the first one we estimate the possibly time-varying coefficients of an exchange rate basket, while in the second one the volatility of an exchange rate time series is fitted to a locally constant model. The main references can be found in Härdle et al. (2001), Mercurio and Spokoiny (2000), Härdle et al. (2000) and Mercurio and Torricelli (2001).
Let us consider the following linear regression equation:

$$ Y_t = X_t^\top \theta_t + \sigma \varepsilon_t, \qquad t = 1, \ldots, T, \tag{15.1} $$

where $Y_t$ is the observed scalar process, $X_t \in \mathbb{R}^p$ is the vector of regressors, $\theta_t \in \mathbb{R}^p$ is the coefficient process and the $\varepsilon_t$ are independent standard Gaussian errors. If the coefficient vector is constant, $\theta_t \equiv \theta$, on an interval $I$, it can be estimated by least squares from the observations in $I$:

$$ \tilde\theta_I = \Big( \sum_{t \in I} X_t X_t^\top \Big)^{-1} \sum_{t \in I} X_t Y_t, \tag{15.2} $$

with conditional covariance matrix

$$ V_I = \sigma^2 \Big( \sum_{t \in I} X_t X_t^\top \Big)^{-1}. \tag{15.3} $$

Writing $v_I$ for the corresponding conditional standard deviation, the results of Liptser and Spokoiny (1999) yield an exponential bound for the estimation error of the form

$$ P\big( |\tilde\theta_I - \theta| > \lambda\, v_I \big) \le p(\lambda)\, e^{-\lambda^2/2}, \tag{15.4} $$

where the factor $p(\lambda)$ grows at most polynomially in $\lambda$. The bound holds for every fixed $\lambda$ and, importantly, also when the regressors $X_t$ are stochastic.
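As an illustration, a minimal sketch of the least squares estimate on a candidate interval, together with its conditional covariance, might look as follows (the function name and the data-generating setup are hypothetical, not taken from the chapter):

```python
import numpy as np

def ls_on_interval(X, Y, interval, sigma):
    """Least squares estimate on an interval and its conditional covariance:
    theta = (sum X_t X_t')^{-1} sum X_t Y_t,  V = sigma^2 (sum X_t X_t')^{-1}."""
    idx = np.asarray(interval)
    Xi, Yi = X[idx], Y[idx]                  # observations inside the interval
    G = Xi.T @ Xi                            # Gram matrix  sum_t X_t X_t'
    theta = np.linalg.solve(G, Xi.T @ Yi)    # least squares estimate
    V = sigma ** 2 * np.linalg.inv(G)        # conditional covariance matrix
    return theta, V

# Usage: with constant coefficients the estimate recovers them up to noise.
rng = np.random.default_rng(0)
T, p = 500, 3
X = rng.normal(size=(T, p))
theta_true = np.array([1.0, -0.5, 2.0])
Y = X @ theta_true + 0.1 * rng.normal(size=T)
theta_hat, V = ls_on_interval(X, Y, range(T), sigma=0.1)
```

The square roots of the diagonal of `V` play the role of the conditional standard deviations used by the testing procedure below.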
We now describe how the bound (15.4) can be used to estimate the coefficients in the regression equation (15.1) when the regressors are (possibly) stochastic and the coefficients are not constant but follow a jump process. The procedure does not require an explicit expression for the law of the process $\theta_t$; it only assumes that $\theta_t$ is constant on some unknown time interval $I = [\tau - m, \tau]$. Such an interval is referred to as an interval of time homogeneity, and a model which is constant only on some time interval is called locally time homogeneous.
Let us now define some notation. The expression $\hat\theta_\tau$ denotes the (filtering) estimator of the process $\theta_t$ at time $\tau$; that is to say, the estimator which uses only observations up to time $\tau$. For example, if $\theta_t$ is constant, one can use the recursive least squares estimator based on all observations up to time $\tau$.
Suppose that $I = [\tau - m, \tau]$ is an interval-candidate, that is, we expect time homogeneity in $I$ and hence in every subinterval $J \subset I$. This implies that the mean values of the estimates $\tilde\theta_I$ and $\tilde\theta_J$ nearly coincide. Furthermore, we know on the basis of the bound (15.4) that, with high probability, each of these estimates deviates from the common value $\theta$ by no more than a multiple of its conditional standard deviation; large observed differences between $\tilde\theta_I$ and $\tilde\theta_J$ therefore speak against the homogeneity of $I$.
Now we present a formal description. Suppose that a family $\mathcal{I}$ of interval candidates $I$ is fixed. Each of them is of the form $I = [\tau - m, \tau]$, so that the set $\mathcal{I}$ is ordered by the length $m$. With every such interval we associate an estimate $\tilde\theta_I$ of the parameter $\theta$ and the corresponding conditional standard deviation $v_I$. Next, for every interval $I$ from $\mathcal{I}$, we suppose to be given a set $\mathcal{J}(I)$ of testing subintervals $J$. For every $J \in \mathcal{J}(I)$, we construct the corresponding estimate $\tilde\theta_J$ from the observations for $t \in J$ and compute $v_J$. Now, with two constants $\lambda$ and $\mu$, define the adaptive choice of the interval of homogeneity by the following iterative procedure: starting from the smallest candidate, accept the interval $I$ if every testing subinterval $J \in \mathcal{J}(I)$ satisfies

$$ |\tilde\theta_I - \tilde\theta_J| \le \lambda\, v_I + \mu\, v_J, \tag{15.6} $$

and proceed to the next larger candidate; otherwise stop. The adaptive interval $\hat I$ is the largest accepted candidate, and the adaptive estimate is $\hat\theta_\tau = \tilde\theta_{\hat I}$.
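The iterative selection can be sketched in code. The acceptance test below, which compares the estimate on each testing subinterval with the estimate on the full candidate against the threshold $\lambda v_I + \mu v_J$, is one plausible reading of the rule; for brevity only the subintervals sharing the right end point are tested, and all names are illustrative:

```python
import numpy as np

def adaptive_interval(X, Y, sigma, tau, m0, K, lam, mu):
    """Adaptive choice of the interval of homogeneity (sketch).
    Candidates are I_k = [tau - k*m0, tau); I_k is rejected as soon as
    |theta_I - theta_J| > lam*v_I + mu*v_J for some tested subinterval J."""
    def fit(lo, hi):
        Xi, Yi = X[lo:hi], Y[lo:hi]
        G = Xi.T @ Xi
        theta = np.linalg.solve(G, Xi.T @ Yi)
        v = sigma * np.sqrt(np.diag(np.linalg.inv(G)))  # conditional std. dev.
        return theta, v

    accepted = 1                                   # smallest candidate interval
    for k in range(2, K + 1):
        lo = tau - k * m0
        if lo < 0:
            break
        th_I, v_I = fit(lo, tau)
        ok = True
        for kp in range(1, k):                     # right-anchored subintervals
            th_J, v_J = fit(tau - kp * m0, tau)
            if np.any(np.abs(th_I - th_J) > lam * v_I + mu * v_J):
                ok = False                         # homogeneity rejected
                break
        if not ok:
            break
        accepted = k
    lo = tau - accepted * m0
    return lo, fit(lo, tau)[0]                     # interval start, estimate

# Usage: a single coefficient jump at t = 400 should stop the enlargement
# of the interval roughly at the change point.
rng = np.random.default_rng(1)
T, m0 = 600, 20
X = rng.normal(size=(T, 2))
theta = np.where(np.arange(T)[:, None] < 400, [0.0, 0.0], [2.0, -2.0])
Y = np.sum(X * theta, axis=1) + 0.1 * rng.normal(size=T)
lo, theta_hat = adaptive_interval(X, Y, 0.1, tau=T, m0=m0, K=25, lam=3.0, mu=3.0)
```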
As for the variance estimation, note that the previously described procedure requires the knowledge of the variance $\sigma^2$ of the errors. In practical applications, $\sigma^2$ is typically unknown and has to be estimated from the data. The regression representation (15.1) and local time homogeneity suggest applying a residual-based estimator. Given an interval $I$, we construct the parameter estimate $\tilde\theta_I$. Next the pseudo-residuals are defined as $\hat\varepsilon_t = Y_t - X_t^\top \tilde\theta_I$ for $t \in I$. Finally the variance estimator is defined by averaging the squared pseudo-residuals:

$$ \hat\sigma^2 = \frac{1}{|I|} \sum_{t \in I} \hat\varepsilon_t^2. $$
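A residual-based variance estimate of this kind is short to write down; the sketch below (illustrative names) averages the squared pseudo-residuals over the interval:

```python
import numpy as np

def variance_estimate(X, Y, theta_I, interval):
    """Average of squared pseudo-residuals eps_t = Y_t - X_t' theta_I
    over the observations in `interval`."""
    idx = np.asarray(interval)
    resid = Y[idx] - X[idx] @ theta_I        # pseudo-residuals
    return np.mean(resid ** 2)               # variance estimate

# Usage: with the true coefficients the estimate recovers sigma^2 = 0.25.
rng = np.random.default_rng(2)
T = 2000
X = rng.normal(size=(T, 2))
theta = np.array([1.0, -1.0])
Y = X @ theta + 0.5 * rng.normal(size=T)
s2 = variance_estimate(X, Y, theta, range(T))
```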
The performance of the adaptive estimator is evaluated with data simulated from the regression model (15.1), with coefficients that follow a jump process.
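The chapter's exact simulation design is not reproduced here; a generic generator for model (15.1) with piecewise constant (jump) coefficients could look like this (all names and the jump specification are illustrative):

```python
import numpy as np

def simulate_jump_regression(T, jumps, sigma, rng):
    """Simulate Y_t = X_t' theta_t + sigma*eps_t where theta_t is piecewise
    constant: `jumps` maps each change point to the new coefficient vector."""
    times = sorted(jumps)
    p = len(jumps[times[0]])
    theta = np.empty((T, p))
    for t0 in times:                  # later segments overwrite earlier ones
        theta[t0:] = jumps[t0]
    X = rng.normal(size=(T, p))
    Y = np.sum(X * theta, axis=1) + sigma * rng.normal(size=T)
    return X, Y, theta

# Usage: two regimes, with a single jump at t = 50.
X, Y, theta_path = simulate_jump_regression(
    100, {0: [1.0, 0.0], 50: [0.0, 1.0]}, sigma=0.1,
    rng=np.random.default_rng(0))
```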
In order to implement the procedure we need two parameters, $\lambda$ and $\mu$, and two sets of intervals: the interval candidates $\mathcal{I}$ and the testing subintervals $\mathcal{J}(I)$. As far as the latter are concerned, the simplest proposal is to use a regular grid of points $t_k = k m_0$ for some integer $m_0$, with $\tau$ belonging to the grid. We next consider the intervals $I_k = [\tau - k m_0, \tau]$ for all $k$ such that the interval lies within the observation period. Every interval $I_k$ contains exactly $k - 1$ smaller intervals $I_{k'}$ with $k' < k$. For every interval $I_k$ we then define the set $\mathcal{J}(I_k)$ of testing subintervals by taking all smaller grid intervals with right end point $\tau$, i.e. $[\tau - k' m_0, \tau]$ for $k' < k$, and all smaller grid intervals with left end point $\tau - k m_0$, i.e. $[\tau - k m_0, \tau - k' m_0]$ for $k' < k$.
We are now left with the choice of three parameters: $m_0$, $\lambda$ and $\mu$. These parameters act as the smoothing parameters of classical nonparametric estimation. The value of $m_0$ determines the number of points at which time homogeneity is tested, and it defines the minimal delay after which a jump can be discovered. Simulation results have shown that small changes of $m_0$ do not essentially affect the results of the estimation and, depending on the number of parameters to be estimated, it can be set between 10 and 50.
The choice of $\lambda$ and $\mu$ is more critical, because these parameters determine the acceptance or the rejection of the interval of time homogeneity, as can be seen from equation (15.6). Large values of $\lambda$ and $\mu$ reduce the sensitivity of the algorithm and may delay the detection of a change point, while small values make the procedure more sensitive to small changes in the values of the estimated parameters and may increase the probability of a type-I error. For the simulation, $m_0$, $\lambda$ and $\mu$ were set to fixed values, while a rule for the selection of $\lambda$ and $\mu$ in real applications will be discussed in the next section.
Figure 15.2 shows the results of the simulation. The true values of the coefficients are plotted (first coefficient: first row, second coefficient: second row, third coefficient: third row) along with the median, the maximum and the minimum of the estimates from all realizations for each model at each time point.