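Let $Z_t$, $-\infty < t < \infty$, be a strictly stationary
time series, as defined in Definition 10.6, that is, the
distribution of the data and its probability structure do not
change over time. In particular, every single observation $Z_t$
has the same distribution function $F$. For comparison, consider
i.i.d. random variables $X_1, X_2, \ldots$ with the same
distribution function $F$. Let
$M_n = \max\{Z_1, \ldots, Z_n\}$ and $M_n^x = \max\{X_1, \ldots, X_n\}$
be the maxima of $n$ values from the time series and of $n$
independent observations, respectively. A
simple but basic relationship in the previous sections was
(17.1), i.e.,
$$P(M_n^x \le y) = \{P(X_1 \le y)\}^n = F^n(y),$$
which relies on the independence of the $X_t$. It implies that, for a
sequence of thresholds $u_n$, a constant $\tau > 0$ and $\bar F = 1 - F$,
$$n \bar F(u_n) \to \tau \quad \text{if and only if} \quad P(M_n^x \le u_n) \to e^{-\tau}.$$
The time series $Z_t$ is said to have the extremal index
$\delta \in [0,1]$ if, for such $\tau$ and $u_n$,
$P(M_n \le u_n) \to e^{-\delta\tau}$, so that
$P(M_n \le y) \approx F^{n\delta}(y)$ for large $n$.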
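The limit for independent data can be checked numerically. The following is a minimal sketch, assuming a threshold $u_n$ chosen so that $n\{1 - F(u_n)\} = \tau$ holds exactly; the function name `prob_iid_max_below` and the value of `tau` are illustrative and not part of the text.

```python
import math


def prob_iid_max_below(n: int, tau: float) -> float:
    """P(max of n i.i.d. observations <= u_n) when n * (1 - F(u_n)) = tau."""
    # Independence gives P(M_n^x <= u_n) = F(u_n)^n = (1 - tau / n)^n.
    return (1.0 - tau / n) ** n


tau = 2.0  # illustrative value
for n in (10, 100, 10_000):
    # The middle column approaches exp(-tau) as n grows.
    print(n, prob_iid_max_below(n, tau), math.exp(-tau))
```

For a dependent series with extremal index $\delta$, the corresponding limit would be $e^{-\delta\tau}$ instead of $e^{-\tau}$.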
Pure white noise automatically has the extremal index $\delta = 1$,
since the $Z_t$ are independent in this case. It is less obvious that all
ARMA($p$, $q$) processes (see Chapter 11) with normally
distributed innovations also have the extremal index $\delta = 1$;
their maxima thus behave like maxima of independent data.
Intuitively, this is because, on the one hand, ARMA
processes have an exponentially decaying
memory, i.e., the observations $Z_t$ and $Z_{t+h}$ are practically
independent for sufficiently large time lags $h$,
and, on the other hand, the probability of two extreme
observations occurring within the same, not too long, time interval
is small. These qualitative statements can be formulated
as two precise criteria for a time series to have an extremal index
of 1; their exact formulation will not be given here.
For financial time series models the second condition is not
fulfilled, because it contradicts the presence of volatility
clusters (see Chapter 12), i.e., the locally increased frequency
of extreme observations. The extremal index of an ARCH(1)
process with parameters $\omega, \alpha$
(see Definition 12.1) is, for example, always
$\delta = \delta(\alpha) < 1$. It can be approximated numerically; for
$\alpha = 0.5$, for example, $\delta \approx 0.835$.
Finally, note that not every time series has an extremal index. A
simple counterexample is $Z_t = A \cdot X_t$ with i.i.d. random
variables $X_t$ that are multiplied by a random factor $A > 0$
independent of the $X_t$. Since the factor $A$ is contained in
all observations, even those in the most distant past, this time series
has no decaying memory. If the distribution of $X_t$ has slowly
decaying tails, i.e., it belongs to the MDA of a Fréchet
distribution, then it can be shown that $Z_t$ cannot have an
extremal index.
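The extreme value theory for time series is still developing. The Fisher-Tippett theorem, however, holds as a central result in the following modified form. Let $Z_t$ be a strictly stationary time series with distribution function $F$ and extremal index $\delta > 0$, let $M_n$ and $M_n^x$ be the maxima of $n$ values of the time series and of $n$ i.i.d. observations with the same distribution function $F$, and let $G_\gamma$ be a general extreme value distribution. Then, for standardizing sequences $c_n > 0$ and $d_n$,
$$P\left(\frac{M_n^x - d_n}{c_n} \le x\right) \to G_\gamma(x)$$
if and only if
$$P\left(\frac{M_n - d_n}{c_n} \le x\right) \to G_\gamma^\delta(x)$$
for all $x$ with $0 < G_\gamma(x) < 1$.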
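The maxima of the time series are standardized by the same sequences
$c_n, d_n$ and converge in distribution to the same type of
asymptotic distribution as the maxima of the corresponding
independent data, since $G_\gamma^\delta$ is itself a general
extreme value distribution with the same shape parameter as
$G_\gamma$. For example, for $\gamma > 0$ it holds that
$$G_\gamma^\delta(x) = \exp\{-\delta (1 + \gamma x)^{-1/\gamma}\} = G_\gamma\left(\frac{x - \mu}{\sigma}\right), \quad 1 + \gamma x > 0,$$
with $\sigma = \delta^\gamma$ and $\mu = -(1 - \delta^\gamma)/\gamma$, i.e., the two distributions differ only in their location and scale parameters.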
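That raising a general extreme value distribution to the power $\delta$ only shifts and rescales it can be verified numerically. The sketch below assumes $\gamma > 0$ and the parametrization $G_\gamma(x) = \exp\{-(1 + \gamma x)^{-1/\gamma}\}$, with scale $\sigma = \delta^\gamma$ and location $\mu = -(1 - \delta^\gamma)/\gamma$. The function names and the numerical values of `gamma` and `delta` are illustrative choices.

```python
import math


def gev_cdf(x: float, gamma: float) -> float:
    """G_gamma(x) = exp(-(1 + gamma * x)^(-1 / gamma)) for gamma > 0."""
    z = 1.0 + gamma * x
    if z <= 0.0:
        return 0.0  # x lies below the left endpoint of the distribution
    return math.exp(-(z ** (-1.0 / gamma)))


def rescaled_params(delta: float, gamma: float) -> tuple[float, float]:
    """Location mu and scale sigma with G^delta(x) = G((x - mu) / sigma)."""
    sigma = delta ** gamma
    mu = -(1.0 - sigma) / gamma
    return mu, sigma


gamma, delta = 0.5, 0.5  # illustrative values only
mu, sigma = rescaled_params(delta, gamma)
for x in (-0.5, 0.0, 1.0, 5.0):
    lhs = gev_cdf(x, gamma) ** delta
    rhs = gev_cdf((x - mu) / sigma, gamma)
    print(x, lhs, rhs)  # the two columns agree up to rounding error
```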
Many of the techniques of extreme value statistics that were
developed for independent data can also be used on time series. To do
this, however, one needs more data, because the effective
sample size is only $n\delta$ instead of $n$. Besides that,
additional problems appear: the POT method is in theory
still applicable, but the excesses are no longer independent,
especially when a financial time series with volatility clusters
is considered. For this reason the parameters of the generalized
Pareto distribution, with which the excess distribution is
approximated, cannot be estimated by simply maximizing
the likelihood function for independent data. One way out
is either to use special model assumptions, under which the
likelihood function of the dependent excesses can be calculated,
or to use a reduction technique, with which the data are made
more "independent" at the cost of the sample size. One
such procedure, for example, replaces each cluster of neighboring
excesses with the maximum value from the cluster, where the
cluster size is chosen so that the sample size of the excesses is
approximately reduced by the factor $\delta$. Afterwards the POT
estimators, which were developed for independent data, can be
calculated from the reduced excesses.
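The reduction step can be sketched as follows. The text does not specify how clusters are delimited, so this sketch assumes a simple gap rule: exceedances of the threshold whose time indices differ by at most `max_gap` belong to the same cluster. The function name, the gap rule and the sample data are hypothetical illustrations.

```python
def decluster_excesses(z, u, max_gap):
    """Return one excess (cluster maximum minus u) per cluster of exceedances.

    Assumption: two exceedances of u belong to the same cluster when their
    time indices differ by at most max_gap.
    """
    cluster_maxima = []
    last_index = None  # time index of the most recent exceedance
    for t, value in enumerate(z):
        if value <= u:
            continue
        if last_index is not None and t - last_index <= max_gap:
            # Same cluster: keep only the largest value seen so far.
            cluster_maxima[-1] = max(cluster_maxima[-1], value)
        else:
            # Gap too long (or first exceedance): start a new cluster.
            cluster_maxima.append(value)
        last_index = t
    return [m - u for m in cluster_maxima]


# Made-up data: six exceedances of u = 2.0 grouped into three clusters.
z = [0.1, 2.5, 3.0, 0.2, 0.3, 0.1, 2.2, 0.4, 0.0, 0.1, 4.0, 3.5, 2.1]
reduced = decluster_excesses(z, u=2.0, max_gap=2)
print(len(reduced), reduced)
```

In this toy example the number of excesses drops from six to three. In practice `max_gap` would be tuned so that this reduction ratio is close to the estimated extremal index.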
Another problem is that the extremal index has to be estimated
before procedures like the one just described can be applied.
Several estimation techniques are described in the literature.
We introduce only one here, which can be
described without much technical preparation: the so-called
block method. First, the time series data $Z_1, \ldots, Z_n$
are divided into $b$ blocks, each of length $l$
(so that $n = bl$, with $b$ and $l$ large). Let $M_l^{(k)}$ be the
maximum of the observations in the $k$-th block:
$$M_l^{(k)} = \max(Z_{(k-1)l+1}, \ldots, Z_{kl}), \quad k = 1, \ldots, b.$$
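Splitting the series into consecutive blocks and taking the maximum of each block can be sketched in a few lines. The names `z` (the observed series) and `l` (the block length) are illustrative. Dropping a trailing incomplete block is an assumption made here for the case where the series length is not a multiple of the block length.

```python
def block_maxima(z, l):
    """Split z into consecutive blocks of length l and return each block's maximum."""
    b = len(z) // l  # number of complete blocks; a trailing partial block is dropped
    # Block k (1-based) covers the 0-based slice z[(k - 1) * l : k * l].
    return [max(z[(k - 1) * l : k * l]) for k in range(1, b + 1)]


z = [1, 5, 2, 7, 3, 4, 9]  # made-up data, seven observations
print(block_maxima(z, 3))  # two complete blocks: [1, 5, 2] and [7, 3, 4]
```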