The basic model we consider parameterises the conditional variance of the time series through the first lag of the squared past perturbations. For the sake of simplicity, we assume that the time series has no structure in the mean, is conditionally Gaussian and, furthermore, that the conditional variance is time dependent:

$$\varepsilon_t \mid \mathcal{F}_{t-1} \sim N(0, \sigma_t^2), \qquad
\sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2, \qquad
\alpha_0 > 0, \ \alpha_1 \ge 0, \tag{6.1}$$

where $\mathcal{F}_{t-1}$ denotes the information set available at time $t-1$.
This basic ARCH(1) process can be formulated as a linear model on the squared perturbations. Let $\nu_t = \varepsilon_t^2 - \sigma_t^2$, so that the squared error can be written as

$$\varepsilon_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \nu_t,$$

where $\nu_t$ is a zero-mean, serially uncorrelated innovation; that is, the squared perturbations follow an AR(1) model.
This ARCH process can also serve as the innovation model of several other linear models (ARMA models, regression models, etc.).
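To make the AR(1) representation of the squared perturbations above concrete, the following sketch (not taken from the text; the parameter values $\alpha_0 = 0.2$, $\alpha_1 = 0.5$ are arbitrary) simulates an ARCH(1) series and regresses $\varepsilon_t^2$ on $\varepsilon_{t-1}^2$; the OLS coefficients are roughly $(\alpha_0, \alpha_1)$.

```python
# A minimal sketch (not from the text): simulate an ARCH(1) process with
# arbitrary parameters alpha0 = 0.2, alpha1 = 0.5 and check the AR(1)
# representation of the squared perturbations by OLS.
import numpy as np

alpha0, alpha1, n = 0.2, 0.5, 50_000
rng = np.random.default_rng(0)
eps = np.zeros(n)
s2 = alpha0 / (1.0 - alpha1)                # start at the unconditional variance
for t in range(n):
    eps[t] = np.sqrt(s2) * rng.standard_normal()
    s2 = alpha0 + alpha1 * eps[t] ** 2      # sigma^2 for the next observation

# Regress eps_t^2 on a constant and eps_{t-1}^2: the coefficients are roughly
# (alpha0, alpha1), since nu_t = eps_t^2 - sigma_t^2 is a zero-mean innovation.
X = np.column_stack([np.ones(n - 1), eps[:-1] ** 2])
coef, *_ = np.linalg.lstsq(X, eps[1:] ** 2, rcond=None)
print(coef)                                  # approximately [0.2, 0.5]
```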
The derivation of the unconditional moments of the ARCH(1) process
is possible through extensive use of the law of iterated
expectations on conditional distributions.
The following expressions are then satisfied:

$$E(\varepsilon_t) = 0,$$

$$\operatorname{Var}(\varepsilon_t) = \sigma^2 = \frac{\alpha_0}{1-\alpha_1}, \qquad 0 \le \alpha_1 < 1,$$

$$\operatorname{Cov}(\varepsilon_t, \varepsilon_s) = 0, \qquad t \ne s.$$
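For instance, the unconditional variance can be obtained as follows (a sketch of the standard argument in the notation above, using stationarity in variance in the second line):

$$\begin{aligned}
E(\varepsilon_t^2) &= E\{E(\varepsilon_t^2 \mid \mathcal{F}_{t-1})\}
                    = E(\sigma_t^2)
                    = \alpha_0 + \alpha_1 E(\varepsilon_{t-1}^2),\\
\sigma^2 &= \alpha_0 + \alpha_1 \sigma^2
\;\Longrightarrow\;
\sigma^2 = \frac{\alpha_0}{1-\alpha_1}, \qquad 0 \le \alpha_1 < 1.
\end{aligned}$$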
The difference between the conditional and the unconditional variance is a simple function of the deviation of the squared innovations from their mean. Let $\sigma^2 = \alpha_0/(1-\alpha_1)$ denote the unconditional variance in the ARCH(1) model with $\alpha_1 < 1$, so that

$$\sigma_t^2 - \sigma^2 = \alpha_1\left(\varepsilon_{t-1}^2 - \sigma^2\right).$$

Then the variance of the current error $\varepsilon_t$, conditioned on the realised values of the lagged errors $\varepsilon_{t-1}$, is an increasing function of the magnitude of the lagged errors, irrespective of their signs. Hence, large errors of either sign tend to be followed by a large error of either sign and, similarly, small errors of either sign tend to be followed by a small error of either sign.
The nature of the unconditional density of an ARCH(1) process can be analysed through its higher-order moments. Indeed, conditional normality implies

$$E(\varepsilon_t^4 \mid \mathcal{F}_{t-1}) = 3\sigma_t^4.$$

Applying once again the law of iterated expectations, we have

$$E(\varepsilon_t^4) = E\{E(\varepsilon_t^4 \mid \mathcal{F}_{t-1})\} = 3E(\sigma_t^4) = 3E\left(\alpha_0 + \alpha_1\varepsilon_{t-1}^2\right)^2.$$

Assuming that the process is stationary both in variance and in the fourth moment, if $3\alpha_1^2 < 1$,

$$E(\varepsilon_t^4) = \frac{3\alpha_0^2(1+\alpha_1)}{(1-\alpha_1)(1-3\alpha_1^2)}.$$

Simple algebra then reveals that the kurtosis is

$$\kappa = \frac{E(\varepsilon_t^4)}{\{E(\varepsilon_t^2)\}^2} = 3\,\frac{1-\alpha_1^2}{1-3\alpha_1^2},$$

which exceeds 3 whenever $\alpha_1 > 0$. Hence, the unconditional distribution of $\varepsilon_t$ is leptokurtic. That is to say, the ARCH(1) process has tails heavier than the normal distribution.
This property makes the ARCH process attractive because the
distributions of asset returns frequently display tails heavier
than the normal distribution.
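As a purely illustrative numerical example (the value $\alpha_1 = 0.5$ is not taken from the text), the kurtosis formula above gives

$$\kappa = 3\,\frac{1-\alpha_1^2}{1-3\alpha_1^2} = 3\cdot\frac{1-0.25}{1-0.75} = 9 \qquad \text{for } \alpha_1 = 0.5,$$

three times the kurtosis of the normal distribution, even though every conditional distribution is exactly Gaussian.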
The quantlet XEGarch05 generates an ARCH(1) series with unconditional variance equal to 1 and obtains the basic descriptive statistics.
```
[ 1,] " "
[ 2,] "========================================================="
[ 3,] " Variable 1"
[ 4,] "========================================================="
[ 5,] " "
[ 6,] " Mean 0.0675013"
[ 7,] " Std.Error 0.987465 Variance 0.975087"
[ 8,] " "
[ 9,] " Minimum -4.59634 Maximum 4.19141"
[10,] " Range 8.78775"
[11,] " "
[12,] " Lowest cases Highest cases "
[13,] " 278: -4.59634 49: 2.69931"
[14,] " 383: -3.34884 442: 2.76556"
[15,] " 400: -3.33363 399: 3.69674"
[16,] " 226: -3.2339 279: 4.17015"
[17,] " 40: -2.82524 287: 4.19141"
[18,] " "
[19,] " Median 0.0871746"
[20,] " 25% Quartile -0.506585 75% Quartile 0.675945"
[21,] " "
[22,] " Skewness -0.123027 Kurtosis 8.53126"
[23,] " "
[24,] " Observations 500"
[25,] " Distinct observations 500"
[26,] " "
[27,] " Total number of {-Inf,Inf,NaN} 0"
[28,] " "
[29,] "========================================================="
[30,] " "
```
We can see in the corresponding output that the unconditional standard deviation is not far from one. However, the kurtosis is higher and the range wider than we would expect from a standardised Gaussian white noise model.
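XEGarch05 itself is an XploRe quantlet and is not reproduced here; the following Python sketch runs a comparable experiment under the assumption $\alpha_1 = 0.5$, with $\alpha_0 = 1 - \alpha_1$ so that the unconditional variance $\alpha_0/(1-\alpha_1)$ equals one.

```python
# A rough Python analogue of the XEGarch05 experiment (the quantlet itself is
# XploRe code): simulate 500 observations of an ARCH(1) process with
# unconditional variance 1 and print basic descriptive statistics.
# alpha1 = 0.5 is an assumed value; alpha0 = 1 - alpha1 gives unit variance.
import numpy as np
from scipy import stats

alpha0, alpha1, n = 0.5, 0.5, 500
rng = np.random.default_rng(42)
eps = np.zeros(n)
s2 = 1.0                                    # the unconditional variance
for t in range(n):
    eps[t] = np.sqrt(s2) * rng.standard_normal()
    s2 = alpha0 + alpha1 * eps[t] ** 2

print("Mean     ", eps.mean())
print("Std.Error", eps.std(ddof=1))
print("Skewness ", stats.skew(eps))
print("Kurtosis ", stats.kurtosis(eps, fisher=False))   # Gaussian value is 3
```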
The process $\{\varepsilon_t\}_{t=1}^{n}$ is generated by the ARCH(1) model described in equations (6.1), where $n$ is the total sample size. Although the process defined by (6.1) has all observations conditionally normally distributed, the vector of observations is not jointly normal. Therefore, conditioned on an initial observation, the joint density function can be written as

$$f(\varepsilon_2, \ldots, \varepsilon_n \mid \varepsilon_1) = \prod_{t=2}^{n} f(\varepsilon_t \mid \mathcal{F}_{t-1}). \tag{6.4}$$
Using this result, and ignoring a constant factor, the log-likelihood function $\ell(\alpha_0, \alpha_1)$ for a sample of size $n$ is

$$\ell(\alpha_0, \alpha_1) = \sum_{t=2}^{n} \ell_t = -\frac{1}{2}\sum_{t=2}^{n} \log \sigma_t^2 - \frac{1}{2}\sum_{t=2}^{n} \frac{\varepsilon_t^2}{\sigma_t^2}. \tag{6.5}$$
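A minimal sketch of the log-likelihood (6.5) as a Python function, assuming the observed series is held in a NumPy array `eps` and the constant term is dropped:

```python
# A sketch of the conditional log-likelihood (6.5), up to an additive constant,
# for an observed series eps (NumPy array); parameter names are illustrative.
import numpy as np

def arch1_loglik(alpha0, alpha1, eps):
    sigma2 = alpha0 + alpha1 * eps[:-1] ** 2          # sigma_t^2 for t = 2, ..., n
    return -0.5 * np.sum(np.log(sigma2) + eps[1:] ** 2 / sigma2)
```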
The first order conditions to obtain the maximum likelihood estimator are

$$\frac{\partial \ell}{\partial \alpha_0} = 0, \qquad \frac{\partial \ell}{\partial \alpha_1} = 0.$$

More generally, writing $\theta = (\alpha_0, \alpha_1)^\top$, the partial derivative of $\ell_t$ is

$$\frac{\partial \ell_t}{\partial \theta} = \frac{1}{2\sigma_t^2}\,\frac{\partial \sigma_t^2}{\partial \theta}\left(\frac{\varepsilon_t^2}{\sigma_t^2} - 1\right), \qquad \frac{\partial \sigma_t^2}{\partial \theta} = \begin{pmatrix}1 \\ \varepsilon_{t-1}^2\end{pmatrix}.$$
The ML estimators $\hat\theta = (\hat\alpha_0, \hat\alpha_1)^\top$, under the usual assumptions, are asymptotically normal,

$$\sqrt{n}\,(\hat\theta - \theta) \xrightarrow{d} N\!\left(0,\, I(\theta)^{-1}\right),$$

where $I(\theta)$ denotes the information matrix.
The second derivatives forming the Hessian matrix are

$$\frac{\partial^2 \ell_t}{\partial\theta\,\partial\theta^\top}
= -\frac{1}{2\sigma_t^4}\,\frac{\partial \sigma_t^2}{\partial\theta}\,\frac{\partial \sigma_t^2}{\partial\theta^\top}\,\frac{\varepsilon_t^2}{\sigma_t^2}
+ \left(\frac{\varepsilon_t^2}{\sigma_t^2} - 1\right)\frac{\partial}{\partial\theta^\top}\left(\frac{1}{2\sigma_t^2}\,\frac{\partial \sigma_t^2}{\partial\theta}\right).$$
The information matrix is simply the negative expectation of the Hessian, averaged over all observations, that is to say,

$$I(\theta) = -\frac{1}{n}\sum_{t} E\!\left[\frac{\partial^2 \ell_t}{\partial\theta\,\partial\theta^\top}\right].$$

Taking into account (6.3), the conditional expectation of $\varepsilon_t^2/\sigma_t^2$ is one, so the second term of the Hessian has zero conditional expectation. Hence, to calculate the unconditional expectation of the Hessian matrix and, therefore, the information matrix, we approximate it by the average over all the conditional expectations. Then, $I(\theta)$ is consistently estimated by

$$\hat{I}(\theta) = \frac{1}{n}\sum_{t} \frac{1}{2\sigma_t^4}\,\frac{\partial \sigma_t^2}{\partial\theta}\,\frac{\partial \sigma_t^2}{\partial\theta^\top}
= \frac{1}{n}\sum_{t} \frac{1}{2\sigma_t^4}\begin{pmatrix}1 & \varepsilon_{t-1}^2\\ \varepsilon_{t-1}^2 & \varepsilon_{t-1}^4\end{pmatrix}. \tag{6.9}$$
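A sketch of the score vector and of the information-matrix estimate in (6.9), again in Python with illustrative names; the column ordering $(\alpha_0, \alpha_1)$ is an assumption:

```python
# A sketch of the score vector and of the information-matrix estimate (6.9)
# for theta = (alpha0, alpha1); function and variable names are illustrative.
import numpy as np

def arch1_score_info(alpha0, alpha1, eps):
    sigma2 = alpha0 + alpha1 * eps[:-1] ** 2               # sigma_t^2, t = 2, ..., n
    dsigma2 = np.column_stack([np.ones(len(sigma2)),       # d sigma_t^2 / d alpha0
                               eps[:-1] ** 2])             # d sigma_t^2 / d alpha1
    # Score: sum_t (1 / (2 sigma_t^2)) (eps_t^2 / sigma_t^2 - 1) d sigma_t^2 / d theta
    w = (eps[1:] ** 2 / sigma2 - 1.0) / (2.0 * sigma2)
    score = dsigma2.T @ w
    # Information estimate (6.9): average over t of the outer product of
    # d sigma_t^2 / d theta with itself, divided by 2 sigma_t^4.
    info = (dsigma2.T * (0.5 / sigma2 ** 2)) @ dsigma2 / len(sigma2)
    return score, info
```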
In practice, the maximum likelihood estimator is computed by numerical methods and, in particular, gradient methods are preferred for their simplicity. They are iterative methods: at each step, the likelihood is increased by searching a step forward along the gradient direction. It is therefore natural to construct the following iteration scheme, computing $\theta^{(k+1)}$ from $\theta^{(k)}$ by

$$\theta^{(k+1)} = \theta^{(k)} + \lambda_k\, \hat{I}\!\left(\theta^{(k)}\right)^{-1} \frac{1}{n}\sum_{t}\frac{\partial \ell_t}{\partial \theta}\bigg|_{\theta = \theta^{(k)}}, \tag{6.11}$$

where $\lambda_k$ is the step length at iteration $k$.
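The scoring iteration (6.11) can be sketched as follows; the starting values, the full step ($\lambda_k = 1$) and the stopping rule are assumptions, and no positivity constraints on the parameters are enforced in this illustration.

```python
# A sketch of the scoring iteration (6.11) for the ARCH(1) model.
import numpy as np

def fit_arch1_scoring(eps, theta0=(0.5, 0.3), max_iter=200, tol=1e-8):
    theta = np.asarray(theta0, dtype=float)               # (alpha0, alpha1)
    x = eps[:-1] ** 2                                      # lagged squared errors
    for _ in range(max_iter):
        sigma2 = theta[0] + theta[1] * x                   # conditional variances
        dsigma2 = np.column_stack([np.ones_like(x), x])    # d sigma_t^2 / d theta
        w = (eps[1:] ** 2 / sigma2 - 1.0) / (2.0 * sigma2)
        score = dsigma2.T @ w / len(x)                     # average score
        info = (dsigma2.T * (0.5 / sigma2 ** 2)) @ dsigma2 / len(x)
        step = np.linalg.solve(info, score)                # I^{-1} times the score
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta
```

On a long simulated ARCH(1) series this iteration typically converges in a few steps to values close to the parameters used in the simulation.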
In figure 6.6 we can see that, as the asymptotic theory predicts, both sampling distributions approach a bivariate normal density with a small correlation between the two estimators.
In this example, we see that the estimated parameters agree with the theoretical values and have very high t-ratios. The third component of the output list contains the likelihood and the fourth component contains the estimated volatility for the model. For example, we can plot the time series and add two lines representing twice the square root of the estimated volatility around the mean value of the time series, as shown in figure 6.7.
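A sketch of such a plot in Python, assuming the series and the fitted conditional variances are available as NumPy arrays (for instance from the estimation sketch above):

```python
# A sketch of the plot described above: the series together with bands at the
# sample mean plus/minus 2 * sqrt(estimated conditional variance). The arrays
# eps and sigma2_hat are assumed to come from the estimation step.
import numpy as np
import matplotlib.pyplot as plt

def plot_volatility_bands(eps, sigma2_hat):
    t = np.arange(len(eps))
    m = eps.mean()
    plt.plot(t, eps, lw=0.8, label="series")
    plt.plot(t, m + 2.0 * np.sqrt(sigma2_hat), "r", lw=0.8, label="mean +/- 2 sigma_t")
    plt.plot(t, m - 2.0 * np.sqrt(sigma2_hat), "r", lw=0.8)
    plt.legend()
    plt.show()
```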