Having a large collection of distributions to choose from, we need to narrow our selection to a single model and a unique parameter estimate. The type of the objective loss distribution can be easily selected by comparing the shapes of the empirical and theoretical mean excess functions. The mean excess function, presented in Section 13.4.1, is based on the idea of conditioning a random variable given that it exceeds a certain level.

Once the distribution class is selected and the parameters are estimated using one of the available methods, the goodness-of-fit has to be tested. Probably the most natural approach consists of measuring the distance between the empirical and the fitted analytical distribution function. A group of statistics and tests based on this idea is discussed in Section 13.4.2. However, when using these tests we face the problem of comparing a discontinuous step function with a continuous non-decreasing curve. The two functions will always differ from each other in the vicinity of a step by at least half the size of the step. This problem can be overcome by integrating both distributions once, which leads to the so-called limited expected value function introduced in Section 13.4.3.
For a claim amount random variable $X$, the mean excess function or mean residual life function is the expected payment per claim on a policy with a fixed amount deductible of $d$, where claims with amounts less than or equal to $d$ are completely ignored:
\[ e(d) = \mathrm{E}(X - d \mid X > d) \tag{13.46} \]
\[ e(d) = \frac{\int_d^{\infty} \{1 - F(x)\}\, dx}{1 - F(d)}. \tag{13.47} \]
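The empirical counterpart of (13.46), needed for the graphical comparison of empirical and theoretical shapes mentioned above, can be computed directly from the data. The following minimal Python sketch illustrates one way to do this; the function and variable names are illustrative and not taken from the text:

```python
import numpy as np

def empirical_mean_excess(sample, thresholds):
    """Empirical mean excess function e_n(d): average exceedance over d,
    computed from the observations that are strictly larger than d."""
    sample = np.asarray(sample, dtype=float)
    result = []
    for d in thresholds:
        exceedances = sample[sample > d] - d
        # undefined when no observation exceeds d
        result.append(exceedances.mean() if exceedances.size > 0 else np.nan)
    return np.array(result)

# Example: for exponential data the values should be roughly constant in d
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5000)
d_grid = np.quantile(x, np.linspace(0.0, 0.95, 20))
print(empirical_mean_excess(x, d_grid))
```

Plotting these values against $d$ and comparing the shape with the theoretical $e(d)$ of a candidate family is the diagnostic described in this subsection.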
When considering the shapes of mean excess functions, the exponential distribution plays a central role. It has the memoryless property, meaning that whether or not the information $X > d$ is given, the expected value of $X - d$ is the same as if one started at $d = 0$ and simply calculated $\mathrm{E}(X)$. The mean excess function for the exponential distribution is therefore constant; one in fact easily calculates that in this case $e(d) = \mathrm{E}(X)$ for all $d > 0$.
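A one-line verification, assuming the exponential cdf is parameterized as $F(x) = 1 - e^{-\beta x}$ (the parametrization is an assumption made here for illustration), reads:

\[
e(d) = \frac{\int_d^{\infty} \{1 - F(x)\}\,dx}{1 - F(d)}
     = \frac{\int_d^{\infty} e^{-\beta x}\,dx}{e^{-\beta d}}
     = \frac{\beta^{-1} e^{-\beta d}}{e^{-\beta d}}
     = \frac{1}{\beta} = \mathrm{E}(X).
\]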
If the distribution of $X$ is heavier-tailed than the exponential distribution, we find that the mean excess function ultimately increases; when it is lighter-tailed, $e(d)$ ultimately decreases. Hence, the shape of $e(d)$ provides important information on the sub-exponential (heavier than exponential) or super-exponential (lighter than exponential) nature of the tail of the distribution at hand.
Mean excess functions and first-order approximations to the tail can be derived for each of the distributions discussed in Section 13.3; two representative closed-form cases are sketched below.
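As an illustration, under the common parametrizations $F(x) = 1 - e^{-\beta x}$ for the exponential and $F(x) = 1 - \{\lambda/(\lambda + x)\}^{\alpha}$ for the Pareto distribution (these parametrizations are assumptions, not necessarily those of Section 13.3), the mean excess functions are

\[
e(d) = \frac{1}{\beta} \quad \text{(exponential)}, \qquad
e(d) = \frac{\lambda + d}{\alpha - 1}, \ \alpha > 1 \quad \text{(Pareto)},
\]

so the Pareto mean excess function increases linearly in $d$, reflecting its sub-exponential tail, while the exponential one is flat.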
A statistic measuring the difference between the empirical and the fitted distribution function, called an edf statistic, is based on the vertical difference between the two distributions. This distance is usually measured either by a supremum or by a quadratic norm (D'Agostino and Stephens, 1986).

The most well-known supremum statistic is the Kolmogorov-Smirnov statistic $D = \sup_x |F_n(x) - F(x)|$, where $F_n$ denotes the empirical distribution function (edf); a closely related supremum statistic is the Kuiper statistic $V = D^{+} + D^{-}$, with $D^{+} = \sup_x \{F_n(x) - F(x)\}$ and $D^{-} = \sup_x \{F(x) - F_n(x)\}$.

The second class of measures of discrepancy is given by the Cramér-von Mises family $Q = n \int_{-\infty}^{\infty} \{F_n(x) - F(x)\}^2\, \psi(x)\, dF(x)$, where $\psi$ is a weight function: the choice $\psi(x) = 1$ yields the Cramér-von Mises statistic $W^2$, while $\psi(x) = [F(x)\{1 - F(x)\}]^{-1}$ leads to the Anderson-Darling statistic $A^2$.
Suppose that the sample $x_1, \ldots, x_n$ gives values $z_i = F(x_i)$, $i = 1, \ldots, n$. It can be easily shown that, for values $x$ and $z$ related by $z = F(x)$, the corresponding vertical differences in the edf diagrams for $x$ and for $z$ are equal. Consequently, edf statistics calculated from the empirical distribution function of the $z_i$'s compared with the uniform distribution will take the same values as if they were calculated from the empirical distribution function of the $x_i$'s compared with $F(x)$. This leads to the following formulas, given in terms of the order statistics $z_{(1)} < z_{(2)} < \cdots < z_{(n)}$:
\[ D^{+} = \max_{1 \le i \le n} \left( \frac{i}{n} - z_{(i)} \right), \tag{13.51} \]
\[ D^{-} = \max_{1 \le i \le n} \left( z_{(i)} - \frac{i-1}{n} \right), \tag{13.52} \]
\[ D = \max(D^{+}, D^{-}), \tag{13.53} \]
\[ V = D^{+} + D^{-}, \tag{13.54} \]
\[ W^{2} = \sum_{i=1}^{n} \left( z_{(i)} - \frac{2i-1}{2n} \right)^{2} + \frac{1}{12n}, \tag{13.55} \]
\[ A^{2} = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1) \left\{ \ln z_{(i)} + \ln\!\left(1 - z_{(n+1-i)}\right) \right\} \tag{13.56} \]
\[ \hphantom{A^{2}} = -n - \frac{1}{n} \sum_{i=1}^{n} \left\{ (2i-1) \ln z_{(i)} + (2n+1-2i) \ln\!\left(1 - z_{(i)}\right) \right\}. \tag{13.57} \]
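Assuming the probability-integral transformed values $z_i = F(x_i)$ are available, the formulas (13.51)-(13.57) translate directly into code. The following Python sketch (the function name and interface are illustrative) computes all of the above statistics at once:

```python
import numpy as np

def edf_statistics(z):
    """Compute D+, D-, D, V, W^2 and A^2 from probability-integral
    transformed data z_i = F(x_i), following (13.51)-(13.57)."""
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    i = np.arange(1, n + 1)
    d_plus = np.max(i / n - z)
    d_minus = np.max(z - (i - 1) / n)
    d = max(d_plus, d_minus)
    v = d_plus + d_minus
    w2 = np.sum((z - (2 * i - 1) / (2 * n)) ** 2) + 1.0 / (12 * n)
    # z[::-1] gives z_(n+1-i) when z is sorted in ascending order
    a2 = -n - np.mean((2 * i - 1) * (np.log(z) + np.log(1 - z[::-1])))
    return {"D+": d_plus, "D-": d_minus, "D": d, "V": v, "W2": w2, "A2": a2}
```

For a fully specified continuous $F$, these values can be compared with the tabulated critical values of D'Agostino and Stephens (1986); with estimated parameters the adjustments discussed below are needed.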
The general test of fit is structured as follows. The null hypothesis is that a specific distribution is acceptable, whereas the alternative is that it is not:

\[ H_0: \; F_n(x) = F(x; \theta), \qquad H_1: \; F_n(x) \neq F(x; \theta), \]

where $\theta$ is a vector of known parameters.
However, we are in a situation where we want to test the hypothesis that the sample has a common distribution function $F(x; \theta)$ with unknown $\theta$. To employ any of the edf tests we first need to estimate the parameters. It is important to recognize, however, that when the parameters are estimated from the data, the critical values for the tests of the uniform distribution (or equivalently of a fully specified distribution) must be reduced. In other words, if the value of the test statistic is $d$, then the $p$-value is overestimated by $P_U(D \ge d)$. Here $P_U$ indicates that the probability is computed under the assumption of a uniformly distributed sample. Hence, if $P_U(D \ge d)$ is small, then the true $p$-value will be even smaller and the hypothesis will be rejected. However, if it is large, then we have to obtain a more accurate estimate of the $p$-value.
Ross (2002) advocates the use of Monte Carlo simulations in this context. First the parameter vector is estimated for a given sample of size $n$, yielding $\hat{\theta}$, and the edf test statistic is calculated assuming that the sample is distributed according to $F(x; \hat{\theta})$, returning a value of $d$. Next, a sample of size $n$ of $F(x; \hat{\theta})$-distributed variates is generated. The parameter vector is estimated for this simulated sample, yielding $\hat{\theta}_1$, and the edf test statistic is calculated assuming that the sample is distributed according to $F(x; \hat{\theta}_1)$. The simulation is repeated as many times as required to achieve a certain level of accuracy. The estimate of the $p$-value is obtained as the proportion of times that the test quantity is at least as large as $d$.
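The procedure can be sketched as follows for an exponential null hypothesis and the statistic $D$; the choice of distribution, statistic, and all function names are merely illustrative and not prescribed by the text:

```python
import numpy as np

def ks_statistic(z):
    """Kolmogorov-Smirnov D from z_i = F(x_i), cf. (13.51)-(13.53)."""
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    i = np.arange(1, n + 1)
    return max(np.max(i / n - z), np.max(z - (i - 1) / n))

def mc_p_value(x, n_sim=1000, seed=0):
    """Monte Carlo p-value for an exponential null with unknown rate,
    re-estimating the parameter in every simulated sample."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    beta_hat = 1.0 / x.mean()                       # MLE of the exponential rate
    d_obs = ks_statistic(1.0 - np.exp(-beta_hat * x))
    count = 0
    for _ in range(n_sim):
        y = rng.exponential(scale=1.0 / beta_hat, size=n)
        beta_sim = 1.0 / y.mean()                   # re-estimate on simulated data
        d_sim = ks_statistic(1.0 - np.exp(-beta_sim * y))
        count += d_sim >= d_obs
    return count / n_sim
```

The same scheme applies to any of the statistics (13.51)-(13.57); only the estimation step and the assumed family change.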
An alternative solution to the problem of unknown parameters was proposed by Stephens (1978). The half-sample approach consists of using only half the data to estimate the parameters, but then using the entire data set to conduct the test. In this case, the critical values for the uniform distribution can be applied, at least asymptotically. The quadratic edf tests seem to converge fairly rapidly to their asymptotic distributions (D'Agostino and Stephens, 1986). Although the method is much faster than the Monte Carlo approach, it is not invariant: depending on the choice of the half-sample, different test values will be obtained, and there is no way of increasing the accuracy.
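A minimal sketch of the half-sample idea, again for an assumed exponential null (the random split and all names are illustrative):

```python
import numpy as np
from scipy.stats import kstest

def half_sample_ks(x, seed=0):
    """Estimate the parameter on a random half of the data, then compute the
    KS statistic for the whole sample against the fitted exponential cdf.
    Asymptotically, critical values for a fully specified distribution apply."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    half = rng.choice(x, size=x.size // 2, replace=False)
    beta_half = 1.0 / half.mean()                    # MLE from the half-sample
    return kstest(x, "expon", args=(0, 1.0 / beta_half)).statistic
```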
As a by-product, the edf tests supply us with a natural technique of estimating the parameter vector $\theta$: we can simply find the $\hat{\theta}$ that minimizes a selected edf statistic. Out of the four presented statistics, $A^2$ is the most powerful when the fitted distribution departs from the true distribution in the tails (D'Agostino and Stephens, 1986). Since the fit in the tails is of crucial importance in most actuarial applications, $A^2$ is the recommended statistic for this estimation scheme.
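A sketch of this minimum-distance idea, here for an assumed Weibull model $F(x) = 1 - \exp(-\beta x^{\tau})$ fitted by minimizing $A^2$ with scipy; the parametrization, starting values, and names are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def anderson_darling(z):
    """A^2 from z_i = F(x_i), cf. (13.56)."""
    z = np.sort(np.clip(z, 1e-12, 1 - 1e-12))        # guard the logarithms
    n = z.size
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(z) + np.log(1 - z[::-1])))

def fit_weibull_min_a2(x, start=(1.0, 1.0)):
    """Minimum-A^2 estimation for F(x) = 1 - exp(-beta * x**tau)."""
    x = np.asarray(x, dtype=float)

    def objective(params):
        beta, tau = params
        if beta <= 0 or tau <= 0:
            return np.inf
        z = 1.0 - np.exp(-beta * x ** tau)
        return anderson_darling(z)

    res = minimize(objective, x0=np.asarray(start), method="Nelder-Mead")
    return res.x  # (beta_hat, tau_hat)
```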
The limited expected value function $L$ of a claim size variable $X$, or of the corresponding cdf $F(x)$, is defined by

\[ L(x) = \mathrm{E}\{\min(X, x)\} \tag{13.59} \]
\[ L(x) = \int_0^x y\, dF(y) + x \{1 - F(x)\}, \qquad x > 0. \tag{13.60} \]
In order to fit the limited expected value function of an analytical distribution to the observed data, the empirical estimate $\hat{L}_n$ is first constructed. Thereafter one tries to find a suitable analytical cdf $F$ such that the corresponding limited expected value function $L$ is as close to the observed $\hat{L}_n$ as possible.
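Because $L(x) = \mathrm{E}\{\min(X, x)\}$, the observed $\hat{L}_n(x)$ is simply the average of the claims capped at the limit $x$. A minimal sketch (names are illustrative):

```python
import numpy as np

def empirical_lev(sample, limits):
    """Empirical limited expected value function: for each limit x,
    the average of the claims capped at x, i.e. mean(min(X_j, x))."""
    sample = np.asarray(sample, dtype=float)
    limits = np.asarray(limits, dtype=float)
    return np.array([np.minimum(sample, x).mean() for x in limits])
```

Plotting $\hat{L}_n$ together with the analytical $L$ of a candidate model then provides the graphical comparison described above.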
The limited expected value function has the following important properties:

(1) the graph of $L$ is concave, continuous and increasing;

(2) $L(x) \to \mathrm{E}(X)$ as $x \to \infty$, provided the expectation exists;

(3) the cdf is determined by $L$ through $F(x) = 1 - L'(x)$, where $L'$ denotes the (right-hand) derivative of $L$.
A reason why the limited expected value function is a particularly suitable tool for our purposes is that it represents the claim size distribution in the monetary dimension. For example, we have $\lim_{x \to \infty} L(x) = \mathrm{E}(X)$, if it exists. The cdf $F$, on the other hand, operates on the probability scale, i.e. takes values between 0 and 1. Therefore, it is usually difficult to see, by looking only at $F(x)$, how sensitive the price of the insurance (the premium) is to changes in the values of $F(x)$, while the limited expected value function shows immediately how different parts of the claim size cdf contribute to the premium (see Chapter 19 for information on various premium calculation principles). Apart from curve-fitting purposes, the function $L$ will turn out to be a very useful concept in dealing with deductibles in Chapter 19.
It is also worth mentioning that there exists a connection between the limited expected value function and the mean excess function. Since $\mathrm{E}(X) = \mathrm{E}\{\min(X, x)\} + \mathrm{E}\{(X - x)_{+}\}$, we have

\[ \mathrm{E}(X) = L(x) + \{1 - F(x)\}\, e(x). \tag{13.61} \]
The limited expected value functions of the distributions considered in this chapter can also be given in closed form; two representative cases are sketched below.
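For illustration, under the same assumed parametrizations as before ($F(x) = 1 - e^{-\beta x}$ for the exponential and $F(x) = 1 - \{\lambda/(\lambda + x)\}^{\alpha}$ for the Pareto distribution), one obtains

\[
L(x) = \frac{1}{\beta}\left(1 - e^{-\beta x}\right) \quad \text{(exponential)}, \qquad
L(x) = \frac{\lambda}{\alpha - 1}\left\{1 - \left(\frac{\lambda}{\lambda + x}\right)^{\alpha - 1}\right\}, \ \alpha \neq 1 \quad \text{(Pareto)}.
\]

Both functions are continuous, concave and increasing, in line with the properties listed above.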
From the curve-fitting point of view, the use of the limited expected value function has the advantage, compared with the use of the cdf, that both the analytical function $L$ and the corresponding observed function $\hat{L}_n$, based on the observed discrete cdf, are continuous and concave, whereas the observed claim size cdf is a discontinuous step function. Property (3) implies that the limited expected value function determines the corresponding cdf uniquely. When the limited expected value functions of two distributions are close to each other, not only are the mean values of the distributions close to each other, but the whole distributions as well.