13.3 Analytical Methods

It is often desirable to find an explicit analytical expression for a loss distribution. This is particularly the case if the claim statistics are too sparse to use the empirical approach. It should be stressed, however, that many standard models in statistics, such as the Gaussian distribution, are unsuitable for fitting the claim size distribution. The main reason for this is the strongly skewed nature of loss distributions. The log-normal, Pareto, Burr, Weibull, and gamma distributions are typical candidates for claim size distributions to be considered in applications.


13.3.1 Log-normal Distribution

Consider a random variable $ X$ which has the normal distribution with density

$\displaystyle f_N(x) = \frac{1}{\sqrt{2\pi} \sigma} \exp\left\{ -\frac{1}{2} \frac{(x-\mu)^2}{\sigma^2}\right\}, \qquad -\infty<x<\infty.$ (13.2)

Let $ Y=e^X$ so that $ X=\log Y$. Then the probability density function of $ Y$ is given by:

$\displaystyle f(y) = f_N(\log y)\frac{1}{y} = \frac{1}{\sqrt{2\pi} \sigma y} \exp\left\{ -\frac{1}{2} \frac{(\log y-\mu)^2}{\sigma^2}\right\}, \qquad y>0,$ (13.3)

where $ \sigma>0$ and $ -\infty<\mu<\infty$ are the scale and location parameters, respectively, of the underlying normal distribution. The distribution of $ Y$ is termed log-normal; sometimes it is also called the Cobb-Douglas law, especially when applied to econometric data. The log-normal cdf is given by:

$\displaystyle F(y) = \Phi\left(\frac{\log y-\mu}{\sigma} \right), \qquad y>0,$ (13.4)

where $ \Phi(\cdot)$ is the standard normal (with mean 0 and variance 1) distribution function. The $ k$-th raw moment $ m_k$ of the log-normal variate can be easily derived using results for normal random variables:

$\displaystyle m_k = \mathop{\textrm{E}} \left(Y^k\right) = \mathop{\textrm{E}} \left(e^{kX}\right) = M_X(k) = \exp\left( \mu k + \frac{\sigma^2 k^2}{2} \right),$ (13.5)

where $ M_X(z)$ is the moment generating function of the normal distribution. In particular, the mean and variance are
$\displaystyle \mathop{\textrm{E}}(Y)$ $\displaystyle =$ $\displaystyle \exp\left(\mu+\frac{\sigma^2}{2}\right),$ (13.6)
$\displaystyle \mathop{\textrm{Var}}(Y)$ $\displaystyle =$ $\displaystyle \left\{\exp\left(\sigma^2\right)-1\right\}\exp\left(2\mu+\sigma^2\right),$ (13.7)

respectively. For both standard parameter estimation techniques, the estimators are known in closed form. The method of moments estimators are given by:
$\displaystyle \hat{\mu}$ $\displaystyle =$ $\displaystyle 2\log \left( \frac{1}{n}\sum_{i=1}^n x_i \right) - \frac12 \log \left( \frac{1}{n}\sum_{i=1}^n x_i^2 \right),$ (13.8)
$\displaystyle \hat{\sigma}^2$ $\displaystyle =$ $\displaystyle \log \left( \frac{1}{n}\sum_{i=1}^n x_i^2 \right) - 2\log \left( \frac{1}{n}\sum_{i=1}^n x_i \right),$ (13.9)

while the maximum likelihood estimators are given by:
$\displaystyle \hat{\mu}$ $\displaystyle =$ $\displaystyle \frac{1}{n}\sum_{i=1}^n \log(x_i),$ (13.10)
$\displaystyle \hat{\sigma}^2$ $\displaystyle =$ $\displaystyle \frac{1}{n}\sum_{i=1}^n \left\{\log(x_i)-\hat{\mu}\right\}^2.$ (13.11)

Finally, the generation of a log-normal variate is straightforward: we simply exponentiate a normal variate.
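As an illustration, here is a minimal Python sketch (the variable names and the use of numpy are our own choices, not part of the original presentation) that generates log-normal claims by exponentiating normal variates and recovers the parameters with both sets of estimators (13.8)-(13.11):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
mu, sigma = 2.0, 1.0

# Log-normal variates: exponentiate normal variates
x = np.exp(rng.normal(mu, sigma, size=100_000))

# Method of moments estimators, eqs. (13.8)-(13.9)
m1, m2 = x.mean(), (x**2).mean()
mu_mom = 2 * np.log(m1) - 0.5 * np.log(m2)
sigma2_mom = np.log(m2) - 2 * np.log(m1)

# Maximum likelihood estimators, eqs. (13.10)-(13.11)
logx = np.log(x)
mu_mle = logx.mean()
sigma2_mle = ((logx - mu_mle) ** 2).mean()

print(mu_mom, sigma2_mom)  # both pairs should be close to (2, 1)
print(mu_mle, sigma2_mle)
```

The maximum likelihood estimates are typically the more stable of the two here, since the method of moments is driven by the raw sample moments of heavy tailed data.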

The log-normal distribution is very useful in modeling claim sizes. It is right-skewed, has a thick tail, and fits many situations well. For small $ \sigma$ it resembles a normal distribution (see the left panel in Figure 13.2), although this is not always desirable. It is infinitely divisible and closed under scale and power transformations. However, it also suffers from some drawbacks. Most notably, the Laplace transform does not have a closed form representation and the moment generating function does not exist.

Figure 13.2: Left panel: Log-normal probability density functions (pdfs) with parameters $ \mu =2$ and $ \sigma =1$ (black solid line), $ \mu =2$ and $ \sigma =0.1$ (red dotted line), and $ \mu =0.5$ and $ \sigma =2$ (blue dashed line). Right panel: Exponential pdfs with parameter $ \beta =0.5$ (black solid line), $ \beta =1$ (red dotted line), and $ \beta =5$ (blue dashed line).
\includegraphics[width=.7\defpicwidth]{STFloss02a.ps} \includegraphics[width=.7\defpicwidth]{STFloss02b.ps}


13.3.2 Exponential Distribution

Consider the random variable with the following density and distribution functions, respectively:

$\displaystyle f(x)$ $\displaystyle =$ $\displaystyle \beta e^{-\beta x}, \qquad x>0,$ (13.12)
$\displaystyle F(x)$ $\displaystyle =$ $\displaystyle 1 - e^{-\beta x}, \qquad x>0.$ (13.13)

This distribution is termed an exponential distribution with parameter (or intensity) $ \beta>0$. The Laplace transform of (13.12) is

$\displaystyle L(t) \stackrel{\mathrm{def}}{=} \int_0^\infty e^{-tx} f(x) dx = \frac{\beta}{\beta + t}, \qquad t>-\beta,$ (13.14)

yielding the general formula for the $ k$-th raw moment

$\displaystyle m_k \stackrel{\mathrm{def}}{=} (-1)^k \frac{\partial^k L(t)}{\partial t^k} \Big\vert _{t=0} = \frac{k!}{\beta^k}.$ (13.15)

The mean and variance are thus $ \beta^{-1}$ and $ \beta^{-2}$, respectively. The maximum likelihood estimator (equal to the method of moments estimator) for $ \beta$ is given by:

$\displaystyle \hat\beta = \frac {1}{\hat{m}_1},$ (13.16)

where

$\displaystyle \hat{m}_k = \frac{1}{n}\sum_{i=1}^n x_i^k,$ (13.17)

is the sample $ k$-th raw moment.

To generate an exponential random variable $ X$ with intensity $ \beta$ we can use the inverse transform method (Ross, 2002; L'Ecuyer, 2004). The method consists of taking a random number $ U$ distributed uniformly on the interval $ (0,1)$ and setting $ X = F^{-1}(U)$, where $ F^{-1}(x) = -\frac{1}{\beta}\log(1-x)$ is the inverse of the exponential cdf (13.13). In fact, we can set $ X = -\frac{1}{\beta}\log U$, since $ 1-U$ has the same distribution as $ U$.
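A minimal sketch of this generator in Python (the function name rexp is ours):

```python
import numpy as np

def rexp(n, beta, rng=None):
    """Exponential variates with intensity beta via the inverse transform:
    X = -log(U)/beta, using that 1 - U has the same distribution as U."""
    if rng is None:
        rng = np.random.default_rng()
    return -np.log(rng.uniform(size=n)) / beta

x = rexp(100_000, beta=0.5)
print(1 / x.mean())  # the ML estimator (13.16); should be close to beta = 0.5
```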

The exponential distribution has many interesting features. For example, it has the memoryless property, i.e. $ \textrm{P}(X>x+y\vert X>y) = \textrm{P}(X>x)$. It also arises as the distribution of the inter-occurrence times of the events in a Poisson process, see Chapter 14. The $ n$-th root of the Laplace transform (13.14) is

$\displaystyle \left\{L(t)\right\}^{1/n} = \left(\frac{\beta}{\beta + t}\right)^\frac{1}{n},$ (13.18)

which is the Laplace transform of a gamma variate (see Section 13.3.6) with shape parameter $ 1/n$. Thus the exponential distribution is infinitely divisible.

The exponential distribution is often used in developing models of insurance risks. This usefulness stems in large part from its many tractable mathematical properties. However, a disadvantage of the exponential distribution is that its density is monotone decreasing (see the right panel in Figure 13.2), which may be inappropriate in some practical situations.


13.3.3 Pareto Distribution

Suppose that a variate $ X$ has (conditional on $ \beta$) an exponential distribution with mean $ \beta^{-1}$. Further, suppose that $ \beta$ itself has a gamma distribution (see Section 13.3.6). The unconditional distribution of $ X$ is a mixture and is called the Pareto distribution. Moreover, it can be shown that if $ X$ is an exponential random variable and $ Y$ is an independent gamma random variable, then $ X/Y$ is a Pareto random variable.

The density and distribution functions of a Pareto variate are given by:

$\displaystyle f(x)$ $\displaystyle =$ $\displaystyle \frac{\alpha\lambda^\alpha}{(\lambda+x)^{\alpha+1}}, \qquad x>0,$ (13.19)
$\displaystyle F(x)$ $\displaystyle =$ $\displaystyle 1 - \left(\frac{\lambda}{\lambda+x}\right)^\alpha, \qquad x>0,$ (13.20)

respectively. Clearly, the shape parameter $ \alpha $ and the scale parameter $ \lambda $ are both positive. The $ k$-th raw moment:

$\displaystyle m_k = \lambda^k k! \frac{\Gamma(\alpha-k)}{\Gamma(\alpha)},$ (13.21)

exists only for $ k<\alpha$. In the above formula

$\displaystyle \Gamma(a)\stackrel{\mathrm{def}}{=}\int_{0}^{\infty} y^{a-1}e^{-y}dy,$ (13.22)

is the standard gamma function. The mean and variance are thus:
$\displaystyle \mathop{\textrm{E}}(X)$ $\displaystyle =$ $\displaystyle \frac{\lambda}{\alpha-1},$ (13.23)
$\displaystyle \textrm{Var}(X)$ $\displaystyle =$ $\displaystyle \frac{\alpha\lambda^2}{(\alpha-1)^2(\alpha-2)},$ (13.24)

respectively. Note that the mean exists only for $ \alpha >1$ and the variance only for $ \alpha>2$. Hence, the Pareto distribution has very thick (or heavy) tails, see Figure 13.3. The method of moments estimators are given by:
$\displaystyle \hat{\alpha}$ $\displaystyle =$ $\displaystyle 2\frac{\hat{m}_2 - \hat{m}_1^2}{\hat{m}_2-2\hat{m}_1^2},$ (13.25)
$\displaystyle \hat{\lambda}$ $\displaystyle =$ $\displaystyle \frac{\hat{m}_1 \hat{m}_2}{\hat{m}_2-2\hat{m}_1^2},$ (13.26)

where, as before, $ \hat{m}_k$ is the sample $ k$-th raw moment (13.17). Note that the estimators are well defined only when $ \hat{m}_2-2\hat{m}_1^2>0$. Unfortunately, there are no closed form expressions for the maximum likelihood estimators and they can only be evaluated numerically.
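For concreteness, a small Python helper (our naming) implementing (13.25)-(13.26), including the validity check on the denominator:

```python
import numpy as np

def pareto_mom(x):
    """Method of moments estimates (13.25)-(13.26) for the Pareto law;
    defined only when m2 - 2*m1^2 > 0."""
    m1, m2 = x.mean(), (x**2).mean()
    denom = m2 - 2 * m1**2
    if denom <= 0:
        raise ValueError("m2 - 2*m1^2 <= 0: estimators are not defined")
    return 2 * (m2 - m1**2) / denom, m1 * m2 / denom  # (alpha_hat, lambda_hat)
```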

Figure 13.3: Left panel: Pareto pdfs with parameters $ \alpha =0.5$ and $ \lambda =2$ (black solid line), $ \alpha = 2$ and $ \lambda =0.5$ (red dotted line), and $ \alpha = 2$ and $ \lambda =1$ (blue dashed line). Right panel: The same Pareto densities on a double logarithmic plot. The thick power-law tails of the Pareto distribution are clearly visible.
\includegraphics[width=.7\defpicwidth]{STFloss03a.ps} \includegraphics[width=.7\defpicwidth]{STFloss03b.ps}

As for many other distributions, the simulation of a Pareto variate $ X$ can be conducted via the inverse transform method. The inverse of the cdf (13.20) has the simple analytical form $ F^{-1}(x) = \lambda\left\{ (1-x)^{-1/\alpha} - 1\right\}$. Hence, we can set $ X = \lambda\left( U^{-1/\alpha} - 1\right)$, where $ U$ is distributed uniformly on the unit interval. We have to be cautious, however, when $ \alpha $ is larger than, but very close to, one. The theoretical mean exists, but the right tail is very heavy; the sample mean will, in general, be significantly lower than $ \mathop{\textrm{E}}(X)$.
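The following sketch (assuming numpy; the parameter values are illustrative) demonstrates both the inverse transform generator and the warning about the sample mean:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
alpha, lam = 1.1, 1.0   # alpha just above one: mean exists, tail is very heavy

u = rng.uniform(size=100_000)
x = lam * (u ** (-1 / alpha) - 1)   # inverse of the Pareto cdf (13.20)

print(x.mean(), lam / (alpha - 1))  # sample mean typically well below E(X) = 10
```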

The Pareto law is very useful in modeling claim sizes in insurance, due in large part to its extremely thick tail. Its main drawback lies in its lack of mathematical tractability in some situations. As for the log-normal distribution, the Laplace transform does not have a closed form representation and the moment generating function does not exist. Moreover, like the exponential pdf, the Pareto density (13.19) is monotone decreasing, which may not be adequate in some practical situations.


13.3.4 Burr Distribution

Experience has shown that the Pareto formula is often an appropriate model for the claim size distribution, particularly where exceptionally large claims may occur. However, there is sometimes a need to find heavy tailed distributions which offer greater flexibility than the Pareto law, including a non-monotone pdf. Such flexibility is provided by the Burr distribution and its additional shape parameter $ \tau>0$. If $ Y$ has the Pareto distribution, then the distribution of $ X=Y^{1/\tau}$ is known as the Burr distribution, see the left panel in Figure 13.4. Its density and distribution functions are given by:

$\displaystyle f(x)$ $\displaystyle =$ $\displaystyle \tau\alpha\lambda^\alpha \frac{x^{\tau-1}}{ (\lambda+x^{\tau})^{\alpha+1}}, \qquad x>0,$ (13.27)
$\displaystyle F(x)$ $\displaystyle =$ $\displaystyle 1 - \left(\frac{\lambda}{\lambda + x^{\tau}}\right)^\alpha, \qquad x>0,$ (13.28)

respectively. The $ k$-th raw moment

$\displaystyle m_k = \frac{1}{\Gamma(\alpha)}\lambda^{k/\tau} \Gamma\left(1+\frac{k}{\tau}\right) \Gamma\left(\alpha - \frac{k}{\tau}\right),$ (13.29)

exists only for $ k<\tau\alpha$. Naturally, the Laplace transform does not exist in a closed form and the distribution has no moment generating function, as was the case with the Pareto distribution.

The maximum likelihood and method of moments estimators for the Burr distribution can only be evaluated numerically. A Burr variate $ X$ can be generated using the inverse transform method. The inverse of the cdf (13.28) has the simple analytical form $ F^{-1}(x) = \left[\lambda\left\{ (1-x)^{-1/\alpha} - 1\right\}\right]^{1/\tau}$. Hence, we can set $ X = \left\{\lambda\left( U^{-1/\alpha} - 1\right)\right\}^{1/\tau}$, where $ U$ is distributed uniformly on the unit interval. As in the Pareto case, we have to be cautious when $ \tau\alpha$ is larger than, but very close to, one. The theoretical mean exists, but the right tail is very heavy; the sample mean will, in general, be significantly lower than $ \mathop{\textrm{E}}(X)$.
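A minimal Python version of this generator (the function name is ours):

```python
import numpy as np

def rburr(n, alpha, lam, tau, rng=None):
    """Burr variates via the inverse of the cdf (13.28),
    i.e. the 1/tau power of a Pareto variate."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(size=n)
    return (lam * (u ** (-1 / alpha) - 1)) ** (1 / tau)
```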

Figure 13.4: Left panel: Burr pdfs with parameters $ \alpha =0.5$, $ \lambda =2$ and $ \tau =1.5$ (black solid line), $ \alpha =0.5$, $ \lambda =0.5$ and $ \tau =5$ (red dotted line), and $ \alpha = 2$, $ \lambda =1$ and $ \tau =0.5$ (blue dashed line). Right panel: Weibull pdfs with parameters $ \beta =1$ and $ \tau =0.5$ (black solid line), $ \beta =1$ and $ \tau =2$ (red dotted line), and $ \beta =0.01$ and $ \tau =6$ (blue dashed line).
\includegraphics[width=.7\defpicwidth]{STFloss04a.ps} \includegraphics[width=.7\defpicwidth]{STFloss04b.ps}


13.3.5 Weibull Distribution

If $ V$ is an exponential variate, then the distribution of $ X=V^{1/\tau}$, $ \tau>0$, is called the Weibull (or Fréchet) distribution. Its density and distribution functions are given by:

$\displaystyle f(x)$ $\displaystyle =$ $\displaystyle \tau\beta x^{\tau-1} e^{-\beta x^\tau}, \qquad x>0,$ (13.30)
$\displaystyle F(x)$ $\displaystyle =$ $\displaystyle 1 - e^{-\beta x^\tau}, \qquad x>0,$ (13.31)

respectively. The Weibull distribution is roughly symmetrical for the shape parameter $ \tau \approx 3.6$. When $ \tau $ is smaller the distribution is right-skewed, when $ \tau $ is larger it is left-skewed, see the right panel in Figure 13.4. The $ k$-th raw moment can be shown to be

$\displaystyle m_k = \beta^{-k/\tau} \Gamma\left(1+\frac{k}{\tau}\right).$ (13.32)

As for the Burr distribution, the maximum likelihood and method of moments estimators can only be evaluated numerically. Similarly, Weibull variates can be generated using the inverse transform method.
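For completeness, a sketch of this generator under the parametrization (13.30)-(13.31); since library routines often use a different scaling, we invert the cdf directly:

```python
import numpy as np

def rweibull(n, beta, tau, rng=None):
    """Weibull variates via the inverse transform applied to (13.31):
    F^{-1}(u) = (-log(1 - u) / beta)^(1/tau), with 1 - U replaced by U."""
    if rng is None:
        rng = np.random.default_rng()
    return (-np.log(rng.uniform(size=n)) / beta) ** (1 / tau)
```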


13.3.6 Gamma Distribution

The probability law with density and distribution functions given by:

$\displaystyle f(x)$ $\displaystyle =$ $\displaystyle \beta(\beta x)^{\alpha-1} \frac{e^{-\beta x}}{\Gamma(\alpha)}, \qquad x>0,$ (13.33)
$\displaystyle F(x)$ $\displaystyle =$ $\displaystyle \int_0^x \beta(\beta s)^{\alpha-1} \frac{e^{-\beta s}}{\Gamma(\alpha)} ds, \qquad x>0,$ (13.34)

where $ \alpha $ and $ \beta$ are positive, is known as a gamma (or a Pearson's Type III) distribution, see the left panel in Figure 13.5. Moreover, for $ \beta =1$ the integral in (13.34):

$\displaystyle \Gamma(\alpha,x) \stackrel{\mathrm{def}}{=} \frac{1}{\Gamma(\alpha)}\int_{0}^{x} s^{\alpha-1}e^{-s}ds,$ (13.35)

is the incomplete gamma function in its normalized form. If the shape parameter $ \alpha =1$, the exponential distribution results. If $ \alpha $ is a positive integer, the distribution is termed an Erlang law. If $ \beta = \frac{1}{2}$ and $ \alpha = \frac{\nu}{2}$, it is termed a chi-squared ($ \chi^2$) distribution with $ \nu$ degrees of freedom. Moreover, a mixed Poisson distribution with a gamma mixing distribution is negative binomial, see Chapter 18.
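In practice the cdf (13.34) is evaluated through the incomplete gamma function. For instance, in Python scipy.special.gammainc computes exactly the normalized form (13.35), so that under this parametrization:

```python
from scipy.special import gammainc

def gamma_cdf(x, alpha, beta):
    """Gamma cdf (13.34) expressed via (13.35): F(x) = Gamma(alpha, beta*x)."""
    return gammainc(alpha, beta * x)

print(gamma_cdf(2.0, alpha=1.0, beta=2.0))  # alpha = 1: exponential, 1 - exp(-4)
```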

Figure 13.5: Left panel: Gamma pdfs with parameters $ \alpha =1$ and $ \beta =2$ (black solid line), $ \alpha = 2$ and $ \beta =1$ (red dotted line), and $ \alpha =3$ and $ \beta =0.5$ (blue dashed line). Right panel: Densities of two exponential distributions with parameters $ \beta _1=0.5$ (red dotted line) and $ \beta _2=0.1$ (blue dashed line) and of their mixture with the mixing parameter $ a=0.5$ (black solid line).
\includegraphics[width=.7\defpicwidth]{STFloss05a.ps} \includegraphics[width=.7\defpicwidth]{STFloss05b.ps}

The Laplace transform of the gamma distribution is given by:

$\displaystyle L(t) = \left(\frac{\beta}{\beta+t} \right)^\alpha, \qquad t>-\beta.$ (13.36)

The $ k$-th raw moment can be easily derived from the Laplace transform:

$\displaystyle m_k = \frac{\Gamma(\alpha+k)}{\Gamma(\alpha)\beta^k} .$ (13.37)

Hence, the mean and variance are
$\displaystyle \mathop{\textrm{E}}(X)$ $\displaystyle =$ $\displaystyle \frac{\alpha}{\beta},$ (13.38)
$\displaystyle \textrm{Var}(X)$ $\displaystyle =$ $\displaystyle \frac{\alpha}{\beta^2}.$ (13.39)

Finally, the method of moments estimators for the gamma distribution parameters have closed form expressions:
$\displaystyle \hat{\alpha}$ $\displaystyle =$ $\displaystyle \frac{\hat{m}_1^2}{\hat{m}_2-\hat{m}_1^2},$ (13.40)
$\displaystyle \hat{\beta}$ $\displaystyle =$ $\displaystyle \frac{\hat{m}_1}{\hat{m}_2-\hat{m}_1^2},$ (13.41)

but the maximum likelihood estimators can only be evaluated numerically. Simulation of gamma variates is not as straightforward as for the distributions presented above. For $ \alpha <1$ a simple but slow algorithm due to Jöhnk (1964) can be used, while for $ \alpha >1$ the rejection method is more efficient (Devroye, 1986; Bratley, Fox, and Schrage, 1987).
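The following sketch implements Jöhnk's method for $ 0<\alpha<1$ (the code is ours; it relies on the fact that a Beta($ \alpha$, $ 1-\alpha$) variate multiplied by an independent Exp(1) variate is Gamma($ \alpha$, 1) distributed). For $ \alpha >1$ one would in practice call a library routine such as numpy's Generator.gamma.

```python
import numpy as np

def rgamma_johnk(alpha, beta, rng=None):
    """One Gamma(alpha, beta) variate, 0 < alpha < 1, via Johnk's method:
    generate B ~ Beta(alpha, 1 - alpha) by rejection, multiply by an
    independent Exp(1) variate, then rescale by 1/beta."""
    if rng is None:
        rng = np.random.default_rng()
    while True:
        x = rng.uniform() ** (1 / alpha)
        y = rng.uniform() ** (1 / (1 - alpha))
        if x + y <= 1:  # accept: x/(x + y) is Beta(alpha, 1 - alpha)
            return x / (x + y) * (-np.log(rng.uniform())) / beta
```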

The gamma distribution is closed under convolution, i.e. a sum of independent gamma variates with the same parameter $ \beta$ is again gamma distributed with this $ \beta$. Hence, it is infinitely divisible. Moreover, it is right-skewed and approaches a normal distribution in the limit as $ \alpha $ goes to infinity.

The gamma law is one of the most important distributions for modeling because it has very tractable mathematical properties. As we have seen above it is also very useful in creating other distributions, but by itself is rarely a reasonable model for insurance claim sizes.


13.3.7 Mixture of Exponential Distributions

Let $ a_1,a_2,\ldots,a_n$ denote a series of non-negative weights satisfying $ \sum_{i=1}^n a_i = 1$. Let $ F_1(x),F_2(x),\ldots,F_n(x)$ denote an arbitrary sequence of exponential distribution functions with intensities $ \beta_1,\beta_2,\ldots,\beta_n$, respectively. Then, the distribution function:

$\displaystyle F(x)=\sum_{i=1}^n a_i F_i(x)=\sum_{i=1}^n a_i \left\{1-\exp(-\beta_i x)\right\},$ (13.42)

is called a mixture of $ n$ exponential distributions (exponentials). The density function of the constructed distribution is

$\displaystyle f(x)=\sum_{i=1}^n a_i f_i(x)=\sum_{i=1}^n a_i \beta_i \exp(-\beta_i x),$ (13.43)

where $ f_1(x),f_2(x),\ldots,f_n(x)$ denote the density functions of the input exponential distributions. Note that the mixing procedure can be applied to arbitrary distributions. Using the technique of mixing, one can construct a wide class of distributions. The most commonly used in applications is a mixture of two exponentials, see Chapter 15. In the right panel of Figure 13.5 a pdf of a mixture of two exponentials is plotted together with the pdfs of the mixing laws.

The Laplace transform of (13.43) is

$\displaystyle L(t) = \sum_{i=1}^n a_i\frac{\beta_i}{\beta_i + t}, \qquad t>-\min_{i=1\ldots n} \{\beta_i\},$ (13.44)

yielding the general formula for the $ k$-th raw moment

$\displaystyle m_k = \sum_{i=1}^n a_i\frac{k!}{\beta_i^k}.$ (13.45)

The mean is thus $ \sum_{i=1}^n a_i\beta_i^{-1}$. The maximum likelihood and method of moments estimators for the mixture of $ n$ ($ n\geq 2$) exponential distributions can only be evaluated numerically.

Simulation of variates defined by (13.42) can be performed using the composition approach (Ross, 2002). First generate a random variable $ I$, equal to $ i$ with probability $ a_i$, $ i=1,\ldots,n$. Then simulate an exponential variate with intensity $ \beta_I$. Note that the method is general in the sense that it can be used for any set of distributions $ F_i$.
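A compact Python implementation of this composition sampler (names are ours):

```python
import numpy as np

def rmixexp(n, a, beta, rng=None):
    """Mixture of exponentials (13.42) via the composition approach:
    draw the component index I with probabilities a_i, then an
    exponential variate with intensity beta_I."""
    if rng is None:
        rng = np.random.default_rng()
    a, beta = np.asarray(a), np.asarray(beta)
    idx = rng.choice(len(a), size=n, p=a)
    return -np.log(rng.uniform(size=n)) / beta[idx]

x = rmixexp(100_000, a=[0.5, 0.5], beta=[0.5, 0.1])
print(x.mean())  # close to 0.5/0.5 + 0.5/0.1 = 6
```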