14.3 Simulation of Risk Processes


14.3.1 Catastrophic Losses

In this section we apply some of the models described earlier to the PCS dataset. The Property Claim Services dataset covers losses resulting from natural catastrophic events in the USA that occurred between 1990 and 1999. It is adjusted for inflation using the Consumer Price Index provided by the U.S. Department of Labor. See Chapters 4 and 13, where this dataset was analyzed in the context of CAT bonds and loss distributions, respectively. Note that the same raw catastrophe data, adjusted instead with the discount window borrowing rate (the simple interest rate at which depository institutions borrow from the Federal Reserve Bank of New York), was analyzed by Burnecki, Härdle, and Weron (2004).

Figure 14.2: Left panel: The empirical mean excess function $ \hat{e}_n(x)$ for the PCS waiting times. Right panel: Shapes of the mean excess function $ e(x)$ for the log-normal (solid green line), Burr (dashed blue line), and exponential (dotted red line) distributions.
\includegraphics[width=.7\defpicwidth]{STFrisk02a.ps} \includegraphics[width=.7\defpicwidth]{STFrisk02b.ps}

Now we study the claim arrival process and the distribution of the waiting times. As suggested in Chapter 13, we first look for the appropriate shape of the approximating distribution. To this end we plot the empirical mean excess function for the waiting time data (given in years), see Figure 14.2. The initially decreasing, later increasing pattern suggests the log-normal or Burr distribution as most adequate for modeling. The empirical distribution seems, however, to have lighter tails than these two: $ e(x)$ does not increase for very large $ x$. The overall impression is that of a highly volatile but roughly constant function, like the mean excess function of the exponential distribution. Hence, we fit the log-normal, Burr, and exponential distributions using the $ A^2$ minimization scheme and check the goodness of fit with test statistics. In terms of the values of the test statistics the Burr distribution seems to give the best fit. However, it does not pass any of the tests even at the very low level of 0.5% (see Chapter 13 for the test definitions). The only distribution that passes any of the four applied tests, although at a very low level, is the log-normal law with parameters $ \mu=-3.91$ and $ \sigma=0.9051$, see Table 14.1. Thus, if we wanted to model the claim arrival process by a renewal process, the log-normal distribution would be the best choice for the waiting times.
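For reference, the empirical mean excess function plotted in Figure 14.2 is simply the average exceedance over a threshold, $ \hat{e}_n(x) = \sum_{x_i > x} (x_i - x) / \#\{i: x_i > x\}$. A minimal Python sketch (the array \texttt{waiting\_times} is a placeholder for the PCS inter-arrival times in years):

\begin{verbatim}
import numpy as np

def mean_excess(data, thresholds):
    """Empirical mean excess function: e_n(x) = mean of (X_i - x) over X_i > x."""
    data = np.asarray(data)
    result = []
    for x in thresholds:
        exceedances = data[data > x] - x
        result.append(exceedances.mean() if exceedances.size else np.nan)
    return np.array(result)

# waiting_times = ...  # placeholder: PCS inter-arrival times in years
# grid = np.linspace(waiting_times.min(), np.quantile(waiting_times, 0.98), 100)
# e_hat = mean_excess(waiting_times, grid)   # values plotted in Figure 14.2
\end{verbatim}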


Table 14.1: Parameter estimates obtained via the $ A^2$ minimization scheme and test statistics for the PCS waiting times. The corresponding $ p$-values, based on 1000 simulated samples, are given in parentheses.

                  log-normal           Burr                           exponential
Parameters:       $ \mu=-3.91$         $ \alpha=1.3051$               $ \beta=33.187$
                  $ \sigma=0.9051$     $ \lambda=1.6\cdot 10^{-3}$
                                       $ \tau=1.7448$
Tests:
$ D$              0.0589 ($<$0.005)    0.0492 ($<$0.005)              0.1193 ($<$0.005)
$ V$              0.0973 ($<$0.005)    0.0938 ($<$0.005)              0.1969 ($<$0.005)
$ W^2$            0.1281 (0.013)       0.1120 ($<$0.005)              0.9130 ($<$0.005)
$ A^2$            1.3681 ($<$0.005)    0.8690 ($<$0.005)              5.8998 ($<$0.005)
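The estimates in Table 14.1 come from minimizing the $ A^2$ (Anderson-Darling) statistic over the parameter space rather than from maximum likelihood. The following Python sketch illustrates the scheme for the log-normal case; the starting values and the array \texttt{waiting\_times} are placeholders, and the choice of optimizer is ours, not necessarily the one used to produce the table:

\begin{verbatim}
import numpy as np
from scipy.optimize import minimize
from scipy.stats import lognorm

def a2_statistic(u):
    """Anderson-Darling A^2 for sorted cdf values u_1 <= ... <= u_n."""
    n = len(u)
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))

def a2_lognormal(params, data):
    """Objective: A^2 distance between the data and a log-normal(mu, sigma)."""
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    u = lognorm.cdf(np.sort(data), s=sigma, scale=np.exp(mu))
    u = np.clip(u, 1e-12, 1 - 1e-12)   # guard the logarithms at the extremes
    return a2_statistic(u)

# waiting_times = ...  # placeholder: PCS inter-arrival times in years
# res = minimize(a2_lognormal, x0=[-4.0, 1.0], args=(waiting_times,),
#                method='Nelder-Mead')
# mu_hat, sigma_hat = res.x   # Table 14.1 reports -3.91 and 0.9051
\end{verbatim}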


If, on the other hand, we wanted to model the claim arrival process by a HPP, then the studies of the quarterly numbers of losses would lead us to the conclusion that the best HPP is given by the annual intensity $ \lambda_1=34.2$. This value is obtained by taking the mean of the quarterly numbers of losses and multiplying it by four. Note that this intensity differs noticeably from the parameter $ \beta=33.187$ of the calibrated exponential distribution, see Table 14.1. This, together with the very poor fit of the exponential law to the waiting times, indicates that the HPP is not a good model for the claim arrival process.
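Simulating the resulting HPP is straightforward, since its waiting times are i.i.d. exponential with mean $ 1/\lambda_1$; a minimal sketch:

\begin{verbatim}
import numpy as np

def simulate_hpp(lam, T, rng):
    """Arrival times of a HPP with intensity lam on [0, T]:
    cumulative sums of i.i.d. exponential waiting times with mean 1/lam."""
    arrivals = []
    t = rng.exponential(1.0 / lam)
    while t <= T:
        arrivals.append(t)
        t += rng.exponential(1.0 / lam)
    return np.array(arrivals)

rng = np.random.default_rng(0)
arrivals = simulate_hpp(34.2, 10.0, rng)   # lambda_1 = 34.2, ten years of data
\end{verbatim}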

Figure 14.3: Left panel: The quarterly number of losses for the PCS data. Right panel: Periodogram of the PCS quarterly number of losses. A distinct peak is visible at frequency $ \omega =0.25$ implying a period of $ 1/\omega =4$ quarters, i.e. one year.
\includegraphics[width=.7\defpicwidth]{STFrisk04a.ps} \includegraphics[width=.7\defpicwidth]{STFrisk04b.ps}

Further analysis of the data reveals its periodicity. The time series of the quarterly number of losses does not exhibit any trend, but an annual seasonality is clearly visible in the periodogram, see Figure 14.3. This suggests that calibrating a NHPP with a sinusoidal rate function would give a good model. We estimate the parameters by fitting the cumulative intensity function, i.e. the mean value function $ \mathop {\textrm {E}} (N_t)$, to the accumulated number of PCS losses. The least squares algorithm yields $ \lambda_2(t) = 35.32 + 2.32 \cdot 2 \pi \cdot \sin\{2 \pi (t - 0.20)\}$, with $ t$ measured in years. This choice of $ \lambda(t)$ gives a reasonably good fit, see also Chapter 4.
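A NHPP with this sinusoidal rate can be simulated, for instance, by thinning: generate candidate points from a HPP with a constant intensity $ \lambda_{\max} \ge \lambda_2(t)$ and accept a candidate at time $ t$ with probability $ \lambda_2(t)/\lambda_{\max}$. A Python sketch (thinning is one of several valid generators, chosen here for brevity):

\begin{verbatim}
import numpy as np

def lambda2(t):
    """Fitted sinusoidal intensity (per year) for the PCS arrival process."""
    return 35.32 + 2.32 * 2 * np.pi * np.sin(2 * np.pi * (t - 0.20))

def simulate_nhpp(rate, rate_max, T, rng):
    """NHPP arrivals on [0, T] by thinning: draw a HPP(rate_max) via the
    conditional-uniform property, keep each point t w.p. rate(t)/rate_max."""
    n_candidates = rng.poisson(rate_max * T)
    candidates = np.sort(rng.uniform(0.0, T, size=n_candidates))
    keep = rng.uniform(size=n_candidates) <= rate(candidates) / rate_max
    return candidates[keep]

rng = np.random.default_rng(1)
lam_max = 35.32 + 2.32 * 2 * np.pi          # upper bound of lambda2(t)
arrivals = simulate_nhpp(lambda2, lam_max, 10.0, rng)
\end{verbatim}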

Figure 14.4: The PCS data simulation results for a NHPP with Burr claim sizes (left panel), a NHPP with log-normal claim sizes (right panel), and a NHPP with claims generated from the edf (bottom panel). The dotted lines are the sample 0.001, 0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99, 0.999-quantile lines based on 3000 trajectories of the risk process.
\includegraphics[width=.7\defpicwidth]{STFrisk05a.ps} \includegraphics[width=.7\defpicwidth]{STFrisk05b.ps} \includegraphics[width=.7\defpicwidth]{STFrisk05c.ps}

To study the evolution of the risk process we simulate sample trajectories. We consider a hypothetical scenario where the insurance company insures losses resulting from catastrophic events in the United States. The company's initial capital is assumed to be $ u=100$ billion USD and the relative safety loading used is $ \theta=0.5$. We choose different models of the risk process whose application is most justified by the statistical results described above. The results are presented in Figure 14.4. In all subplots the thick solid blue line is the ``real'' risk process, i.e. a trajectory constructed from the historical arrival times and values of the losses. The different shapes of the ``real'' risk process in the subplots are due to the different forms of the premium function $ c(t)$. Recall that the premium function has to be chosen according to the type of the claim arrival process. The dashed red line is a sample trajectory. The dotted lines are the sample 0.001, 0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99, 0.999-quantile lines based on 3000 trajectories of the risk process. The function $ \hat{x}_p(t)$ is called a sample $ p$-quantile line if for each $ t\in [t_0,T]$, $ \hat{x}_p(t)$ is the sample $ p$-quantile, i.e. if it satisfies $ F_n(x_p -) \le p \le F_n(x_p)$, where $ F_n$ is the edf. Quantile lines are a very helpful tool in the analysis of stochastic processes. For example, they can provide a simple justification of the stationarity (or the lack of it) of a process, see Janicki and Weron (1994). In Figure 14.4 they visualize the evolution of the density of the risk process; the periodic pattern is due to the sinusoidal intensity function $ \lambda_2(t)$. We also assumed in the simulations that if the capital of the insurance company drops below zero, the company goes bankrupt, so the capital is set to zero and remains at this level thereafter. This is in agreement with Chapter 15.
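As an illustration of how such quantile lines can be obtained, the sketch below simulates trajectories of the risk process on a common time grid, absorbs each at zero upon ruin, and takes pointwise sample quantiles. We assume here a premium function of the form $ c(t) = (1+\theta)\,\mu\,\mathop {\textrm {E}} (N_t)$ for the NHPP; the claim sampler and the mean value function are placeholders to be supplied by the reader:

\begin{verbatim}
import numpy as np

def risk_trajectory(u, theta, mu, arrivals, claims, grid, mean_value):
    """One trajectory of R(t) = u + c(t) - sum of claims up to t on a time
    grid, with c(t) = (1 + theta) * mu * E(N_t) and absorption at ruin."""
    premium = (1.0 + theta) * mu * mean_value(grid)
    paid = np.array([claims[arrivals <= t].sum() for t in grid])
    R = u + premium - paid
    ruined = np.flatnonzero(R < 0)
    if ruined.size:              # bankrupt: capital stays at zero thereafter
        R[ruined[0]:] = 0.0
    return R

def quantile_lines(trajectories, probs):
    """Pointwise sample p-quantiles across a set of simulated trajectories."""
    return np.quantile(np.asarray(trajectories), probs, axis=0)

# probs = [0.001, 0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99, 0.999]
# lines = quantile_lines([risk_trajectory(...) for _ in range(3000)], probs)
\end{verbatim}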

The claim severity distribution of the PCS dataset was studied in Chapter 13. The Burr distribution with parameters $ \alpha= 0.4801$, $ \lambda= 3.9495\cdot 10^{16}$, and $ \tau= 2.1524$ yielded the best fit. Unfortunately, this choice of parameters leads to an undesired feature of the claim size distribution: very heavy tails of order $ x^{-\alpha\tau}\approx x^{-1.03}$. Although the expected value exists, the sample mean is, in general, significantly below the theoretical value. As a consequence, the premium function $ c(t)$ cannot include the factor $ \mu=\mathrm{E}(X_k)$, or the risk process trajectories would exhibit a strong positive drift. To cope with this problem, in the simulations we substitute the original factor $ \mu$ with $ \tilde\mu$, equal to the empirical mean of the simulated claims over all trajectories. Despite this change the trajectories possess a positive drift, due to the large value of the relative safety loading $ \theta$. They are also highly volatile, leading to a large number of ruins: the 0.05-quantile line drops to zero after five years, see the left panel in Figure 14.4. It seems that the Burr distribution overestimates the PCS losses.
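Sampling from this Burr law is convenient via the inverse transform, assuming the parameterization $ F(x) = 1 - \{\lambda/(\lambda + x^\tau)\}^\alpha$ used in Chapter 13 (consistent with the tail order $ x^{-\alpha\tau}$ quoted above); the empirical mean of a large simulated sample then provides the substitute $ \tilde\mu$:

\begin{verbatim}
import numpy as np

def burr_rvs(alpha, lam, tau, size, rng):
    """Inverse-transform sampling from the Burr cdf
    F(x) = 1 - {lam / (lam + x^tau)}^alpha."""
    u = rng.uniform(size=size)
    return (lam * ((1.0 - u) ** (-1.0 / alpha) - 1.0)) ** (1.0 / tau)

rng = np.random.default_rng(2)
claims = burr_rvs(0.4801, 3.9495e16, 2.1524, size=100_000, rng=rng)
mu_tilde = claims.mean()   # empirical substitute for E(X_k) in c(t)
\end{verbatim}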

In our second attempt we simulate the NHPP with log-normal claims with $ \mu= 18.3806$ and $ \sigma= 1.1052$, as the log-normal law was found in Chapter 13 to yield a relatively good fit to the data. The results, shown in the right panel of Figure 14.4, are not satisfactory. This time the analytical distribution largely underestimates the loss data: the ``real'' risk process lies well outside the 0.001-quantile line. This leads us to the conclusion that none of the analytical loss distributions describes the data well enough. We either overestimate the risk using the Burr distribution or underestimate it with the log-normal law. Hence, in our next attempt we simulate the NHPP with claims generated from the edf, see the bottom panel in Figure 14.4. The factor $ \mu$ in the premium function $ c(t)$ is set to the empirical mean. This time the ``real'' risk process lies close to the median and does not cross the lower and upper quantile lines. This approach seems to give the best results. However, we have to remember that it has its shortcomings. For example, the model is tailor-made for the dataset at hand and is not universal: as the dataset is expanded with new losses, the model may change substantially. An analytic model would, in general, be less susceptible to such modifications. Hence, it might be preferable to use the Burr distribution after all.
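Generating claims ``from the edf'' amounts to resampling the historical loss amounts with replacement (a bootstrap); a one-line sketch, with \texttt{historical\_losses} and \texttt{n\_claims} as placeholders:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
# historical_losses = ...  # placeholder: inflation-adjusted PCS loss amounts
# claims = rng.choice(historical_losses, size=n_claims, replace=True)
# mu_emp = historical_losses.mean()   # the factor mu in the premium c(t)
\end{verbatim}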


14.3.2 Danish Fire Losses

We conduct empirical studies for Danish fire losses recorded by Copenhagen Re. The data concern major Danish fire losses in Danish Kroner (DKK) that occurred between 1980 and 1990 and are adjusted for inflation. Only losses of profits connected with the fires are taken into consideration, see Chapter 13 and Burnecki and Weron (2004), where this dataset was also analyzed.

We start the analysis with a HPP with a constant intensity $ \lambda_3$. Studies of the quarterly numbers of losses and of the inter-occurrence times of the fires lead us to the conclusion that the HPP with the annual intensity $ \lambda_3=57.72$ gives the best fit. However, as we can see in the right panel of Figure 14.5, the fit is not very good, suggesting that the HPP is too simplistic and prompting us to consider the NHPP. In fact, a renewal process would also give unsatisfactory results, as the data reveals a clear increasing trend in the number of quarterly losses, see the left panel in Figure 14.5. We tested different exponential and polynomial functional forms, but a simple linear intensity function $ \lambda_4(s) = c+ds$ gives the best fit. Applying the least squares procedure we arrive at the following values of the parameters: $ c=13.97$ and $ d=7.57$. Processes with both choices of the intensity function, $ \lambda_3$ and $ \lambda_4(s)$, are illustrated in the right panel of Figure 14.5, where the accumulated number of fire losses and the mean value functions for all 11 years of data are depicted.
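For a linear intensity the mean value function is $ \mathop {\textrm {E}} (N_t) = ct + dt^2/2$, so the least squares fit reduces to fitting this parabola to the accumulated number of losses. A sketch using scipy (the arrays holding the event times and the running loss count are placeholders, as are the starting values):

\begin{verbatim}
import numpy as np
from scipy.optimize import curve_fit

def mean_value_linear(t, c, d):
    """Mean value function of a NHPP with linear intensity lambda4(s) = c + d*s:
    E(N_t) = integral of (c + d*s) ds = c*t + d*t^2/2."""
    return c * t + 0.5 * d * t ** 2

# event_times = ...  # placeholder: fire occurrence times in years since 1980
# counts = np.arange(1, len(event_times) + 1)   # accumulated number of losses
# (c_hat, d_hat), _ = curve_fit(mean_value_linear, event_times, counts,
#                               p0=[10.0, 5.0])
# The chapter reports c = 13.97 and d = 7.57 for this dataset.
\end{verbatim}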

Figure 14.5: Left panel: The quarterly number of losses for the Danish fire data. Right panel: The aggregate quarterly number of losses of the Danish fire data (dashed blue line) together with the mean value function $ \mathop {\textrm {E}} (N_t)$ of the calibrated HPP (solid black line) and the NHPP (dotted red line). Clearly the latter model gives a better fit to the empirical data.
\includegraphics[width=.7\defpicwidth]{STFrisk06a.ps} \includegraphics[width=.7\defpicwidth]{STFrisk06b.ps}

Figure 14.6: The Danish fire data simulation results for a NHPP with log-normal claim sizes (left panel), a NHPP with Burr claim sizes (right panel), and a NHPP with claims generated from the edf (bottom panel). The dotted lines are the sample 0.001, 0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99, 0.999-quantile lines based on 3000 trajectories of the risk process.
\includegraphics[width=.7\defpicwidth]{STFrisk07a.ps} \includegraphics[width=.7\defpicwidth]{STFrisk07b.ps} \includegraphics[width=.7\defpicwidth]{STFrisk07c.ps}

After describing the claim arrival process we have to find an appropriate model for the loss amounts. In Chapter 13 a number of distributions were fitted to the loss sizes. The log-normal distribution with parameters $ \mu=12.6645$ and $ \sigma=1.3981$ produced the best results. The Burr distribution with $ \alpha=0.8804$, $ \lambda=8.4202\cdot 10^6$, and $ \tau=1.2749$ overestimated the tails of the empirical distribution; nevertheless, it gave the next best fit.

The simulation results are presented in Figure 14.6. We consider a hypothetical scenario where the insurance company insures losses resulting from fire damage. The company's initial capital is assumed to be $ u = 400$ million DKK and the relative safety loading used is $ \theta=0.5$. We choose the two models of the risk process whose application is most justified by the statistical results described above: a NHPP with log-normal claim sizes and a NHPP with Burr claim sizes. For comparison we also present the results of a model incorporating the empirical distribution function. Recall that in this model the factor $ \mu$ in the premium function $ c(t)$ is set to the empirical mean.

In all panels of Figure 14.6 the thick solid blue line is the ``real'' risk process, i.e. a trajectory constructed from the historical arrival times and values of the losses. The different shapes of the ``real'' risk process in the subplots are due to the different forms of the premium function $ c(t)$, which has to be chosen according to the type of the claim arrival process. The dashed red line is a sample trajectory. The dotted lines are the sample 0.001, 0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99, 0.999-quantile lines based on 3000 trajectories of the risk process. As in the PCS data case, we assume that if the capital of the insurance company drops below zero, the company goes bankrupt, so the capital is set to zero and remains at this level thereafter.

Clearly, if the claim severities are Burr distributed then extreme events are more likely than in the log-normal case, for which the historical trajectory falls outside the 0.001-quantile line. The overall picture is, in fact, similar to the one obtained for the PCS data: we either overestimate the risk using the Burr distribution or underestimate it with the log-normal law. With the empirical approach the ``real'' risk process lies close to the median and does not cross the very low or very high quantile lines. However, as stated previously, the empirical approach has its shortcomings. Since this time the log-normal law only slightly undervalues the risk, it might be advisable to use it for further modeling.