13.5 Fitting GP Distributions to the Upper Tail

The GP approach amounts to a local, parametric modeling and estimation of the upper tail of the underlying distribution. Using the estimation procedures in Section 13.6, one may fit a GP distribution in two steps.

Firstly, one estimates a GP distribution, say $ W_{\widetilde\gamma,t,\widetilde\sigma}$, within the GP submodel $ \{W_{\gamma,t,\sigma}: \gamma \in \mathbb{R}, \sigma>0\}$, based on the exceedances $ y_1,\dots,y_k$ over a selected threshold $ t$. Notice that the location parameter is equal to the truncation point $ t$, which is also the left endpoint of the estimated GP distribution. The estimated GP df, density, quantile function and mean excess function can be fitted to the empirical df, density, quantile function and mean excess function based on the exceedances $ y_j$.
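
The first step can be illustrated by a minimal sketch. It assumes the parametrization $ W_{\gamma,t,\sigma}(x) = 1-(1+\gamma(x-t)/\sigma)^{-1/\gamma}$ and uses a simple method-of-moments fit, which is only a stand-in chosen for brevity and not necessarily one of the procedures in Section 13.6. The function name `fit_gp_moments` is hypothetical, and the estimator is only meaningful for $ \gamma < 1/2$, where the variance is finite.

```python
from statistics import mean, variance


def fit_gp_moments(exceedances, t):
    """Method-of-moments fit of a GP distribution with location t.

    Assumption: W(x) = 1 - (1 + gamma*(x - t)/sigma)**(-1/gamma) with
    gamma < 1/2, so that mean = sigma/(1 - gamma) and
    variance = sigma**2 / ((1 - gamma)**2 * (1 - 2*gamma)).
    """
    excesses = [y - t for y in exceedances]
    m = mean(excesses)
    s2 = variance(excesses)      # sample variance, divisor k - 1
    ratio = m * m / s2           # equals 1 - 2*gamma in the model
    gamma = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    return gamma, sigma


# Illustration with artificial exceedances spread evenly over [t, t + 2];
# this is the uniform case gamma = -1, sigma = 2 of the GP family.
t = 1.13
k = 1000
ys = [t + 2.0 * (i + 0.5) / k for i in range(k)]
gamma_tilde, sigma_tilde = fit_gp_moments(ys, t)
```

The location parameter is not estimated; it is fixed at the threshold $ t$, in line with the remark above.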

Secondly, a fit to the original data $ x_1, \dots,x_n$ is achieved by selecting location and scale parameters $ \widehat\mu$ and $ \widehat\sigma$ such that

$\displaystyle W_{\widehat\gamma,\widehat\mu,\widehat\sigma}^{[t]}=W_{\widetilde\gamma,t,\widetilde\sigma}
$

and

$\displaystyle W_{\widehat\gamma,\widehat\mu,\widehat\sigma}(t)= {n-k\over n}\,,
$

where $ n$ is the total sample size and $ k=\sum_{i=1}^n I(t < x_i)$ is the number of exceedances above $ t$. One gets a GP distribution with the same shape as the first one, yet the reparametrized df takes the value $ 1-k/n$ at the threshold $ t$ and thus assigns the mass $ k/n$ to the region above $ t$, just as the empirical distribution function of the $ x_i$. As a result, the reparametrized df, density, quantile function and mean excess function can be fitted to the upper part of the empirical df, density, quantile function and mean excess function based on the original data $ x_i$. Explicitly, one gets the estimators

$\displaystyle \widehat\gamma = \widetilde\gamma,
$

$\displaystyle \widehat\sigma = \widetilde\sigma \left({k \over n}\right)^{\widetilde\gamma},
$

$\displaystyle \widehat\mu = t - {\widetilde\sigma - \widehat\sigma \over \widetilde\gamma}.
$
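
The reparametrization can be checked numerically. The following sketch assumes the parametrization $ W_{\gamma,\mu,\sigma}(x) = 1-(1+\gamma(x-\mu)/\sigma)^{-1/\gamma}$ with $ \gamma \ne 0$; the function names `gp_cdf` and `reparametrize` and the numerical values are hypothetical and serve only as an illustration.

```python
def gp_cdf(x, gamma, mu, sigma):
    """GP df W_{gamma,mu,sigma}(x) for gamma != 0 (assumed parametrization)."""
    if x <= mu:
        return 0.0
    z = 1.0 + gamma * (x - mu) / sigma
    if z <= 0.0:                 # beyond the right endpoint (gamma < 0)
        return 1.0
    return 1.0 - z ** (-1.0 / gamma)


def reparametrize(gamma_tilde, t, sigma_tilde, k, n):
    """Map the exceedance fit (gamma~, t, sigma~) to (gamma^, mu^, sigma^).

    Not defined for gamma_tilde == 0, where the formula for mu^ divides by 0.
    """
    gamma_hat = gamma_tilde
    sigma_hat = sigma_tilde * (k / n) ** gamma_tilde
    mu_hat = t - (sigma_tilde - sigma_hat) / gamma_tilde
    return gamma_hat, mu_hat, sigma_hat


# Hypothetical values: fit on k = 50 exceedances out of n = 500 observations.
gamma_tilde, t, sigma_tilde, k, n = 0.25, 1.13, 0.5, 50, 500
gamma_hat, mu_hat, sigma_hat = reparametrize(gamma_tilde, t, sigma_tilde, k, n)

# Second condition: the reparametrized df equals (n - k)/n at the threshold.
value_at_t = gp_cdf(t, gamma_hat, mu_hat, sigma_hat)

# First condition: truncating the reparametrized df at t recovers the
# exceedance fit, checked here at a single point x > t.
x = 2.0
truncated = (gp_cdf(x, gamma_hat, mu_hat, sigma_hat) - value_at_t) / (1.0 - value_at_t)
exceedance_fit = gp_cdf(x, gamma_tilde, t, sigma_tilde)
```

With these inputs, `value_at_t` agrees with $ (n-k)/n = 0.9$ and `truncated` agrees with `exceedance_fit` up to rounding error.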

Figure 13.1 illustrates this procedure. The left-hand plot shows the empirical df (solid line) based on the exceedances above the threshold $ 1.13$ and a fitted GP df (dotted). The right-hand plot shows the empirical df of the original data set and the reparametrized GP df that is fitted to the upper tail.

Figure 13.1: Fitting a GP df to the exceedance df above $ t=1.13$ (left) and fitting a reparametrized GP df to the upper tail of the empirical df of the complete sample (right).
\includegraphics[scale=0.425]{xtrfig1}

Within the GP1-submodel of Pareto dfs $ W_{1,\alpha,0,\sigma}$ with location parameter $ \mu=0$, we have

$\displaystyle W_{1,\alpha,0,\sigma}^{[t]} = W_{1,\alpha,0,t}.
$

After having estimated the shape parameter $ \alpha$, based on the exceedances above the threshold $ t$, one obtains the required scale parameter $ \sigma$ from the relation

$\displaystyle W_{1,\alpha,0,\sigma}(t) = {n-k \over n},
$

which yields the estimator

$\displaystyle \widehat\sigma = t \left({k \over n}\right)^{1/\widehat\alpha}.
$
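
A minimal sketch of the Pareto case, assuming the parametrization $ W_{1,\alpha,0,\sigma}(x) = 1-(x/\sigma)^{-\alpha}$ for $ x \ge \sigma$. As one possible estimator of $ \alpha$, the sketch uses the maximum likelihood estimator in the model $ W_{1,\alpha,0,t}$ (the Hill estimator); this choice, the function names and the numerical values are assumptions made for illustration.

```python
import math


def hill_alpha(exceedances, t):
    """MLE of alpha in the Pareto model W_{1,alpha,0,t} (Hill estimator)."""
    return len(exceedances) / sum(math.log(y / t) for y in exceedances)


def pareto_scale(t, k, n, alpha):
    """Scale estimator sigma^ = t * (k/n)**(1/alpha)."""
    return t * (k / n) ** (1.0 / alpha)


def pareto_cdf(x, alpha, sigma):
    """Pareto df W_{1,alpha,0,sigma}(x) (assumed parametrization)."""
    return 1.0 - (x / sigma) ** (-alpha) if x > sigma else 0.0


# Hypothetical values: threshold t = 2 with k = 50 exceedances out of n = 500.
t, k, n, alpha_hat = 2.0, 50, 500, 2.0
sigma_hat = pareto_scale(t, k, n, alpha_hat)

# The fitted Pareto df takes the value (n - k)/n at the threshold.
value_at_t = pareto_cdf(t, alpha_hat, sigma_hat)
```

Here `sigma_hat` is smaller than $ t$, so the fitted Pareto df extends below the threshold, although only its part above $ t$ is fitted to the data.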

In our implementation, one has to select the number $ k$ of exceedances; the threshold $ t=x_{n-k+1:n}$ is then used.
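
The threshold choice can be sketched as follows, reading $ x_{n-k+1:n}$ as the $ k$-th largest observation. The function name `threshold_from_k` and the data are hypothetical; the sketch only computes the order statistic and leaves open how ties and the value $ t$ itself are counted among the exceedances.

```python
def threshold_from_k(data, k):
    """Return the order statistic x_{n-k+1:n}, i.e. the k-th largest value."""
    xs = sorted(data)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return xs[n - k]             # 0-based index n - k is x_{n-k+1:n}


# Hypothetical data set with n = 8 observations and k = 3.
data = [3, 1, 4, 1, 5, 9, 2, 6]
t = threshold_from_k(data, 3)
```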