13.6 Parametric Estimators for GP Models

We describe the estimators for the GP models that are implemented in XploRe. The output always concerns the reparametrized GP or Pareto (GP1) distributions introduced in Section 13.5.


13.6.1 Moment Estimator


{gamma, mu, sigma} = momentgp (x, k)
applies the moment estimator to the k largest order statistics of the vector x and returns the estimated shape, location and scale parameters of the GP model

The moment estimator (Dekkers, Einmahl and de Haan; 1989) for the shape parameter $ \gamma$ in the von Mises parameterization, based on the $ k$ largest values of the sample, is given by

$\displaystyle \widehat\gamma_{n,k} = \widehat\gamma_{1,n,k} + \widehat\gamma_{2,n,k},
$

where

$\displaystyle \widehat\gamma_{1,n,k}=l_{1,k}
$

and

$\displaystyle \widehat\gamma_{2,n,k}=1-{1\over 2}\left(
1-{l_{1,k}^2 \over l_{2,k}}\right)^{-1}
$

with

$\displaystyle l_{j,k}={1\over k-1} \sum_{i=1}^{k-1}
\left(\log {x_{n-i+1:n} \over x_{n-k+1:n}}\right)^j, \quad j=1,2.
$

The scale parameter $ \widetilde\sigma$ is estimated by fitting a least squares line to the GP QQ-plot under the estimated shape parameter $ \widehat\gamma_{n,k}$, i.e. to the points

$\displaystyle \left(W_{\widehat\gamma_{n,k}}^{-1}\left({i\over k+1}\right), y_{i:k}\right),
\quad i=1,\dots,k,
$

where $ y_{1:k} \le \dots \le y_{k:k}$ are the ordered $ k$ largest values of the data set $ x_1, \dots, x_n.$ The location parameter is set equal to the threshold $ t=x_{n-k+1:n}$, and the transformation to the tail described in Section 13.5 is applied. As an example, we simulate a sample of 500 data points under $ W_{1,1,2}$ and apply the moment estimator, based on all values of the sample:
  x = 1 + 2 * gpdata (1, 500)
  momentgp (x, 500)
Then, XploRe displays in its output window
  Contents of _tmp.gamma
  [1,]   1.0326
  Contents of _tmp.mu
  [1,]   1.0024
  Contents of _tmp.sigma
  [1,]   1.9743
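
The least squares fit to the GP QQ-plot can be illustrated by a short Python sketch. It assumes that the standard GP df in the von Mises parameterization is $ W_\gamma(x)=1-(1+\gamma x)^{-1/\gamma}$ (with the exponential limit for $ \gamma=0$); Section 13.5 is not reproduced here, so this form is an assumption. The names `gp_quantile` and `qq_line` are hypothetical, and the transformation to the tail is not included.

```python
import math


def gp_quantile(q, gamma):
    """Quantile function of the standard GP df (assumed von Mises form)."""
    if gamma == 0.0:
        return -math.log(1.0 - q)
    return ((1.0 - q) ** (-gamma) - 1.0) / gamma


def qq_line(y, gamma):
    """Least squares line through the points (W_gamma^{-1}(i/(k+1)), y_{i:k}).

    Returns (intercept, slope); the slope plays the role of the scale estimate.
    """
    ys = sorted(y)
    k = len(ys)
    if k < 2:
        raise ValueError("need at least two values")
    qs = [gp_quantile(i / (k + 1), gamma) for i in range(1, k + 1)]
    q_mean = sum(qs) / k
    y_mean = sum(ys) / k
    sxx = sum((q - q_mean) ** 2 for q in qs)
    sxy = sum((q - q_mean) * (v - y_mean) for q, v in zip(qs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * q_mean
    return intercept, slope
```

If the values are exact quantiles of a GP distribution with the given shape, the fitted line recovers the location as intercept and the scale as slope; for simulated data the fit is only approximate.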


13.6.2 ML Estimator in the GP Model


{gamma, mu, sigma} = mlegp (x, k)
applies the ML estimator in the GP model to the k largest order statistics of the vector x and returns the estimated shape, location and scale parameters of the GP model

The maximum likelihood estimator in the GP model is numerically evaluated by using an iteration procedure. The moment estimator described in Subsection 13.6.1 serves as an initial value.

The remarks about the ML estimator in the EV model (see Subsection 13.4.2) also apply to this estimator.


13.6.3 Pickands Estimator


{gamma, mu, sigma} = pickandsgp (x, k)
applies the Pickands estimator to the k largest order statistics of the vector x and returns the estimated shape, location and scale parameters of the GP model

The Pickands estimator (Pickands; 1975) of the shape parameter $ \gamma$ is given by

$\displaystyle \widehat\gamma_n(m)=\log\left(
{x_{n-m+1:n} - x_{n-2m+1:n} \over x_{n-2m+1:n} - x_{n-4m+1:n}}
\right) / \log 2,
$

where $ m=[k/4].$ This construction is similar to the one for the LRS estimator for the EV model described in Subsection 13.4.1. The scale and location parameters are estimated as described in Subsection 13.6.1.
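
A minimal Python sketch of the shape estimate follows. The function name `pickands_shape` is hypothetical and is not the `pickandsgp` quantlet; the sketch assumes $ 4m \le n$ and distinct values at the three order statistics involved.

```python
import math


def pickands_shape(x, k):
    """Pickands estimate of the shape parameter with m = [k/4]."""
    xs = sorted(x)
    n = len(xs)
    m = k // 4
    if m < 1 or 4 * m > n:
        raise ValueError("need 1 <= [k/4] and 4*[k/4] <= len(x)")
    # 1-based order statistic x_{j:n} is xs[j - 1]
    upper = xs[n - m]          # x_{n-m+1:n}
    middle = xs[n - 2 * m]     # x_{n-2m+1:n}
    lower = xs[n - 4 * m]      # x_{n-4m+1:n}
    return math.log((upper - middle) / (middle - lower)) / math.log(2.0)
```

Because the estimate depends only on differences of order statistics, it is unchanged when a constant is added to the data.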


13.6.4 Drees-Pickands Estimator


{gamma, mu, sigma} = dpgp (x, k)
applies the Drees-Pickands estimator to the k largest order statistics of the vector x and returns the estimated shape, location and scale parameters of the GP model

A refinement of the Pickands estimator was introduced by Drees (1995). It uses a convex combination of Pickands estimates

$\displaystyle \widehat\gamma_{n,k}=\sum_{i=0}^\infty \widehat c_{i,n} \widehat\gamma_n([m2^{-i}]+1)
$

with $ m=[k/4]$ and

$\displaystyle \ \widehat c_{i,n} = c_{i,n}(\widehat\gamma_n(m)),
$

where

\begin{displaymath}
c_{i,n}(\gamma) = \left\{
\begin{array}{ll}
{2^{\gamma+1}-1 ...
... 0 \\
\ & \\
(i+1)2^{-(i+2)} & \gamma = 0
\end{array}\right.
\end{displaymath}

if $ \gamma\ge -1/2$ and

$\displaystyle c_{i,n}(\gamma) = c_{i,n}(-(1+\gamma))
$

if $ \gamma < -1/2.$


13.6.5 Hill Estimator


{alpha, sigma} = hillgp1 (x, k)
applies the Hill estimator to the k largest order statistics of the vector x and returns the estimated shape and scale parameters of the GP1 model

The celebrated Hill estimator is a maximum likelihood estimator for the GP1 submodel of Pareto dfs $ W_{1,\alpha,0,t}$ with left endpoint $ t$. It is given by

$\displaystyle \widehat\alpha_{n,k}=\left(
{1 \over k-1} \sum_{i=1}^{k-1} \log {x_{n-i+1:n} \over x_{n-k+1:n}}
\right)^{-1}.
$

Recall that $ t=x_{n-k+1:n}$ is used as the threshold. Notice that $ W_{1,\alpha,\mu,t-\mu}$ are further Pareto dfs with left endpoint equal to $ t$. For $ \mu=0$ we are in the above-mentioned submodel. For $ \mu \ne 0$ the Hill estimator can be inaccurate. Therefore, one should be cautious when the Hill estimator is applied to real data.
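
The formula for the Hill estimate translates directly into a few lines of Python. The function name `hill_alpha` is hypothetical and is not the `hillgp1` quantlet; the sketch assumes positive upper order statistics that are not all equal.

```python
import math


def hill_alpha(x, k):
    """Hill estimate of the Pareto shape parameter from the k largest values."""
    if not 2 <= k <= len(x):
        raise ValueError("need 2 <= k <= len(x)")
    top = sorted(x)[-k:]        # k largest values in ascending order
    threshold = top[0]          # t = x_{n-k+1:n}
    if threshold <= 0:
        raise ValueError("the k largest values must be positive")
    mean_log = sum(math.log(v / threshold) for v in top[1:]) / (k - 1)
    return 1.0 / mean_log
```

Adding a positive constant to positive data shrinks every log ratio and therefore increases the estimate, which illustrates why a location parameter $ \mu \ne 0$ can distort the result.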


13.6.6 ML Estimator for Exponential Distributions


{mu, sigma} = mlegp0 (x, k)
applies the ML estimator for the exponential distributions (GP0) to the k largest order statistics of the vector x and returns the estimated location and scale parameters of the GP0 model

The maximum likelihood estimator for the exponential distributions, based on the $ k$ largest values, is given by


$\displaystyle \widehat\sigma_{n,k} = {1\over k-1} \sum_{i=1}^{k-1} \left(x_{n-i+1:n} - x_{n-k+1:n}\right),
$

$\displaystyle \widehat\mu_{n,k} = x_{n-k+1:n} - \widehat\sigma_{n,k}\log(n/k).
$
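
The two formulas can be evaluated as in the following Python sketch. The function name `exp_tail_mle` is hypothetical and is not the `mlegp0` quantlet.

```python
import math


def exp_tail_mle(x, k):
    """Location and scale estimates of the exponential (GP0) model
    based on the k largest values of x. Returns (mu, sigma)."""
    n = len(x)
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= len(x)")
    top = sorted(x)[-k:]        # k largest values in ascending order
    threshold = top[0]          # x_{n-k+1:n}
    # mean excess of the k-1 largest values over the threshold
    sigma = sum(v - threshold for v in top[1:]) / (k - 1)
    mu = threshold - sigma * math.log(n / k)
    return mu, sigma
```

For example, with the eight values 0, 1, ..., 7 and `k = 4`, the threshold is 4, the mean excess of 5, 6, 7 over 4 gives a scale of 2, and the location is $ 4 - 2\log 2$.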


13.6.7 Selecting a Threshold by Means of a Diagram


r = momentgpdiag (x, k)
evaluates the moment estimator for each number of extremes given in the vector k
r = mlegpdiag (x, k)
evaluates the MLE of the GP model for each number of extremes given in the vector k
r = pickandsgpdiag (x, k)
evaluates the Pickands estimator for each number of extremes given in the vector k
r = dpgpdiag (x, k)
evaluates the Drees-Pickands estimator for each number of extremes given in the vector k
r = hillgp1diag (x, k)
evaluates the Hill estimator for each number of extremes given in the vector k

A visual tool to facilitate the selection of a threshold $ t$ (or, likewise, the number of upper extremes) is the diagram of estimates $ k \to \widehat\alpha_{n,k}$ or $ k \to \widehat\gamma_{n,k}.$ For small values of $ k$ one recognizes a high random fluctuation of the estimator, while for large values of $ k$, there is typically a bias due to a model deviation. Within an intermediate range, the estimates often stabilize around a value which gives a hint for the selection of the number of extremes. Of course, one should also apply QQ-plots and empirical mean excess functions to justify the choice of the threshold. In the statistical literature one can also find the advice to take the upper ten percent of a sample. Hydrologists take the 3-4 highest flood peaks in a water year. The automatic choice of a threshold is presently a hot research topic.

A diagram option is provided for each estimator of the shape parameter of a GP distribution. The corresponding calls are listed above. These quantlets return a vector with the estimates for each value of $ k$. Thus, to plot a diagram of the Hill estimates based on a simulated Fréchet data set with shape parameter $ \alpha=1$, one can use the commands

  x = ev1data (1, 500)
  line (2:500~hillgp1diag (x, 2:500))
The output of the above commands is shown in Figure 13.2. One can see that after a strong fluctuation for small values of $ k$ the estimates are close to the true parameter $ \alpha=1$. Moreover, a bias for large values of $ k$ is visible.

Figure 13.2: Diagram of Hill estimates applied to Fréchet data $ G_{1,1}$.
\includegraphics[scale=0.425]{xtrfig3}
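
A comparable diagram can be computed outside XploRe with the following Python sketch. The names `simulate_frechet` and `hill_diagram` are hypothetical and do not correspond to the quantlets `ev1data` and `hillgp1diag`. The Fréchet sample is generated by inverting $ G_{1,\alpha}(x)=\exp(-x^{-\alpha})$, and the sketch returns the pairs $ (k, \widehat\alpha_{n,k})$ instead of drawing them.

```python
import math
import random


def simulate_frechet(alpha, n, seed=0):
    """Draw n Frechet(alpha) values by inverting G(x) = exp(-x^(-alpha))."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        u = rng.random()            # uniform on [0, 1)
        if u > 0.0:                 # skip u = 0 to avoid log(0)
            sample.append((-math.log(u)) ** (-1.0 / alpha))
    return sample


def hill_diagram(x, ks):
    """Return (k, Hill estimate) for every k in ks, with 2 <= k <= len(x)."""
    desc = sorted(x, reverse=True)
    logs = [math.log(v) for v in desc]      # requires positive data
    prefix = [0.0]                          # prefix[j] = sum of the j largest logs
    for value in logs:
        prefix.append(prefix[-1] + value)
    diagram = []
    for k in ks:
        # mean log ratio of the k-1 largest values to the k-th largest value
        mean_log = prefix[k - 1] / (k - 1) - logs[k - 1]
        diagram.append((k, 1.0 / mean_log))
    return diagram


if __name__ == "__main__":
    x = simulate_frechet(1.0, 500, seed=1)
    for k, alpha_hat in hill_diagram(x, [10, 50, 100, 250, 500]):
        print(k, round(alpha_hat, 3))
```

The prefix sums avoid recomputing the sum of logarithms for every $ k$, so the whole diagram costs one sort plus a single pass over the data.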