6.3 Computing GPLM Estimates

Currently six types of distributions are supported by the gplm quantlib: Binomial, Normal (Gaussian), Poisson, Gamma (includes Exponential), Inverse Gaussian and Negative Binomial (includes Geometric). Table 6.2 summarizes the models which are available.

Table 6.2: Supported models.

Distribution    Model Code   Link Function
Gaussian        "noid"       identity link (canonical)
                "nopow"      power link
Binomial        "bilo"       logistic link (logit, canonical)
                "bipro"      Gaussian link (probit)
                "bicll"      complementary log-log link
Poisson         "polog"      logarithmic link (canonical)
                "popow"      power link
Gamma           "gacl"       reciprocal link (canonical)
                "gapow"      power link
Inv. Gaussian   "igcl"       squared reciprocal link (canonical)
                "igpow"      power link
Neg. Binomial   "nbcl"       canonical link
                "nbpow"      power link


The quantlet in the gplm quantlib which is mainly responsible for GPLM estimation is gplmest.


6.3.1 Estimation


g = gplmest (code, x, t, y, h {, opt})
estimates a GPLM

The quantlet gplmest provides a convenient way to estimate a GPLM. The standard call is quite simple, for example

  g=gplmest("bipro",x,t,y,h)
estimates a probit model (binomial with Gaussian cdf link). The short code of the model (here "bipro") must be passed to gplmest; it is the same short code as used by the glm quantlib. In addition to the data, a bandwidth parameter h must be given, either as a vector whose length matches the dimension of t or as a single scalar.
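
For orientation, the model fitted by this call can be written out explicitly. The notation below (link function $ G$, standard normal cdf $ \Phi$, parameter vector $ \beta$) is an assumption chosen to agree with the symbol $ m(\bullet)$ used in this section; it is not quoted from the quantlet documentation.

$$ E(Y \mid X, T) = G\{X^\top \beta + m(T)\}, \qquad \textrm{and for "bipro":}\quad P(Y=1 \mid X, T) = \Phi\{X^\top \beta + m(T)\}. $$

In this reading, the output component g.b described below is the estimate of $ \beta$ and g.m is the estimate of $ m(\bullet)$.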

The result of the estimation is assigned to the variable g which is a list containing the following output:

g.b
the estimated parameter vector
g.bv
the estimated covariance of g.b
g.m
the estimated nonparametric function
g.stat
contains the statistics (see Section 6.5).

Recalling our credit scoring example from Subsection 6.2.2, the estimation with a logit link would be done as follows:

  t=log(t)               ; logs of amount and age
  trange=max(t)-min(t)
  t=(t-min(t))./trange   ; transformation to [0,1]
  library("gplm")

  h=0.4
  g=gplmest("bilo",x,t,y,h)
  g.b
XAGgplm02.xpl

Now we can inspect the estimated coefficients in g.b:
  Contents of b
  [1,]  0.96516 
  [2,]  0.74628 
  [3,] -0.049835
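
The preprocessing at the top of the listing (logs, then rescaling each column of t to [0,1]) can be illustrated outside XploRe. The following is a minimal Python sketch, not XploRe code; it assumes that max(t) and min(t) act column by column, and the function name log_minmax_columns and the sample values are hypothetical.

```python
import math

def log_minmax_columns(rows):
    """Take logs of every entry, then rescale each column to [0, 1]."""
    logged = [[math.log(v) for v in row] for row in rows]
    cols = list(zip(*logged))
    lows = [min(c) for c in cols]
    # Assumption: no column is constant, otherwise the range would be zero.
    ranges = [max(c) - min(c) for c in cols]
    return [[(v - lo) / r for v, lo, r in zip(row, lows, ranges)]
            for row in logged]

# Hypothetical (amount, age) pairs, invented for illustration only.
t = [[1000.0, 25.0], [5000.0, 40.0], [2500.0, 60.0]]
scaled = log_minmax_columns(t)
```

Rescaling both components to the same interval is what makes a single bandwidth such as h=0.4 meaningful for both columns of t.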

A graphical output can be created by calling

  gplmout("bilo",x,t,y,h,g.b,g.bv,g.m,g.stat)
XAGgplm02.xpl

for the current example (cf. Figure 6.1). For more features of gplmout see Subsections 6.4.7 and 6.5.2.

Figure 6.1: GPLM output display.
\includegraphics[scale=0.58]{gplmoutput}

Optional parameters are passed to gplmest as a list. Section 6.4, which deals with the quantlet gplmopt, describes the available options in detail. For example, set:

  opt=gplmopt("meth",1,"shf",1)
  opt=gplmopt("xvars",xvars,opt)
  opt=gplmopt("tg",grid(0|0,0.05|0.05,21|21),opt)
XAGgplm03.xpl

This will create a list opt of optional parameters. In the first call, opt is created with the first component meth (estimation method) containing the value 1 (profile likelihood algorithm) and the second component shf (show iteration) set to 1 (``true''). In the second call, the variable names for the linear part of the model are appended to opt. Finally, a grid component tg (for the estimation of the nonparametric part) is defined.
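
To make the grid component concrete, the following Python sketch (not XploRe code) builds a regular two-dimensional grid. It assumes that the three arguments of grid are origin, step width and number of points per dimension, so that the call above yields 21 x 21 = 441 points covering [0,1] x [0,1]. The helper name make_grid is hypothetical, and the ordering of the points may differ from what XploRe produces.

```python
def make_grid(origin, step, count):
    """Return all points origin + k*step, k = 0..count-1, in each dimension."""
    points = [[]]
    for o, s, n in zip(origin, step, count):
        # Extend every partial point by each coordinate of this dimension.
        points = [p + [o + k * s] for p in points for k in range(n)]
    return points

tg = make_grid([0.0, 0.0], [0.05, 0.05], [21, 21])
```

A grid of this kind matches the range of t after the transformation to [0,1] used in the example.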

We repeat the estimation with these settings:

  g=gplmest("bilo",x,t,y,h,opt)
XAGgplm03.xpl

This instruction now uses the profile likelihood algorithm (in contrast to the default Speckman algorithm used in example XAGgplm02.xpl), shows the iterations in the output window, and estimates the function $ m(\bullet)$ on the grid tg. The output g now contains one more element:
g.mg
the estimated nonparametric function on the grid

Since the nonparametric function $ m(\bullet)$ is estimated on two-dimensional data, we can display a surface plot using the estimated function on the grid:

  library("plot")
  mg=setmask(sort(tg~g.mg),"surface")
XAGgplm03.xpl

Figure 6.2 shows this surface together with a scatterplot of amount and age. The scatterplot shows that the big peak of $ \widehat{m}(\bullet)$ is caused by only a few observations. For the complete XploRe code of this example check the file XAGgplm03.xpl.

Figure 6.2: Scatterplot for amount and age (left). Estimate $ \widehat{m}$ (right).
\includegraphics[scale=0.3]{gplmscatter} \includegraphics[scale=0.3]{gplmfunction}

The estimated coefficients are slightly different here, since we used the profile likelihood instead of the Speckman algorithm in this case. Figure 6.3 shows the output window for the second estimation.


6.3.2 Estimation in Expert Mode


g = gplmcore (code, x, t, y, h, wx, wt, wc, b0, m0, off, ctrl{, upb, tg, m0g})
estimates a GPLM in expert mode
The gplmcore quantlet is the innermost ``kernel'' of the GPLM estimation. It does not accept optional parameters in the usual form of an option list as described in Section 6.4, and it does not check for erroneous input. Hence, this routine is intended for use in expert mode. It speeds up computations and might be useful in simulations, Monte Carlo methods or pilot estimation for other procedures.

The following lines show how gplmcore could be used in our running example. Note that all data needs to be sorted by the first column of t.

  n=rows(x)
  p=cols(x)
  q=cols(t)

  tmp=sort(t~y~x)      ; sort data by first column of t
  t=tmp[,(1:q)]
  y=tmp[,(q+1)]
  x=tmp[,(q+2):cols(tmp)]

  shf   =  1      ; show iteration (1="true")
  miter = 10      ; maximal number of iterations
  cnv   =  0.0001 ; convergence criterion
  fscor =  0      ; Fisher scoring (1="true")
  pow   =  0      ; power for power link (if useful)
  nbk   =  1      ; k for neg. binomial (if useful)
  meth  =  0      ; algorithm ( -1 = backfitting,
                  ;              0 = Speckman
                  ;              1 = profile likelihood )
  ctrl=shf|miter|cnv|fscor|pow|nbk|meth

  wx    = 1       ; prior or frequency weights 
  wt    = 1       ; trimming weights for estimation of b
  wc    = 1       ; weights for the convergence criterion
  off   = 0       ; offset

  l=glmcore("bilo",x~t~matrix(n),y,wx,off,ctrl[1:6])
  b0=l.b[1:p]
  m0=l.b[p+q+1]+t*l.b[(p+1):(p+q)]

  h=0.4|0.4
  g=gplmcore("bilo",x,t,y,h,wx,wt,wc,b0,m0,off,ctrl)
XAGgplm04.xpl
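
Two steps of this listing are easy to get wrong: the joint sorting of t, y and x, and the construction of the start values b0 and m0 from the parametric fit on x~t~matrix(n). The following Python sketch (not XploRe code) mirrors both steps under the assumption that the fitted coefficient vector is ordered as p coefficients for x, then q coefficients for t, then the constant. The function names and all numbers are hypothetical and serve only as illustration.

```python
def sort_by_first_t(t, y, x):
    """Sort the rows of t, y and x jointly by the first column of t."""
    order = sorted(range(len(t)), key=lambda i: t[i][0])
    return ([t[i] for i in order],
            [y[i] for i in order],
            [x[i] for i in order])

def start_values(coef, t, p, q):
    """Split a parametric fit on (x, t, constant) into b0 and m0."""
    b0 = coef[:p]                 # coefficients of the linear part
    t_coef = coef[p:p + q]        # coefficients of the columns of t
    intercept = coef[p + q]       # constant term (last column of ones)
    m0 = [intercept + sum(tij * c for tij, c in zip(row, t_coef))
          for row in t]
    return b0, m0

# Hypothetical data and coefficients, invented for illustration only.
t = [[0.8, 0.1], [0.2, 0.5], [0.5, 0.9]]
y = [1, 0, 1]
x = [[1.0], [0.0], [1.0]]
t, y, x = sort_by_first_t(t, y, x)
coef = [0.5, 1.0, -2.0, 0.25]     # p=1 x-coefficient, q=2 t-coefficients, constant
b0, m0 = start_values(coef, t, p=1, q=2)
```

The start function m0 is thus a linear function of t, which the GPLM iteration then replaces by a nonparametric estimate.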

Optionally, gplmcore can estimate the function $ m(\bullet)$ on a grid, if tg and m0g are given. In addition, gplmcore can also be used to compute the biased parametric estimate which is needed for the specification test in Subsection 6.5.3. In this case the optional parameter upb should be set to 0 (default is 1).