6.3 Computing GPLM Estimates
The gplm quantlib currently supports six families of distributions:
Binomial, Normal (Gaussian), Poisson, Gamma (includes Exponential),
Inverse Gaussian and Negative Binomial (includes Geometric).
Table 6.2 summarizes the available models.
Table 6.2: Supported models.
Distribution | Model Code | Link Function
Gaussian | "noid" | identity link (canonical)
 | "nopow" | power link
Binomial | "bilo" | logistic link (logit, canonical)
 | "bipro" | Gaussian link (probit)
 | "bicll" | complementary log-log link
Poisson | "polog" | logarithmic link (canonical)
 | "popow" | power link
Gamma | "gacl" | reciprocal link (canonical)
 | "gapow" | power link
Inv. Gaussian | "igcl" | squared reciprocal link (canonical)
 | "igpow" | power link
Neg. Binomial | "nbcl" | canonical link
 | "nbpow" | power link
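To make the link functions named in Table 6.2 concrete, here is a minimal Python sketch (not XploRe code) that maps some of the model codes to the usual textbook form of the link. The dictionary `LINKS` and the helper `link` are hypothetical names introduced for illustration. Sign and scaling conventions inside the quantlib may differ, and the power-link codes and "nbcl" are left out because they depend on extra parameters.

```python
import math
from statistics import NormalDist

# Textbook link functions g(mu), keyed by model codes from Table 6.2.
# Assumption: these are the standard forms; the quantlib may use other
# sign or scaling conventions. Power links and "nbcl" are omitted.
LINKS = {
    "noid":  lambda mu: mu,                             # identity
    "bilo":  lambda mu: math.log(mu / (1.0 - mu)),      # logit
    "bipro": lambda mu: NormalDist().inv_cdf(mu),       # probit
    "bicll": lambda mu: math.log(-math.log(1.0 - mu)),  # complementary log-log
    "polog": lambda mu: math.log(mu),                   # logarithm
    "gacl":  lambda mu: 1.0 / mu,                       # reciprocal
    "igcl":  lambda mu: 1.0 / mu ** 2,                  # squared reciprocal
}

def link(code, mu):
    """Apply the link function for the given model code to a mean mu."""
    return LINKS[code](mu)
```

For instance, `link("bilo", 0.5)` and `link("bipro", 0.5)` both map a success probability of one half to a linear predictor of zero.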
The quantlet in the gplm quantlib that is mainly responsible for
GPLM estimation is gplmest.
6.3.1 Estimation
- g = gplmest(code, x, t, y, h {, opt})
- estimates a GPLM
The quantlet gplmest provides a convenient way to estimate a GPLM.
The standard call is quite simple; for example,
g=gplmest("bipro",x,t,y,h)
estimates a probit model (binomial with Gaussian cdf link).
gplmest requires the short code of the model (here "bipro"), which is
the same short code used by the glm quantlib.
In addition to the data, a bandwidth parameter h must be given,
either as a vector with one entry per column of t or as a scalar.
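The scalar-or-vector convention for h can be illustrated with a short Python sketch. It assumes that a scalar bandwidth is applied to every column of t, which is consistent with the examples in this section but is not stated explicitly. The function name `expand_bandwidth` is hypothetical and not part of the quantlib.

```python
def expand_bandwidth(h, q):
    """Return a list of q bandwidths from a scalar or a length-q sequence.

    Assumption: a scalar h means "the same bandwidth for each column of t".
    """
    if isinstance(h, (int, float)):
        return [float(h)] * q
    h = [float(v) for v in h]
    if len(h) != q:
        raise ValueError("h must be a scalar or have one entry per column of t")
    return h
```

Under this reading, `expand_bandwidth(0.4, 2)` gives the same bandwidths as passing the vector (0.4, 0.4) directly.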
The result of the estimation is assigned to the variable g,
which is a list containing the following output:
- g.b: the estimated parameter vector
- g.bv: the estimated covariance of g.b
- g.m: the estimated nonparametric function
- g.stat: the statistics (see Section 6.5)
Recalling our credit scoring example from Subsection 6.2.2,
the estimation with a logit link would be done as follows:
t=log(t) ; logs of amount and age
trange=max(t)-min(t)
t=(t-min(t))./trange ; transformation to [0,1]
library("gplm")
h=0.4
g=gplmest("bilo",x,t,y,h)
g.b
Now we can inspect the estimated coefficients in g.b:
Contents of b
[1,] 0.96516
[2,] 0.74628
[3,] -0.049835
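The preprocessing of t in the listing above (logarithms, then a column-wise rescaling to [0,1]) can be sketched in plain Python as follows. This is an illustration only: the function name `log_minmax` is hypothetical, the numbers are made up and are not the credit data, and a constant column would cause a division by zero.

```python
import math

def log_minmax(columns):
    """Take logs of each column, then rescale each column to [0, 1]."""
    scaled = []
    for col in columns:
        logged = [math.log(v) for v in col]
        lo, hi = min(logged), max(logged)
        # (value - column minimum) / column range, as in the XploRe lines
        scaled.append([(v - lo) / (hi - lo) for v in logged])
    return scaled

amount = [1000.0, 2500.0, 8000.0]   # made-up values for illustration
age = [21.0, 35.0, 60.0]            # made-up values for illustration
t = log_minmax([amount, age])
```

After this step the smallest value in each column is 0 and the largest is 1, so a single bandwidth such as h=0.4 has a comparable meaning in both dimensions.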
A graphical output can be created by calling
gplmout("bilo",x,t,y,h,g.b,g.bv,g.m,g.stat)
for the current example (cf. Figure 6.1).
For more features of gplmout see Subsections 6.4.7 and 6.5.2.
Figure 6.1: GPLM output display.
Optional parameters must be passed to gplmest as a list.
Section 6.4, which deals with the quantlet gplmopt, describes
the available options in detail. For example, set:
opt=gplmopt("meth",1,"shf",1)
opt=gplmopt("xvars",xvars,opt)
opt=gplmopt("tg",grid(0|0,0.05|0.05,21|21),opt)
This creates a list opt of optional parameters.
In the first call, opt is created with the first component
meth (estimation method) set to 1 (profile likelihood algorithm)
and the second component shf (show iteration) set to 1 ("true").
In the second call, the variable names for the
linear part of the model are appended to opt.
Finally, a grid component tg
(for the estimation of the nonparametric part)
is defined.
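To see which points the call grid(0|0,0.05|0.05,21|21) is meant to produce, the following Python sketch builds a grid from an origin, a step width and a number of points per dimension. It assumes that the three arguments have exactly this meaning. The function name `make_grid` is hypothetical, and the order in which XploRe stores the rows may differ from the order used here.

```python
from itertools import product

def make_grid(origin, step, n):
    """All points origin[j] + i*step[j], i = 0..n[j]-1, in each dimension j."""
    axes = [
        [o + i * s for i in range(k)]
        for o, s, k in zip(origin, step, n)
    ]
    return [tuple(p) for p in product(*axes)]

# 21 x 21 points with step 0.05, starting at (0, 0)
tg = make_grid([0.0, 0.0], [0.05, 0.05], [21, 21])
```

With 21 points and a step of 0.05 in each dimension, the grid has 441 points and covers the unit square to which t was rescaled.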
We repeat the estimation with these settings:
g=gplmest("bilo",x,t,y,h,opt)
This instruction now uses the profile likelihood algorithm
(in contrast to the default Speckman algorithm used in
example XAGgplm02.xpl), shows the iteration in the output window
and estimates the nonparametric function on the grid tg.
The output g now contains one more element:
- g.mg: the estimated nonparametric function on the grid
Since the nonparametric function is estimated
on two-dimensional data, we can display a surface plot
using the estimated function on the grid:
library("plot")
mg=setmask(sort(tg~g.mg),"surface")
Figure 6.2 shows this surface together
with a scatterplot of amount and age.
The scatterplot shows that the big peak of the estimated function
is caused by only a few observations.
For the complete XploRe code of this example see the file
XAGgplm03.xpl.
Figure 6.2: Scatterplot for amount and age (left).
Estimated nonparametric function (right).
The estimated coefficients are slightly different here, since we used
the profile likelihood instead of the Speckman algorithm in this
case. Figure 6.3 shows the output window
for the second estimation.
6.3.2 Estimation in Expert Mode
- g = gplmcore(code, x, t, y, h, wx, wt, wc, b0, m0, off, ctrl{, upb, tg, m0g})
- estimates a GPLM in expert mode
The gplmcore quantlet is the innermost "kernel" of the GPLM estimation.
It does not accept optional parameters in the usual form
of an option list as described in Section 6.4, and it does not
check for erroneous input. Hence, this routine is intended for use
in expert mode. It speeds up computations and might be useful in simulations,
pilot estimation for other procedures or Monte Carlo methods.
The following lines show how gplmcore could be used in
our running example. Note that all data need to be sorted by the
first column of t.
n=rows(x)
p=cols(x)
q=cols(t)
tmp=sort(t~y~x) ; sort data by first column of t
t=tmp[,(1:q)]
y=tmp[,(q+1)]
x=tmp[,(q+2):cols(tmp)]
shf = 1 ; show iteration (1="true")
miter = 10 ; maximal number of iterations
cnv = 0.0001 ; convergence criterion
fscor = 0 ; Fisher scoring (1="true")
pow = 0 ; power for power link (if useful)
nbk = 1 ; k for neg. binomial (if useful)
meth = 0 ; algorithm ( -1 = backfitting,
; 0 = Speckman
; 1 = profile likelihood )
ctrl=shf|miter|cnv|fscor|pow|nbk|meth
wx = 1 ; prior or frequency weights
wt = 1 ; trimming weights for estimation of b
wc = 1 ; weights for the convergence criterion
off = 0 ; offset
l=glmcore("bilo",x~t~matrix(n),y,wx,off,ctrl[1:6])
b0=l.b[1:p]
m0=l.b[p+q+1]+t*l.b[(p+1):(p+q)]
h=0.4|0.4
g=gplmcore("bilo",x,t,y,h,wx,wt,wc,b0,m0,off,ctrl)
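Two steps of the listing above are easy to get wrong: the joint sorting of all data by the first column of t, and the construction of the starting values b0 and m0 from a parametric fit on x, t and a constant. The Python sketch below mirrors both steps on made-up numbers. The function names `sort_by_first_t` and `starting_values` are hypothetical, and the coefficient order (x first, then t, then the constant) is taken from the column order x~t~matrix(n) in the listing.

```python
def sort_by_first_t(t, y, x):
    """Sort observations jointly by the first column of t (rows are lists)."""
    order = sorted(range(len(t)), key=lambda i: t[i][0])
    return ([t[i] for i in order],
            [y[i] for i in order],
            [x[i] for i in order])

def starting_values(coef, t, p, q):
    """Split a parametric fit with coefficients ordered (x, t, constant)."""
    b0 = coef[:p]            # coefficients of the linear part x
    gamma = coef[p:p + q]    # coefficients of the columns of t
    const = coef[p + q]      # intercept, stored last
    # m0 is the linear function const + t * gamma, evaluated row by row
    m0 = [const + sum(ti * g for ti, g in zip(row, gamma)) for row in t]
    return b0, m0

# Made-up data: two observations, p = 1 column in x, q = 2 columns in t
t = [[0.5, 0.1], [0.2, 0.9]]
y = [1, 0]
x = [[1.0], [2.0]]
ts, ys, xs = sort_by_first_t(t, y, x)
b0, m0 = starting_values([2.0, 1.0, -1.0, 0.5], ts, 1, 2)
```

Sorting t, y and x with one shared index keeps each observation's rows together, which is what sorting the concatenated matrix t~y~x achieves in the XploRe code.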
Optionally, gplmcore can estimate the nonparametric function
on a grid if tg and m0g are given. In addition,
gplmcore can also be used to compute the biased
parametric estimate, which is needed for the specification
test in Subsection 6.5.3. In this case the optional
parameter upb should be set to 0 (default is 1).