gplmest provides a number of statistical characteristics of the estimated model in the output component stat. The quantlet gplmstat can be used to create the above-mentioned statistics by hand.
Suppose we have input x, y and have estimated the vector of coefficients b (with covariance bv) and the nonparametric curve m for model "nopow". Then the list of statistics can be obtained by
stat=gplmstat("nopow",x,y,b,bv,m,df)
Of course, a list of options opt can be added at the end. If options from opt have been used for the estimation, they should be included here as well.
The output stat is itself a list; each statistic is stored as a separate component. Occasionally, one or the other statistic may not be available when it is not applicable. This can always be checked by listing the components of stat:
names(stat)
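For readers who want to reproduce such summary statistics outside XploRe, the following Python sketch computes a few typical goodness-of-fit components (log-likelihood, deviance, Pearson statistic, AIC, residual degrees of freedom) for a binary-response model. The function and all names are illustrative only, not part of any XploRe interface.

```python
import numpy as np

def logit_stats(y, mu, n_params):
    """Typical goodness-of-fit statistics for a binary-response model.

    y        : 0/1 responses
    mu       : fitted probabilities
    n_params : number of estimated parameters
    """
    eps = 1e-10
    mu = np.clip(mu, eps, 1 - eps)
    loglik = np.sum(y * np.log(mu) + (1 - y) * np.log(1 - mu))
    deviance = -2 * loglik          # saturated log-likelihood is 0 for 0/1 data
    pearson = np.sum((y - mu) ** 2 / (mu * (1 - mu)))
    aic = -2 * loglik + 2 * n_params
    df = len(y) - n_params          # residual degrees of freedom
    return {"loglik": loglik, "deviance": deviance,
            "pearson": pearson, "aic": aic, "df": df}

stats = logit_stats(np.array([1, 0, 1, 1]), np.array([0.8, 0.3, 0.6, 0.9]), 2)
```

Analogously to names(stat), `sorted(stats)` lists which components are available in the result.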
An output display containing statistical characteristics and a plot of the fitted link function can be obtained by gplmout.
Recall our example from Section 6.3:
opt=gplmopt("meth",1,"shf",1)
opt=gplmopt("xvars",xvars,opt)
opt=gplmopt("tg",grid(0|0,0.05|0.05,21|21),opt)
g=gplmest("bilo",x,t,y,h,opt)
The optional component xvars will be used in the output display:
gplmout("bilo",x,t,y,h,g.b,g.bv,g.m,g.stat,opt)
The optional parameters that can be used to modify the result from gplmout can be found in Subsection 6.4.7.
To assess the estimated model it may be useful to check the significance of single parameter values or of linear combinations of parameters. To compare two nested models, a sort of likelihood ratio (LR) test can be performed using the test statistic

$LR = 2\,\{\ell(\widehat{\mu}_{1}) - \ell(\widehat{\mu}_{0})\}$,

the difference of the log-likelihoods of the larger and the smaller model. See the GLM tutorial for more information on the LR test.
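As a quick illustration of the LR principle (in Python rather than XploRe, and with made-up log-likelihood values), the statistic is simply twice the log-likelihood difference. In the special case where the models differ by two parameters, the chi-square survival function has the closed form exp(-LR/2):

```python
import math

def lr_test_df2(loglik_small, loglik_large):
    """Likelihood ratio test for nested models differing by TWO parameters.

    Under H0, LR = 2*(l_large - l_small) is asymptotically chi-square
    with 2 degrees of freedom, whose survival function is exp(-x/2).
    """
    lr = 2.0 * (loglik_large - loglik_small)
    pvalue = math.exp(-lr / 2.0)   # valid for df = 2 only
    return lr, pvalue

# hypothetical fitted log-likelihoods of the smaller and larger model
lr, p = lr_test_df2(-1220.5, -1214.2)
```

For other degrees of freedom, a general chi-square survival function (e.g. from a statistics library) replaces the closed form.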
A modified likelihood ratio test for testing $H_0$: $m(t)=\gamma_0+t^{\top}\gamma$ (GLM) against $H_1$: $m$ nonparametric (GPLM) was introduced by Härdle, Mammen, and Müller (1998). They propose to use a ``biased'' parametric estimate $\widetilde{m}_{\widetilde{\gamma}}$, the smoothed version of the parametric fit, instead of the parametric fit $t^{\top}\widetilde{\gamma}$ itself, and the test statistic

$R = 2\,\bigl\{\ell(\widehat{\beta},\widehat{m}) - \ell(\widetilde{\beta},\widetilde{m}_{\widetilde{\gamma}})\bigr\}$   (6.7)

which is asymptotically equivalent to a weighted quadratic form in the differences between the two curve estimates,

$R \approx \sum_{i=1}^{n} w_i\,\bigl\{\widehat{m}(t_i)-\widetilde{m}_{\widetilde{\gamma}}(t_i)\bigr\}^2$   (6.8)

with appropriate weights $w_i$.
Let us continue with our credit scoring example and test whether the correct specification of the model is the GLM $E(y\,|\,x,t)=G\{x^{\top}\beta+t^{\top}\gamma\}$ or the GPLM $E(y\,|\,x,t)=G\{x^{\top}\beta+m(t)\}$.
The following code first computes the GLM and then applies the quantlet gplmbootstraptest to estimate the GPLM and perform the bootstrap test.
library("glm")                   ; GLM estimation
n=rows(x)
opt=glmopt("xvars",xvars|tvars|"constant")
l=glmest("bilo",x~t~matrix(n),y,opt)
glmout("bilo",x~t~matrix(n),y,l.b,l.bv,l.stat,opt)
library("gplm")                  ; GPLM estimation and test
h=0.4
nboot=10
randomize(742742)
opt=gplmopt("meth",1,"shf",1,"xvars",xvars)
opt=gplmopt("wr",prod((abs(t-0.5) < 0.40*trange),2),opt)
g=gplmbootstraptest("bilo",x,t,y,h,nboot,opt)
gplmout("bilo",x,t,y,h,g.b,g.bv,g.m,g.stat,opt)
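The bootstrap logic behind such a test can be sketched generically: recompute the test statistic on samples generated under the null hypothesis and report the share of bootstrap statistics at least as large as the observed one. The following Python sketch uses a toy statistic; all names are hypothetical and unrelated to the XploRe implementation.

```python
import numpy as np

def bootstrap_pvalue(statistic, simulate_h0, nboot, rng):
    """Generic bootstrap test.

    statistic   : value of the test statistic on the original data
    simulate_h0 : function(rng) -> test statistic for one H0 resample
    nboot       : number of bootstrap replications
    """
    boot = np.array([simulate_h0(rng) for _ in range(nboot)])
    boot = boot[np.isfinite(boot)]          # drop failed replications
    # bootstrap significance level: share of resampled statistics
    # at least as large as the observed one
    return float(np.mean(boot >= statistic))

rng = np.random.default_rng(742742)
# toy null distribution: 2*Exponential(1) equals chi-square with 2 df;
# observed statistic 12.6 is far in the tail
alpha = bootstrap_pvalue(12.6, lambda r: 2.0 * r.exponential(), 250, rng)
```

The returned alpha plays the role of the component alpha in the result of gplmbootstraptest.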
The obtained significance levels for the test (computed for all three variants of the test statistic) can be found in the component alpha of the result g. Note that the two approximations (the second one in particular) may give bad results when the sample size n is small.
If we run the test with random seed 742742 and nboot=250, we get:

Contents of alpha
[1,]   0.035857
[2,]   0.035857
[3,]   0.043825

The hypothesis of a GLM can hence be rejected (at the 5% level for all three test statistics).
It is also possible to test more complicated GLMs against the
GPLM. For example, the nonlinear influence of amount and age
could be caused by an interaction of these two variables.
Consider now the GLM hypothesis $E(y\,|\,x,t)=G\{x^{\top}\beta+\gamma_{1}t_{1}+\gamma_{2}t_{2}+\gamma_{12}\,t_{1}t_{2}+\gamma_{0}\}$, which adds the interaction $t_{1}t_{2}$ to the linear index.
The code for this test needs to define an optional design
matrix tdesign which is used instead of the default
t~matrix(n)
in the previous test. The essential changes
are as follows:
tdesign=t~prod(t,2)~matrix(n)
opt=gplmopt("tdesign",tdesign,opt)
g=gplmbootstraptest("bilo",x,t,y,h,nboot,opt)
Contents of alpha
[1,]   0.052
[2,]   0.056
[3,]   0.064

The hypothesis that the correct specification is a GLM with interaction term can hence be rejected as well (now at the 10% level for all three test statistics).
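The tdesign construction has a direct numpy analogue: in XploRe, ~ concatenates columns, prod(t,2) forms the row-wise product, and matrix(n) gives a column of ones. The variable names and values below are illustrative only.

```python
import numpy as np

# t holds the two continuous variables (n x 2), e.g. amount and age
t = np.array([[0.2, 0.5],
              [0.4, 0.1],
              [0.9, 0.7]])
n = t.shape[0]

# analogue of tdesign = t ~ prod(t,2) ~ matrix(n):
# both variables, their row-wise product, and a constant column
tdesign = np.column_stack([t, t.prod(axis=1), np.ones(n)])
```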
Note that gplmbootstraptest also prints a warning if missing values occurred in the bootstrap procedure. In our last example we have:
[1,] ======================================================
[2,] WARNING!
[3,] ======================================================
[4,] Missing values in bootstrap encountered!
[5,] The actually used bootstrap sample sizes are:
[6,] nboot[1] = 249  ( 99.60%)
[7,] nboot[2] = 249  ( 99.60%)
[8,] nboot[3] = 249  ( 99.60%)
[9,] ======================================================

Missing values are mostly due to numerical errors when the sample size is small or the dataset contains outliers.
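The reduced bootstrap sample sizes reported in such a warning correspond to simply dropping the failed replications before computing the significance level. A Python sketch of this accounting (names hypothetical, unrelated to the XploRe code):

```python
import numpy as np

def alpha_with_missings(observed, boot):
    """Bootstrap significance level when some replications failed (NaN).

    Failed replications are dropped, and the actually used
    bootstrap sample size is returned alongside alpha.
    """
    boot = np.asarray(boot, dtype=float)
    valid = boot[~np.isnan(boot)]
    alpha = float(np.mean(valid >= observed))
    return alpha, len(valid)

boot = [5.1, np.nan, 14.0, 2.2, 7.7]   # one of five replications failed
alpha, used = alpha_with_missings(6.0, boot)
# used / len(boot) gives the percentage of usable replications (here 80%)
```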