
Library: gplm
See also: gplmest gplmcore gplmopt gplminit gplmstat glmest

Quantlet: gplmbootstraptest
Description: Bootstrap test comparing a GLM against a GPLM. The hypothesis E[y|x,t] = G(x*b + t*g + c) is tested against the alternative E[y|x,t] = G(x*b + m(t)). This routine offers a convenient interface for GPLM estimation and testing. The data are prepared (including sorting) before estimation.
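The two mean functions being compared can be written out for the logit case ("bilo"). The following is a minimal Python sketch, not XploRe code and not part of the gplm library; the function names mean_glm and mean_gplm are hypothetical, t is taken as a scalar (q = 1), and the values b = (1, 2) and m(t) = cos(pi*t) are borrowed from the example further down this page.

```python
import math

def logistic(u):
    """Inverse logit link G(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def mean_glm(x, t, b, g, c):
    """Hypothesis: E[y|x,t] = G(x*b + t*g + c), linear in t."""
    index = sum(xj * bj for xj, bj in zip(x, b)) + t * g + c
    return logistic(index)

def mean_gplm(x, t, b, m):
    """Alternative: E[y|x,t] = G(x*b + m(t)), m an arbitrary function of t."""
    index = sum(xj * bj for xj, bj in zip(x, b)) + m(t)
    return logistic(index)

# Illustrative values only (hypothetical observation x, t).
b = [1.0, 2.0]
x = [0.5, -0.25]          # x*b = 0.5 - 0.5 = 0
t = 0.0

def m_true(s):
    return math.cos(math.pi * s)

p_null = mean_glm(x, t, b, g=0.0, c=0.0)   # index 0
p_alt = mean_gplm(x, t, b, m_true)         # index cos(0) = 1
```

The test asks whether the data are compatible with the first form, i.e. whether m(t) can be replaced by t*g + c.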

Reference(s):

Link:
Usage: myfit = gplmbootstraptest(code,x,t,y,h,nboot{,opt})
Input:
code text string, the short code for the model (e.g. "bilo" for logit or "noid" for ordinary PLM).
x n x p matrix, the discrete predictor variables.
t n x q matrix, the continuous predictor variables.
y n x 1 vector, the response variable.
h q x 1 vector, the bandwidth vector.
nboot integer, number of bootstrap replications. If nboot<=0, the test is performed using the asymptotic normal distribution of the test statistics.
opt optional, a list with optional input. "gplmopt" can be used to set up this parameter. The order of the list elements is not important.
opt.b0 p x 1 vector, the initial coefficients. If not given, all coefficients are initially set to 0.
opt.m0 n x 1 vector, the initial values for the nonparametric part. If not given, a default is used.
opt.tg ng x 1 vector, a grid for the continuous part. If tg is given, the nonparametric function is also computed on this grid.
opt.tdesign n x r matrix, design for the parametric fit for m(t). This allows testing e.g. quadratic or cubic functions against m(t). If not given, a linear function (incl. constant) is tested by using the design matrix(n)~t.
opt.weights string, type of observation weights. Can be "frequency" for replication counts, or "prior" (default) for prior weights in weighted regression.
opt.wx scalar or n x 1 vector, frequency or prior weights. If not given, set to 1.
opt.wt n x 1 vector, weights for t (trimming factors). If not given, set to 1.
opt.wc n x 1 vector, weights for convergence criterion, w.r.t. m(t) only. If not given, opt.wt is used.
opt.wr n x 1 vector, weights for test statistics. If not given, set to 1.
opt.off scalar or n x 1 vector, offset. Can be used for constrained estimation. If not given, set to 0.
opt.meth integer, estimation method: -1 for backfitting, 1 for profile likelihood, 0 for simple profile likelihood. The default is 0.
opt.fscor integer, if given and =1, Fisher scoring is performed (instead of the default Newton-Raphson procedure). This parameter is ignored for canonical links.
opt.shf integer, if given and =1, output is produced that shows the progress of the iteration.
opt.nosort integer, if given and =1, the continuous variables t and the grid tg are assumed to be sorted by the 1st column. Sorting is required by the algorithm, hence you should switch it off only when the data are already sorted.
opt.miter integer, maximal number of iterations. The default is 10.
opt.cnv scalar, convergence criterion. The default is 0.0001.
opt.pow scalar, power for power link. If not given, set to 0.
opt.nbk scalar, extra parameter k for negative binomial distribution. If not given, set to 1 (geometric distribution).
Output:
myfit.b k x 1 vector, estimated coefficients
myfit.bv k x k matrix, estimated covariance matrix for b
myfit.m n x 1 vector, estimated nonparametric part
myfit.mg ng x 1 vector, estimated nonparametric part on grid
myfit.rr 3 x 1 vector, 3 test statistics according to Haerdle/Mammen/Mueller
myfit.alpha 3 x 1 vector, significance level for rejection of the parametric hypothesis (for each of the three test statistics).
myfit.stat list with the following statistics:
myfit.stat.deviance deviance,
myfit.stat.pearson generalized Pearson's chi^2,
myfit.stat.loglik log-likelihood,
myfit.stat.r2 pseudo R^2,
myfit.stat.it 2 x 1 vector, number of iterations needed in semiparametric and biased parametric fit
myfit.stat.ret scalar, return code: 0 o.k., 1 maximal number of iterations reached in estimation (if applicable), -1 missing values encountered in estimation, -2 missing values encountered in test statistics, -3 missing values encountered in bootstrap.
myfit.stat.rrboot nboot x 3 matrix, values of the bootstrap test statistics (if applicable).
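
The relation between the observed statistics (myfit.rr), the bootstrap statistics (myfit.stat.rrboot) and the significance levels (myfit.alpha) can be illustrated with the usual bootstrap rule: the level is the share of bootstrap values at least as large as the observed one. This is a Python sketch under that assumption, not XploRe code; whether gplmbootstraptest uses exactly this rule (one-sided, ">=") is not stated on this page, and the function name bootstrap_levels and all numbers are hypothetical.

```python
def bootstrap_levels(rr, rrboot):
    """For each statistic j, return the share of bootstrap rows with rrboot[i][j] >= rr[j].

    rr     -- list of observed test statistics (length 3 on this page)
    rrboot -- list of nboot rows, one row of statistics per bootstrap replication
    """
    nboot = len(rrboot)
    if nboot == 0:
        raise ValueError("need at least one bootstrap replication")
    return [
        sum(1 for row in rrboot if row[j] >= rr[j]) / nboot
        for j in range(len(rr))
    ]

# Made-up numbers, for illustration only.
rr = [2.0, 1.5, 0.3]
rrboot = [
    [1.0, 1.6, 0.10],
    [2.5, 1.4, 0.20],
    [0.5, 2.0, 0.40],
    [3.0, 1.7, 0.25],
]
levels = bootstrap_levels(rr, rrboot)  # one level per test statistic
```

A small level means that few bootstrap replications generated under the parametric hypothesis produced a statistic as large as the observed one, which speaks against the parametric model.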

Note:

Example:
library("gplm")
;=============================
;  simulate data
;=============================
n=100
b=1|2
p=rows(b)
x=2.*uniform(n,p)-1
t=sort(2.*uniform(n)-1,1)
m=cos(pi.*t)
y=( 1./(1+exp(-x*b-m)).>uniform(n) )
;=============================
;  parametric(logit) fit
;=============================
pf=glmest("bilo",x~t~matrix(n),y)
b0  =pf.b[1:p]
gamma0=pf.b[p+1:rows(pf.b)]
m0  =(t~matrix(n))*gamma0
;=============================
;  semiparametric fit & test
;=============================
h=0.6
opt=list(b0,m0)
sf=gplmbootstraptest("bilo",x,t,y,h,10,opt)
b~b0~sf.b
sf.alpha
;==========================
;  plot
;==========================
library("plot")
true=setmask(sort(t~m),"line","thin")
linm=setmask(sort(t~m0),"line","red")
estm=setmask(sort(t~sf.m),"line","blue")
plot(true,linm,estm)

Result:
A generalized partially linear logit fit for E[y|x,t] is
computed and tested against the parametric logit.
sf.b contains the coefficients for the linear
part. sf.m contains the estimated nonparametric part
evaluated at observations t. The example gives the true
b together with the logit estimate b0 and the GPLM
estimate sf.b. The estimated function sf.m is also
displayed together with the true function and the linear fit.
sf.rr contains the 3 test statistics proposed in
Haerdle/Mammen/Mueller (1996), and sf.alpha the
corresponding significance levels.



Author: M. Mueller, 20010228
(C) MD*TECH Method and Data Technologies, 05.02.2006