7.3 Noninteractive Quantlets for Estimation
- m = intest(t, y, h, g, loc{, opt})
  estimates an additive model (AM)
- {m, b, const} = intestpl(x, t, y, h, g, loc{, opt})
  estimates an additive partially linear model (APLM)
- {m, b, const} = backfit(t, y, h, loc, kern{, opt})
  estimates an additive and additive partially linear model
- m = gintest(code, t, y, h, g, loc{, opt})
  estimates a generalized additive model (GAM)
- {m, b, bv, const} = gintestpl(code, x, t, y, h, g{, opt})
  estimates a generalized additive partially linear model (GAPLM)
- m = intest2d(t, y, h, g, loc{, opt})
  estimates a bivariate marginal influence
- {fh, c} = interact(t, y, h, g, loc, incl{, tg})
  estimates an additive model with interaction terms
- m = fastint(t, y, h1, h2, loc{, tg})
  estimates an additive model using marginal integration
The list above shows all noninteractive estimation quantlets. Their use is described in
the following subsections.
7.3.1 Estimating an Additive Model
- m = intest(t, y, h, g, loc{, opt})
  estimates an additive model (AM)
library("gam")
randomize(1234)
t = uniform(50,2)*2-1
g1 = 2*t[,1]
g2 = t[,2]^2
g2 = g2 - mean(g2)
y = g1 + g2 + normal(50,1) * sqrt(0.25)
h = #(1.2, 1.0)
g = #(1.4, 1.2)
loc = 1
gest = intest(t,y,h,g,loc)
gest
bild = createdisplay(1,2)
dat11 = t[,1]~g1
dat12 = t[,1]~gest[,1]
dat21 = t[,2]~g2
dat22 = t[,2]~gest[,2]
setmaskp(dat12,4,4,8)
setmaskp(dat22,4,4,8)
show(bild,1,1,dat11,dat12)
show(bild,1,2,dat21,dat22)
The quantlet intest estimates the univariate additive functions and their
derivatives in a separable additive model using Nadaraya-Watson, local linear
or local quadratic estimation.
Input parameters:
- h: bandwidth vector for the directions of interest (see remarks). It can be
  a vector, or a scalar to use the same bandwidth in all directions.
- g: bandwidth vector for the directions not of interest
- loc: scalar specifying the estimation procedure:
  - 0: Nadaraya-Watson (local constant)
  - 1: local linear
  - 2: local quadratic
Optional parameters (see Section 7.5):
- opt.tg: matrix, a grid for the continuous part. If tg is given, the
  nonparametric function is computed on this grid.
- opt.shf: scalar (show-how-far). If it exists and is equal to one, progress
  output is produced during the iteration (additive function / point of
  estimation / number of the iteration).
Output value:
- m: matrix containing the marginal integration estimates in the first p'
  columns, followed by the first and second derivatives if local linear or
  local quadratic estimation is used
Remarks: The grid may have fewer dimensions (p' < p) than the explanatory
data. The estimation is then run on the first p' directions of interest.
Consequently, it is possible to specify the bandwidth vector h
for the directions of interest only.
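To illustrate the roles of h and g, the following is a minimal Python sketch of a marginal integration estimator with a Nadaraya-Watson pre-smoother. It is not the code of intest: the function names, the quartic product kernel and the centring over the grid are assumptions made for this illustration only. The bandwidth h is applied to the direction of interest, g to all other directions.

```python
import math

def quartic(u):
    """Quartic kernel, supported on [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def nw_estimate(point, t, y, bw):
    """Multivariate Nadaraya-Watson estimate at `point` (product kernel)."""
    num = den = 0.0
    for ti, yi in zip(t, y):
        w = 1.0
        for x0, x, b in zip(point, ti, bw):
            w *= quartic((x - x0) / b)
        num += w * yi
        den += w
    return num / den if den > 0.0 else float("nan")

def marginal_integration(t, y, alpha, grid, h, g):
    """Estimate additive component `alpha` (0-based) on `grid`.

    For each grid value the pre-smoother is averaged over the observed
    values of the remaining directions; the result is centred over the
    grid (an assumption of this sketch).
    """
    p = len(t[0])
    # h for the direction of interest, g for the directions not of interest
    bw = [h if j == alpha else g for j in range(p)]
    est = []
    for x0 in grid:
        vals = []
        for ti in t:
            point = list(ti)
            point[alpha] = x0
            v = nw_estimate(point, t, y, bw)
            if not math.isnan(v):
                vals.append(v)
        est.append(sum(vals) / len(vals))
    centre = sum(est) / len(est)
    return [e - centre for e in est]
```

On a regular, noise-free design with y = 2*t1 + t2^2, calling marginal_integration(t, y, 0, grid, 0.5, 0.5) at interior grid points recovers the linear first component up to rounding.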
7.3.2 Estimating an Additive Partially Linear Model
- {m, b, const} = intestpl(x, t, y, h, g, loc{, opt})
  estimates an additive partially linear model (APLM)
library("gam")
randomize(1345)
loc= 2
x = matrix(50,2)
t = uniform(50,2)*2-1
xh = uniform(50,2)
x[,1]= 3*(xh[,1]>=0.8)+2*((0.8>xh[,1])&&(xh[,1]>=0.3))+(0.3>xh[,1])
x[,2]= (xh[,2]>(1/3))
g1 = 2*t[,1]
g2 = (2*t[,2])^2
g2 = g2 -mean(g2)
m = g1 + g2 + x*(0.2|-1.0)
y = m + normal(50,1)*0.25
h = #(1.4, 1.4)
g = #(1.4, 1.4)
{m,b,const} = intestpl(x,t,y,h,g,loc)
b
const
bild =createdisplay(1,2)
dat11= t[,1]~g1
dat12= t[,1]~m[,1]
setmaskp(dat12,4,4,8)
show(bild,1,1,dat11,dat12)
dat21= t[,2]~g2
dat22= t[,2]~m[,2]
setmaskp(dat22,4,4,8)
show(bild,1,2,dat21,dat22)
The quantlet intestpl estimates the univariate additive functions and their
derivatives in an additive partially linear model (APLM) using
local linear or local quadratic estimation.
Input parameters:
- h: vector or scalar, the bandwidth for the directions of interest
- g: vector or scalar, the bandwidth for the directions not of interest
- loc: scalar indicating the estimation procedure:
  - 1: local linear
  - 2: local quadratic
Optional parameters (see Section 7.5):
- opt.tg: matrix, a grid for the continuous part. If tg is given, the
  nonparametric function is computed on this grid.
- opt.shf: scalar, shf=1 shows how the process is progressing
  (default: shf=0)
Output values:
- m: matrix, the marginal integration estimates in the first p columns,
  followed by the first and second derivatives if local linear or local
  quadratic estimation is used
- b: vector, the coefficients of the linear part
- const: scalar, the constant in the additive partially linear model
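The reason a local linear fit also delivers a first derivative can be seen from a one-dimensional sketch: the local intercept estimates the function and the local slope estimates its derivative. The Python code below is an illustration under assumptions (a quartic kernel, a single covariate, hypothetical function names); it is not the implementation used by intestpl.

```python
def local_linear(x0, x, y, h):
    """Local linear fit at x0: returns (function estimate, slope estimate).

    Solves the weighted least squares problem
        min_{a,b} sum_i w_i * (y_i - a - b*(x_i - x0))**2
    in closed form, with quartic kernel weights w_i.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(x, y):
        u = (xi - x0) / h
        if abs(u) > 1.0:
            continue  # outside the kernel support
        w = 15.0 / 16.0 * (1.0 - u * u) ** 2
        d = xi - x0
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    a = (s2 * t0 - s1 * t1) / det  # estimate of the function at x0
    b = (s0 * t1 - s1 * t0) / det  # estimate of the first derivative at x0
    return a, b
```

For exactly linear data the fit reproduces both the value and the slope, which is a convenient sanity check; a local quadratic fit extends the same idea by one more term and so yields a second derivative as well.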
7.3.3 Estimating Additive and Additive Partially Linear Model
- {m, b, const} = backfit(t, y, h, loc, kern{, opt})
  estimates an additive and additive partially linear model
library("gam")
randomize(1)
n = 100
t = normal(n,2) ; explanatory variable
x = normal(n,2) ; the linear part
f1 = - sin(2*t[,1]) ; estimated functions
f2 = t[,2]^2
eps = normal(n,1) * sqrt(0.75)
y = x[,1] - x[,2]/4 + f1 + f2 +eps ; response variable
h = 0.5
opt = gamopt("x",x,"shf",1) ; the linear part is used
; and the iterations will be shown
{m,b,const} = backfit(t,y,h,0,"qua",opt)
;
b ; coefficients for the linear part
; ([1, -1/4] were used)
const ; estimation of the constant
;
pic = createdisplay(1,2) ; preparing the graphical output
d1 = t[,1]~m[,1]
d2 = t[,2]~m[,2]
setmaskp(d1,4,4,4)
setmaskp(d2,4,4,4)
m1 = mean(f1)
m2 = mean(f2)
yy = y - x*b - const
x1 = t[,1]~(yy - m[,2])
x2 = t[,2]~(yy - m[,1])
setmaskp(x1,1,11,4)
setmaskp(x2,1,11,4)
setmaskl(d1,(sort(d1~(1:rows(d1)))[,3])',4,1,1)
setmaskl(d2,(sort(d2~(1:rows(d2)))[,3])',4,1,1)
show(pic,1,1,d1,x1,t[,1]~(f1-m1))
show(pic,1,2,d2,x2,t[,2]~(f2-m2))
The quantlet backfit estimates the univariate additive functions
and their derivatives in an additive (AM) or additive partially linear model
(APLM) using the backfitting algorithm.
It accepts only one-dimensional response variables y.
Input parameters:
- h: vector or scalar, the bandwidth
- loc: scalar indicating the estimation procedure:
  - 0: Nadaraya-Watson (local constant)
  - 1: local linear
  - 2: local quadratic
- kern: string indicating the kernel function:
  - "qua": quartic kernel
  - "epa": Epanechnikov kernel
  - "gau": Gaussian kernel
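For reference, the three kernel names correspond to the standard textbook kernels sketched below in Python. The mapping from the strings "qua", "epa" and "gau" to these formulas follows the usual definitions; whether XploRe applies any additional scaling internally is not stated here, so treat the normalisation as an assumption.

```python
import math

def kernel(kern, u):
    """Evaluate a standard kernel function at u, selected by its code."""
    if kern == "qua":   # quartic (biweight) kernel on [-1, 1]
        return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0
    if kern == "epa":   # Epanechnikov kernel on [-1, 1]
        return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0
    if kern == "gau":   # Gaussian kernel (standard normal density)
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    raise ValueError("unknown kernel code: " + kern)

def integrate(kern, lo=-8.0, hi=8.0, steps=16000):
    """Midpoint-rule integral of the kernel, used to check that it is a density."""
    width = (hi - lo) / steps
    return sum(kernel(kern, lo + (i + 0.5) * width) for i in range(steps)) * width
```

Each of the three kernels integrates to one; the quartic and Epanechnikov kernels have compact support, so only observations within one bandwidth of the estimation point receive positive weight.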
Optional parameters:
- opt.x: matrix, the explanatory variables for the linear part
  (at least the discrete variables)
- opt.shf: scalar, shf=1 shows how the iteration is progressing
  (default: shf=0)
- opt.miter: scalar, the maximal number of iterations (default: miter=50)
- opt.cnv: scalar, the convergence criterion (default: cnv= )
The quantlet returns
- m: matrix, the estimates of the additive functions in columns 1 to p,
  the first derivatives in columns p+1 to 2p and the second derivatives
  in columns 2p+1 to 3p
- b: vector, the coefficients of the linear part
- const: scalar, the estimate of the constant in the model
The example for this quantlet (XAGgam04.xpl) produces a graphical output
showing the original data (crosses), the exact values of the estimated
functions (circles) and their estimates (small triangles connected by
lines).
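The idea of the backfitting algorithm and the roles of miter and cnv can be sketched as follows in Python. This is a simplified illustration, not the backfit quantlet: it uses a Nadaraya-Watson smoother with a quartic kernel, ignores the linear part opt.x, and the stopping rule and the default cnv=1e-6 are assumptions of the sketch.

```python
def smooth(x, r, h):
    """Nadaraya-Watson smooth of r on x, evaluated at the data points."""
    out = []
    for x0 in x:
        num = den = 0.0
        for xi, ri in zip(x, r):
            u = (xi - x0) / h
            if abs(u) <= 1.0:
                w = (1.0 - u * u) ** 2  # quartic kernel up to a constant
                num += w * ri
                den += w
        out.append(num / den)  # den > 0: each point weights itself
    return out

def backfit_sketch(t, y, h, miter=50, cnv=1e-6):
    """Backfitting for y = const + sum_j m_j(t_j) + error.

    Returns (m, const, it): m[j][i] is component j at observation i,
    const is the mean of y, it is the number of iterations used.
    """
    n, p = len(y), len(t[0])
    const = sum(y) / n
    m = [[0.0] * n for _ in range(p)]
    for it in range(1, miter + 1):
        change = 0.0
        for j in range(p):
            # partial residuals: remove the constant and all other components
            partial = [y[i] - const - sum(m[k][i] for k in range(p) if k != j)
                       for i in range(n)]
            new = smooth([row[j] for row in t], partial, h)
            mean_new = sum(new) / n
            new = [v - mean_new for v in new]  # centre each component
            change += sum((a - b) ** 2 for a, b in zip(new, m[j]))
            m[j] = new
        if change < cnv:  # assumed stopping rule: components stopped moving
            break
    return m, const, it
```

Each pass smooths the partial residuals of one component against its own covariate while holding the others fixed; the loop stops when the components no longer change or after miter passes.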
7.3.4 Estimating a Generalized Additive Model
- m = gintest(code, t, y, h, g, loc{, opt})
  estimates a generalized additive model (GAM)
library("gam")
randomize(1235)
n = 100
p = 2
t = uniform(n,p)*2-1
g1 = 2*t[,1]
g2 = t[,2]^2
g2 = g2 - mean(g2)
m = g1 + g2
y = cdfn(m) .> uniform(n) ; probit model
h = #(1.7, 1.5)
g = #(1.7, 1.5)
tg = grid(-0.8,0.1,19)
opt = gamopt("tg",tg~tg,"shf",1)
loc = 1
code = "bipro"
m = gintest(code,t,y,h,g,loc,opt)
d1 = tg[,1]~m[,1]
d2 = tg[,1]~m[,2]            ; tg has one column; the same grid is used for both directions
setmaskp(d1,4,4,8)
setmaskp(d2,4,4,8)
bild = createdisplay(1,2)
show(bild,1,1,d1,t[,1]~g1)
show(bild,1,2,d2,t[,2]~g2)
The quantlet gintest estimates the univariate additive functions and their
derivatives in a generalized additive model (GAM) using
Nadaraya-Watson, local linear or local quadratic estimation.
Input parameters:
- code: string specifying the distribution of y and the link
  function. Currently implemented codes are:
  - "bilo": binomial with logistic link (logit)
  - "bipro": binomial with normal distribution link (probit)
  - "noid": normal with canonical (identity) link
- h: bandwidth vector for the directions of interest (see remarks). It can be
  a vector, or a scalar to use the same bandwidth in all directions.
- g: bandwidth vector for the directions not of interest
- loc: scalar specifying the estimation procedure:
  - 0: Nadaraya-Watson (local constant)
  - 1: local linear
  - 2: local quadratic
Optional parameters:
- opt.tg: matrix, a grid for the continuous part (see remarks)
- opt.shf: scalar, shf=1 shows how the process is progressing
  (default: shf=0)
The quantlet returns
- m: matrix, the marginal integration estimates in the first p' columns,
  followed by the first and second derivatives if local linear or local
  quadratic estimation is used
Remarks: The grid may have fewer dimensions (p' < p) than the explanatory data.
The estimation is then run on the first p' directions of interest.
Consequently, you need to specify the bandwidth vector h
for the directions of interest only.
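The three codes differ only in how the additive index is mapped to the conditional mean of y. A minimal Python sketch of these inverse link functions is given below; the function name inverse_link is hypothetical and the code is an illustration of the standard definitions, not part of the gam library.

```python
import math

def inverse_link(code, eta):
    """Map the additive index eta to the mean of y for a given model code."""
    if code == "bilo":    # logit link: logistic distribution function
        return 1.0 / (1.0 + math.exp(-eta))
    if code == "bipro":   # probit link: standard normal distribution function
        return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    if code == "noid":    # identity link
        return eta
    raise ValueError("unknown model code: " + code)
```

In the probit example above, y is generated by comparing cdfn(m) with uniform random numbers, so that P(y=1) equals inverse_link("bipro", m) in the notation of this sketch.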
7.3.5 Estimating a Generalized Additive Partially Linear Model
- {m, b, bv, const} = gintestpl(code, x, t, y, h, g{, opt})
  estimates a generalized additive partially linear model (GAPLM)
library("gam")
randomize(1235)
n = 100
p = 2
d = 2
b = 1|2
t = uniform(n,p)*2-1
x = 2.*uniform(n,d)-1
g1 = 2*t[,1]
g2 = t[,2]^2
g2 = g2 - mean(g2)
m = g1 + g2
y = cdfn(m+x*b) .> uniform(n) ; probit model
h = #(1.7, 1.5)
g = #(1.7, 1.5)
tg = grid(-0.8,0.1,18)
opt = gamopt("tg",tg~tg)
opt = gamopt("shf",1,opt)
code = "bipro"
{m,b,bv,c} = gintestpl(code,x,t,y,h,g,opt)
gamout(t,y,m,b,c,gamopt("pl",1,"x",x,"bv",bv,opt))
The quantlet gintestpl estimates the univariate additive functions in a
generalized additive partially linear model (GAPLM) using the Newton-Raphson
or the Fisher scoring algorithm.
Input parameters:
- code: string specifying the distribution of y and the link
  function. It accepts only one-dimensional y.
  Currently implemented codes are:
  - binomial
    - "bilo": binomial with logistic link (logit)
    - "bipro": binomial with normal distribution link (probit)
    - "bicll": binomial with complementary log-log link
  - normal
    - "noid": normal with canonical=identity link
    - "nopow": normal with power (inverse) link
  - gamma
    - "gacl": gamma with canonical=reciprocal (inverse) link
    - "gapow": gamma with power (inverse) link
  - inverse gaussian
    - "igcl": inverse gaussian with canonical=squared reciprocal (inverse) link
    - "igpow": inverse gaussian with power (inverse) link
  - negative binomial
    - "nbcl": negative binomial with canonical (inverse) link
    - "nbpow": negative binomial with power (inverse) link
- h: bandwidth vector for the directions of interest
- g: bandwidth vector for the directions not of interest
Optional parameters:
- opt.tg: matrix to estimate on a grid
- opt.shf: scalar, shf=1 shows how the process is progressing
  (default: shf=0)
- opt.b0: vector of initial coefficients for the linear part
  (default: GLM pre-estimation)
- opt.nosort: nosort=1 indicates that t is already sorted by
  its first column (default: nosort=0). Sorting is required by the algorithm, hence
  you should switch it off only when the data are already sorted.
- opt.miter: maximal number of iterations (default: miter=10)
- opt.cnv: scalar determining the convergence criterion
  (default: cnv=0.0001)
- opt.fscor: fscor=1 switches to the Fisher scoring algorithm
  (default: Newton-Raphson). This parameter is ignored for canonical links.
- opt.wx: scalar or vector of prior weights. For binomial models this is
  usually the binomial index vector (default: 1).
- opt.wt: vector, weights for t (trimming factors)
  (default: all components set to 1)
- opt.wtc: vector of weights applied in the convergence criterion
  (default: wt is used)
- opt.off: scalar or vector, offset, can be used for
  constrained estimation (default: off=0)
- opt.pow: scalar, power for the power link (default: pow=0)
- opt.nbk: scalar, extra parameter k for the negative binomial distribution
  (default: nbk=1, the geometric distribution)
The quantlet returns
- m: matrix, the marginal integration estimates
- b: vector, the coefficients of the linear part
- bv: covariance matrix for the estimated coefficients
- const: constant of the model
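To make the options fscor, miter and cnv more concrete, here is a minimal Python sketch of Fisher scoring for a probit model with a single coefficient and no nonparametric part. It is an illustration only: the function names, the step-size stopping rule and the reduction to one coefficient are assumptions, and the actual iteration inside gintestpl is more involved. For canonical links such as the logit, the observed and the expected information coincide, so Newton-Raphson and Fisher scoring give the same steps; this is why fscor has no effect there.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_fisher_scoring(x, y, b0=0.0, miter=10, cnv=1e-4):
    """Fisher scoring for P(y=1 | x) = Phi(x*b) with a scalar coefficient b.

    Stops when the absolute step is below cnv (assumed criterion) or after
    miter iterations. Returns (b, number of iterations used).
    """
    b = b0
    for it in range(1, miter + 1):
        score = info = 0.0
        for xi, yi in zip(x, y):
            eta = xi * b
            p = norm_cdf(eta)
            d = norm_pdf(eta)
            var = p * (1.0 - p)
            score += xi * (yi - p) * d / var   # derivative of the log-likelihood
            info += xi * xi * d * d / var      # expected (Fisher) information
        step = score / info
        b += step
        if abs(step) < cnv:
            break
    return b, it
```

With x identically one, the coefficient is an intercept and the maximum likelihood estimate satisfies Phi(b) = mean(y), which gives a simple check of the iteration.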
7.3.6 Estimating Bivariate Marginal Influence
- m = intest2d(t, y, h, g, loc{, opt})
  estimates a bivariate marginal influence
library("gam")
randomize(12345)
t = grid(#(-0.9,-0.9),#(0.2,0.2),#(10,10))
n = rows(t)
t = t~(uniform(n)*2-1)
g3 = sin(2*t[,3])
g12 = t[,1].*t[,2]^2
y = g3 + g12 + normal(n)*sqrt(0.5)
h = #(1.0, 1.0)
g = #(1.1, 1.1, 1.2)
loc = 1
gest = intest2d(t,y,h,g,loc)
library("graphic")
pic = createdisplay(1,2)
dat11 = grsurface(t[,1:2]~g12)
dat12 = grsurface(t[,1:2]~gest[,1])
gc = grcube( dat11|dat12 )
show(pic,1,1,dat11,gc.box,gc.x,gc.y,gc.z,gc.c)
show(pic,1,2,dat12,gc.box,gc.x,gc.y,gc.z,gc.c)
setheadline(pic, 1, 1, "Original function")
setheadline(pic, 1, 2, "Estimated function")
The quantlet intest2d estimates the bivariate marginal influence function
of two explanatory variables (in the example above, the first two columns
of t). You can choose the Nadaraya-Watson, the local linear or the
local quadratic kernel smoother. If the local linear smoother is chosen,
the program also returns the first derivative functions for both
directions. If you choose the local quadratic smoother, you get the
mixed derivative function. This quantlet can be used, e.g., to explore
the joint influence of two arbitrary explanatory variables in a
multidimensional regression problem.
Input parameters:
- h: scalar or vector, the bandwidth for the directions of interest
- g: scalar or vector, the bandwidth for the directions not of interest
- loc: scalar specifying the estimation procedure:
  - 0: Nadaraya-Watson (local constant)
  - 1: local linear
  - 2: local quadratic
Optional parameters:
- opt.tg: matrix for estimating on a grid
- opt.shf: scalar, shf=1 shows how the process is progressing
  (default: shf=0)
The quantlet returns
- m: matrix, the bivariate marginal integration estimate in the first
  column, the derivatives in the following columns
The example for this quantlet (XAGgam07.xpl) produces a display with the
original function on the left side and its estimate on the right side.
7.3.7 Estimating an Additive Model with Interaction Terms
- {fh, c} = interact(t, y, h, g, loc, incl{, tg})
  estimates an additive model with interaction terms
library("gam")
randomize(12345)
t = grid(#(-0.9,-0.9),#(0.2,0.2),#(10,10))
n = rows(t)
t = t~(uniform(n)*2-1)
g1 = 2*t[,1]
g2 = t[,2]^2 - mean(t[,2]^2)
g3 = sin(3*t[,3])
g12 = t[,1].*t[,2]
y = g1+g2+g3+g12+normal(n)*sqrt(0.5)
h = #(0.9, 0.9, 0.9)
g = #(1.0, 1.0, 1.0)
incl = 1~2
f = interact(t,y,h,g,1,incl)
library("graphic")
pic = createdisplay(2,2)
dat11 = sort(t[,2]~g2)
datf1 = sort(t[,2]~f.fh[,2])
dat12 = sort(t[,3]~g3)
datf2 = sort(t[,3]~f.fh[,3])
setmaskp(dat11,1,3,8)
setmaskp(dat12,1,3,8)
setmaskp(datf1,4,3,8)
setmaskp(datf2,4,3,8)
setmaskl(datf1,(1:rows(datf1))',4,1,1)
setmaskl(datf2,(1:rows(datf2))',4,1,1)
show(pic,1,1,dat11,datf1)
show(pic,1,2,dat12,datf2)
dat21 = grsurface(t[,1:2]~g12)
dat22 = grsurface(t[,1:2]~f.fh[,4])
gc = grcube( dat21|dat22 )
show(pic,2,1,dat21,gc.box,gc.x,gc.y,gc.z,gc.c)
show(pic,2,2,dat22,gc.box,gc.x,gc.y,gc.z,gc.c)
The quantlet interact estimates the univariate functions, the bivariate
interaction terms requested by the user and the constant of the model,
i.e., all component functions of the additive model with interactions,
see also Subsection 7.1.2. Again the marginal integration estimator is used
and you can choose between the Nadaraya-Watson, the local linear and
the local quadratic smoother.
Input parameters:
- h: scalar or vector, the bandwidth for the directions of interest
- g: scalar or vector, the bandwidth for the directions not of interest
- loc: scalar specifying the estimation procedure:
  - 0: Nadaraya-Watson (local constant)
  - 1: local linear
  - 2: local quadratic
- incl: matrix giving all pairs of indices for which an interaction term
  shall be included
Optional parameters:
- tg: matrix to estimate on a grid (see remarks)
The quantlet returns
- fh: matrix, the marginal integration estimates
  of the univariate functions and the chosen interaction terms
- c: scalar, the constant of the model
The example XAGgam08.xpl displays the second and third additive
components in the upper plots, where the original
functions are black and their estimates blue. The
lower plots show the original interaction on the
left and its estimate on the right.
Remarks: Note that interact accepts only one-dimensional y.
If you choose a grid tg, the interaction functions can
only be estimated up to a constant shift.
7.3.8 Estimating an Additive Model Using Marginal Integration
- m = fastint(t, y, h1, h2, loc{, tg})
  estimates an additive model using marginal integration
library("gam")
randomize(1234)
n = 100
d = 2
; generate a correlated design:
var = 1.0
cov = 0.4 *(matrix(d,d)-unit(d)) + unit(d)*var
{eval, evec} = eigsm(cov)
t = normal(n,d)
t = t*((evec.*sqrt(eval)')*evec')
g1 = 2*t[,1]
g2 = t[,2]^2 -mean(t[,2]^2)
y = g1 + g2 + normal(n,1) * sqrt(0.5)
h1 = 0.5
h2 = 0.7
loc = 0
gest = fastint(t,y,h1,h2,loc)
library("graphic")
pic = createdisplay(1,2)
dat11 = t[,1]~g1
dat12 = t[,1]~gest[,1]
dat21 = t[,2]~g2
dat22 = t[,2]~gest[,2]
setmaskp(dat12,4,4,8)
setmaskp(dat22,4,4,8)
show(pic,1,1,dat11,dat12)
show(pic,1,2,dat21,dat22)
The quantlet fastint estimates the univariate additive components.
It is appropriate only if the true model has additive structure, i.e.,
if the regression function is a constant plus a sum of univariate
functions of the individual covariates. Here, the marginal integration
estimator is applied and followed by a one-step backfit. For the
backfit step you can choose between the
Nadaraya-Watson, the local linear and the local
quadratic smoother. With the local linear smoother you also get estimates
of the first derivatives, with the local quadratic smoother estimates of
the first and the second derivatives. For
the integration step the fully internalized
smoother is used, see Subsection 7.1.2.
This estimation procedure is very fast compared to the
integration procedures described above, but we
recommend using it only if the number of observations
is large compared to the number of covariates and if the
true model is indeed additive. It accepts only
one-dimensional y variables.
Input parameters:
- h1: scalar or vector, the bandwidth for the pilot estimation
  (marginal integration); it is recommended to undersmooth here
- h2: scalar or vector, the bandwidth for the backfit step
- loc: scalar specifying the estimation procedure:
  - 0: Nadaraya-Watson (local constant)
  - 1: local linear
  - 2: local quadratic
Optional parameters:
- tg: matrix for estimating on a grid
The quantlet returns
- m: matrix, the estimates of the univariate
  additive components and their derivatives on t or tg, respectively