7.3 Noninteractive Quantlets for Estimation


m = intest (t, y, h, g, loc{, opt})
estimates an additive model (AM)
{m, b, const} = intestpl (x, t, y, h, g, loc{, opt})
estimates an additive partially linear model (APLM)
{m, b, const} = backfit (t, y, h, loc, kern{, opt})
estimates an additive and additive partially linear model
m = gintest (code, t, y, h, g, loc{, opt})
estimates a generalized additive model (GAM)
{m, b, bv, const} = gintestpl (code, x, t, y, h, g{, opt})
estimates a generalized additive partially linear model (GAPLM)
m = intest2d (t, y, h, g, loc{, opt})
estimates a bivariate marginal influence
{fh, c} = interact (t, y, h, g, loc, incl{, tg})
estimates an additive model with interaction terms
m = fastint (t, y, h1, h2, loc{, tg})
estimates an additive model using marginal integration
The list above contains all noninteractive quantlets for estimation. Their use is described in the following subsections.


7.3.1 Estimating an Additive Model


m = intest (t, y, h, g, loc{, opt})
estimates an additive model (AM)

  library("gam")
  randomize(1234)
  t     = uniform(50,2)*2-1
  g1    = 2*t[,1]
  g2    = t[,2]^2
  g2    = g2 - mean(g2)
  y     = g1 + g2  + normal(50,1) * sqrt(0.25)
  h     = #(1.2, 1.0)
  g     = #(1.4, 1.2)
  loc   = 1
  gest  = intest(t,y,h,g,loc)
  gest
  bild  = createdisplay(1,2)
  dat11 = t[,1]~g1
  dat12 = t[,1]~gest[,1]
  dat21 = t[,2]~g2
  dat22 = t[,2]~gest[,2]
  setmaskp(dat12,4,4,8)
  setmaskp(dat22,4,4,8)
  show(bild,1,1,dat11,dat12)
  show(bild,1,2,dat21,dat22)
XAGgam02.xpl

The quantlet intest estimates the univariate additive functions, and their derivatives, of a separable additive model using Nadaraya-Watson, local linear, or local quadratic estimation.

Input parameters:

h
$ p'\times 1$ bandwidth vector for the directions of interest (see remarks). Admissible are $ p'=p$, $ p'=pg$, or $ p'=1$, the last giving the same bandwidth in all directions.
g
$ p \times 1$ bandwidth vector for the directions not of interest
loc
scalar specifying the estimation procedure:
0
--Nadaraya-Watson (local constant)
1
--local linear
2
--local quadratic

Optional parameters (see Section 7.5):

opt.tg
$ ng \times pg$ matrix, a grid for the continuous part. If tg is given, the nonparametric function will be computed on this grid.
opt.shf
scalar (show-how-far). If it exists and equals one, progress output is produced, showing the additive function, the point of estimation, and the iteration number currently being processed.

Output value:

m
$ n(ng) \times p'\cdot(loc+1)\times q$ matrix containing the marginal integration estimates in the first $ p'$ columns, followed by the first derivatives (local linear) or the first and second derivatives (local quadratic)

Remarks: The grid may have fewer dimensions ($ p'$) than the explanatory data. The estimation is then run on the first $ p'$ directions of interest only. Consequently, it suffices to specify the bandwidth vector $ h$ for the directions of interest.
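
The idea behind the estimates returned by intest can be illustrated outside XploRe. The following Python sketch is not the implementation of intest. It is a minimal illustration of marginal integration with a Nadaraya-Watson (local constant) pilot smoother for two explanatory variables, assuming a product quartic kernel. The function names (quartic, nw_2d, marginal_integration) are hypothetical. As in the parameter list above, h is used in the direction of interest and g in the other direction.

```python
import random


def quartic(u):
    """Quartic kernel on [-1, 1] (an assumed kernel choice)."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def nw_2d(t, y, point, bw):
    """Nadaraya-Watson estimate of E[y | t = point] with a product kernel."""
    num = den = 0.0
    for (a, b), yi in zip(t, y):
        w = quartic((a - point[0]) / bw[0]) * quartic((b - point[1]) / bw[1])
        num += w * yi
        den += w
    return num / den if den > 0.0 else float("nan")


def marginal_integration(t, y, h, g, direction):
    """Centered estimate of the additive component in `direction` (0 or 1),
    evaluated at the observations."""
    other = 1 - direction
    bw = [0.0, 0.0]
    bw[direction] = h[direction]   # bandwidth for the direction of interest
    bw[other] = g[other]           # bandwidth for the direction not of interest
    est = []
    for ti in t:
        total = 0.0
        for tk in t:
            # fix the direction of interest, average over the observed
            # values of the other direction
            point = [0.0, 0.0]
            point[direction] = ti[direction]
            point[other] = tk[other]
            total += nw_2d(t, y, point, bw)
        est.append(total / len(t))
    mean = sum(est) / len(est)
    return [e - mean for e in est]


if __name__ == "__main__":
    random.seed(1234)
    t = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
    y = [2 * a + b * b + random.gauss(0, 0.5) for a, b in t]
    g1_hat = marginal_integration(t, y, [1.2, 1.0], [1.4, 1.2], 0)
    print(g1_hat[:5])
```

The triple loop costs on the order of $ n^3$ kernel evaluations per direction, which is acceptable for a sample of 50 points as in the example but explains why marginal integration is slow for large samples.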


7.3.2 Estimating an Additive Partially Linear Model


{m, b, const} = intestpl (x, t, y, h, g, loc{, opt})
estimates an additive partially linear model (APLM)

  library("gam")
  randomize(1345)
  loc= 2
  x = matrix(50,2)
  t = uniform(50,2)*2-1
  xh = uniform(50,2)
  x[,1]= 3*(xh>=0.8)+2*((0.8>xh)&&(xh>=0.3))+(0.3>xh)
  x[,2]= (xh>(1/3))
  g1    = 2*t[,1]
  g2    = (2*t[,2])^2
  g2    = g2 -mean(g2)
  m     = g1 + g2 + x*(0.2|-1.0)
  y     = m + normal(50,1)*0.25
  h     = #(1.4, 1.4)
  g     = #(1.4, 1.4)
  {m,b,const} = intestpl(x,t,y,h,g,loc)
  b
  const
  bild =createdisplay(1,2)
  dat11= t[,1]~g1
  dat12= t[,1]~m[,1]
  setmaskp(dat12,4,4,8)
  show(bild,1,1,dat11,dat12)
  dat21= t[,2]~g2
  dat22= t[,2]~m[,2]
  setmaskp(dat22,4,4,8)
  show(bild,1,2,dat21,dat22)
XAGgam03.xpl

The quantlet intestpl estimates the univariate additive functions and their derivatives in an additive partially linear model (APLM) using local linear or local quadratic estimation.

Input parameters:

h
$ p \times 1$ vector or a scalar, the bandwidth for the directions of interest
g
$ p \times 1$ vector or a scalar, the bandwidth for the directions not of interest
loc
scalar indicating the estimation procedure:
1
--local linear
2
--local quadratic

Optional parameters (see Section 7.5):

opt.tg
$ ng \times pg$ matrix, a grid for continuous part. If tg is given, the nonparametric function will be computed on this grid.
opt.shf
scalar; if shf=1, the progress of the estimation is displayed (default: shf=0)

Output values:

m
$ ng\times p\cdot(loc+1)\times q$ matrix, the marginal integration estimates in the first $ p$ columns, followed by the first derivatives (local linear) or the first and second derivatives (local quadratic)
b
$ d \times 1$ vector, the coefficients of the linear part
const
scalar, the constant in the additive partially linear model


7.3.3 Estimating Additive and Additive Partially Linear Models


{m, b, const} = backfit (t, y, h, loc, kern{, opt})
estimates an additive and additive partially linear model

  library("gam")
  randomize(1)
  n   = 100
  t   = normal(n,2)             ; explanatory variable
  x   = normal(n,2)             ; the linear part
  f1  = - sin(2*t[,1])          ; true functions (to be estimated)
  f2  = t[,2]^2
  eps = normal(n,1) * sqrt(0.75)
  y   = x[,1] - x[,2]/4 + f1 + f2 +eps      ; response variable
  h   = 0.5
  opt = gamopt("x",x,"shf",1)   ; the linear part is used
  ;                               and the iterations will be shown
  {m,b,const} = backfit(t,y,h,0,"qua",opt)
  ;
  b                             ; coefficients for the linear part
  ;                               ([1, -1/4] were used)
  const                         ; estimation of the constant 
  ;
  pic = createdisplay(1,2)      ; preparing the graphical output
  d1  = t[,1]~m[,1]
  d2  = t[,2]~m[,2]
  setmaskp(d1,4,4,4)
  setmaskp(d2,4,4,4)
  m1  = mean(f1)
  m2  = mean(f2)
  yy  = y - x*b - const
  x1  = t[,1]~(yy - m[,2])
  x2  = t[,2]~(yy - m[,1])
  setmaskp(x1,1,11,4)
  setmaskp(x2,1,11,4) 
  setmaskl(d1,(sort(d1~(1:rows(d1)))[,3])',4,1,1)
  setmaskl(d2,(sort(d2~(1:rows(d2)))[,3])',4,1,1)
  show(pic,1,1,d1,x1,t[,1]~(f1-m1))
  show(pic,1,2,d2,x2,t[,2]~(f2-m2))
XAGgam04.xpl

The quantlet backfit estimates the univariate additive functions and their derivatives in an additive model (AM) or an additive partially linear model (APLM) using the backfitting algorithm. It accepts only one-dimensional response variables y.

Input parameters:

h
$ p \times 1$ vector or a scalar, the bandwidth
loc
scalar indicating the estimation procedure:
0
--Nadaraya-Watson (local constant)
1
--local linear
2
--local quadratic
kern
string indicating the kernel function
"qua"
quartic kernel
"epa"
Epanechnikov kernel
"gau"
Gaussian kernel

Optional parameters:

opt.x
$ n\times d$ matrix, the explanatory variables for the linear part (at least the discrete variables)
opt.shf
shf=1 to show how the iteration is going on (default: shf=0)
opt.miter
scalar, the maximal number of iterations (default: miter=50)
opt.cnv
scalar, the convergence criterion (default: cnv=$ 0.000001$)

The quantlet returns

m
$ n\times p\cdot(loc+1)$ matrix, the estimate of the additive functions in column 1 to $ p$, the first derivatives in column $ (p+1)$ to $ 2p$ and the second derivatives in column $ (2p+1)$ to $ 3p$
b
$ d \times 1$ vector, the coefficients of the linear part
const
scalar, the estimate of the constant in the model
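
The column layout of m described above can be made explicit with a small helper. The following Python sketch only restates that layout (functions first, then first derivatives, then second derivatives). The function name column_blocks is hypothetical and not part of XploRe.

```python
def column_blocks(p, loc):
    """Return the 1-based (first, last) column ranges of m for p additive
    components and estimation procedure loc (0, 1 or 2)."""
    names = ["function", "first derivative", "second derivative"]
    return {names[k]: (k * p + 1, (k + 1) * p) for k in range(loc + 1)}


if __name__ == "__main__":
    # local quadratic fit (loc=2) with p=2 additive components
    print(column_blocks(2, 2))
```

For p=2 and loc=2 the helper reports columns 1 to 2 for the functions, 3 to 4 for the first derivatives, and 5 to 6 for the second derivatives.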

The example for this quantlet ( XAGgam04.xpl ) produces the following graphical output:

\includegraphics[scale=0.55]{gam_backfit}
The plot shows the original data (crosses), the exact values of the functions being estimated (circles), and their estimates (small triangles connected by lines).
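
The backfitting idea itself can be sketched in a few lines. The following Python code is not the implementation of backfit. It is a minimal illustration for a purely additive model, assuming a Nadaraya-Watson smoother with a quartic kernel, a single scalar bandwidth, and a stopping rule based on the largest change of a component. The names nw_smooth and backfit_sketch are hypothetical, and the defaults for miter and cnv merely mirror the option list above.

```python
import math
import random


def quartic(u):
    """Quartic kernel on [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def nw_smooth(x, r, h):
    """Nadaraya-Watson smooth of r on x, evaluated at the observations."""
    fitted = []
    for x0 in x:
        w = [quartic((xi - x0) / h) for xi in x]
        # sum(w) > 0 because the observation x0 itself has positive weight
        fitted.append(sum(wi * ri for wi, ri in zip(w, r)) / sum(w))
    return fitted


def backfit_sketch(t, y, h, miter=50, cnv=1e-6):
    """Return (m, const), where m[j][i] is component j at observation i."""
    n, p = len(y), len(t[0])
    const = sum(y) / n
    m = [[0.0] * n for _ in range(p)]
    for _ in range(miter):
        change = 0.0
        for j in range(p):
            # partial residuals: remove the constant and all other components
            r = [y[i] - const - sum(m[k][i] for k in range(p) if k != j)
                 for i in range(n)]
            new = nw_smooth([row[j] for row in t], r, h)
            mean = sum(new) / n
            new = [v - mean for v in new]      # center each component
            change = max(change, max(abs(a - b) for a, b in zip(new, m[j])))
            m[j] = new
        if change < cnv:                       # assumed convergence rule
            break
    return m, const


if __name__ == "__main__":
    random.seed(1)
    t = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(100)]
    y = [-math.sin(2 * a) + b * b + random.gauss(0, math.sqrt(0.75))
         for a, b in t]
    m, const = backfit_sketch(t, y, 0.5)
    print(const, m[0][:3], m[1][:3])
```

Each pass smooths the partial residuals of one component against its own explanatory variable while holding the others fixed. This is why the procedure needs only univariate smoothers.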


7.3.4 Estimating a Generalized Additive Model


m = gintest (code, t, y, h, g, loc{, opt})
estimates a generalized additive model (GAM)

  library("gam")
  randomize(1235)
  n     = 100
  p     = 2
  t     = uniform(n,p)*2-1
  g1    = 2*t[,1]
  g2    = t[,2]^2
  g2    = g2 - mean(g2)
  m     = g1 + g2
  y     = cdfn(m) .> uniform(n)    ; probit model
  h     = #(1.7, 1.5)
  g     = #(1.7, 1.5)
  tg    = grid(-0.8,0.1,19)
  opt   = gamopt("tg",tg~tg,"shf",1)
  loc   = 1
  code  = "bipro"
  m     = gintest(code,t,y,h,g,loc,opt)
  d1    = tg[,1]~m[,1]
  d2    = tg[,2]~m[,2]
  setmaskp(d1,4,4,8)
  setmaskp(d2,4,4,8)
  bild  = createdisplay(1,2)
  show(bild,1,1,d1,t[,1]~g1)
  show(bild,1,2,d2,t[,2]~g2)
XAGgam05.xpl

The quantlet gintest estimates the univariate additive functions and their derivatives in a generalized additive model (GAM) using Nadaraya-Watson, local linear, or local quadratic estimation.

Input parameters:

code
string, specifies the distribution of y and the link function. Currently implemented codes are:
"bilo" 
binomial with logistic link (logit)
"bipro"
binomial with normal distribution link (probit)
"noid" 
normal with canonical (identity) link
h
$ p'\times 1$ bandwidth vector for the directions of interest (see remarks). It can be $ p'=p$, $ p'=pg$ or $ p'=1$ for the same bandwidth in all directions.
g
$ p \times 1$ bandwidth vector for the directions not of interest
loc
scalar specifying the estimation procedure:
0
--Nadaraya-Watson (local constant)
1
--local linear
2
--local quadratic

Optional parameters:

opt.tg
$ ng \times pg$ matrix, a grid for the continuous part (see remarks)
opt.shf
scalar; if shf=1, the progress of the estimation is displayed (default: shf=0)

The quantlet returns

m
$ ng \times p'\cdot (loc+1) \times q$ matrix, the marginal integration estimates in the first $ p'$ columns, followed by the first derivatives (local linear) or the first and second derivatives (local quadratic)

Remarks: The grid may have fewer dimensions ($ p'$) than the explanatory data. The estimation is then run on the first $ p'$ directions of interest only. Consequently, you need to specify the bandwidth vector h for the directions of interest only.
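
To make the meaning of the three codes concrete, the following Python sketch maps each code to the corresponding inverse link, i.e., the function that turns the additive index into the conditional mean of y. It is an illustration only. The function name inverse_link is hypothetical and does not exist in XploRe.

```python
import math


def inverse_link(code, eta):
    """Map the additive index eta to the conditional mean of y."""
    if code == "bilo":      # binomial, logistic link (logit)
        return 1.0 / (1.0 + math.exp(-eta))
    if code == "bipro":     # binomial, normal distribution link (probit)
        return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    if code == "noid":      # normal, identity link
        return eta
    raise ValueError("unknown code: " + code)


if __name__ == "__main__":
    for code in ("bilo", "bipro", "noid"):
        print(code, inverse_link(code, 0.5))
```

In the probit example above, the response is generated by comparing cdfn(m) with uniform random numbers, which corresponds to the "bipro" branch.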


7.3.5 Estimating a Generalized Additive Partially Linear Model


{m, b, bv, const} = gintestpl (code, x, t, y, h, g{, opt})
estimates a generalized additive partially linear model (GAPLM)

  library("gam")
  randomize(1235)
  n     = 100
  p     = 2
  d     = 2
  b     = 1|2
  t     = uniform(n,p)*2-1
  x     = 2.*uniform(n,d)-1
  g1    = 2*t[,1]
  g2    = t[,2]^2
  g2    = g2 - mean(g2)
  m     = g1 + g2
  y     = cdfn(m+x*b) .> uniform(n)    ; probit model
  h     = #(1.7, 1.5)
  g     = #(1.7, 1.5)
  tg    = grid(-0.8,0.1,18)
  opt   = gamopt("tg",tg~tg)
  opt   = gamopt("shf",1,opt)
  code  = "bipro"
  {m,b,bv,c} = gintestpl(code,x,t,y,h,g,opt)
  gamout(t,y,m,b,c,gamopt("pl",1,"x",x,"bv",bv,opt))
XAGgam06.xpl

The quantlet gintestpl estimates the univariate additive functions in a generalized additive partially linear model (GAPLM) using a Newton-Raphson or Fisher scoring algorithm.

Input parameters:

code
string specifying the distribution of y and the link function. It accepts only one-dimensional y. Currently implemented codes are:
binomial
"bilo" 
binomial with logistic link (logit)
"bipro"
binomial with normal distribution link (probit)
"bicll"
binomial with complementary log-log link
normal
"noid" 
normal with canonical=identity link
"nopow"
normal with power (inverse) link
gamma
"gacl" 
gamma with canonical=reciprocal (inverse) link
"gapow"
gamma with power (inverse) link
inverse gaussian
"igcl" 
inverse gaussian with canonical=squared reciprocal (inverse) link
"igpow"
inverse gaussian with power (inverse) link
negative binomial
"nbcl" 
negative binomial with canonical (inverse) link
"nbpow"
negative binomial with power (inverse) link
h
$ p \times 1$ bandwidth vector for the directions of interest
g
$ p \times 1$ bandwidth vector for the directions not of interest

Optional parameters:

opt.tg
$ ng \times p$ matrix to estimate on a grid
opt.shf
scalar; if shf=1, the progress of the estimation is displayed (default: shf=0)
opt.b0
$ d \times 1$ vector to provide initial coefficients for the linear part (default: GLM pre-estimation)
opt.nosort
nosort=1 indicates that t is already sorted by its first column (default: nosort=0). Sorting is required by the algorithm, hence you should switch it off only when the data are already sorted.
opt.miter
maximal number of iterations (default: miter=10)
opt.cnv
scalar to determine the convergence criterion (default: cnv=0.0001)
opt.fscor
fscor=1 to switch to the Fisher-Scoring algorithm (default: Newton-Raphson). This parameter is ignored for canonical links.
opt.wx
scalar or $ n \times 1 $ vector to make use of prior weights. For binomial models usually the binomial index vector (default: 1).
opt.wt
$ n \times 1 $ vector, weights for t (trimming factors) (default: all components set to 1)
opt.wtc
$ n \times 1 $ vector to apply weights for the convergence criterion, w.r.t. $ m(t)$ (default: wt is used)
opt.off
scalar or $ n \times 1 $ vector, offset, can be used for constrained estimation (default: off=0)
opt.pow
scalar, power for power link (default: pow=0)
opt.nbk
scalar, extra parameter k for negative binomial distribution (default: nbk=1, i.e., the geometric distribution)

The quantlet returns

m
$ ng \times p$ matrix, the marginal integration estimates
b
$ d \times 1$ vector, the coefficients of the linear part
bv
$ d\times d$ covariance matrix for the estimated coefficients
const
constant of the model
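
The role of the options miter, cnv, and fscor can be illustrated with a plain parametric example. The following Python sketch is not the algorithm of gintestpl. It runs Fisher scoring for a one-parameter probit model without nonparametric part, and the stopping rule on the step size is an assumption. The function names are hypothetical. For the canonical logit link, Fisher scoring and Newton-Raphson coincide, which is consistent with fscor being ignored for canonical links. The probit link is not canonical, so the two algorithms differ there.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_score_info(b, x, y):
    """Score and expected (Fisher) information of the model P(y=1|x) = Phi(b*x)."""
    score = info = 0.0
    for xi, yi in zip(x, y):
        eta = b * xi
        mu = norm_cdf(eta)
        dens = norm_pdf(eta)
        var = mu * (1.0 - mu)
        score += xi * (yi - mu) * dens / var
        info += xi * xi * dens * dens / var
    return score, info


def fisher_scoring(x, y, b0=0.0, miter=10, cnv=1e-4):
    """Iterate b <- b + score/info until the step is below cnv (assumed rule)
    or miter iterations are reached. Returns (b, number of iterations)."""
    b = b0
    iterations = 0
    for iterations in range(1, miter + 1):
        score, info = probit_score_info(b, x, y)
        step = score / info
        b += step
        if abs(step) < cnv:
            break
    return b, iterations


if __name__ == "__main__":
    x = [-2.0, -1.0, -1.0, 0.5, 1.0, 2.0, -0.5, 0.3]
    y = [0, 0, 1, 0, 1, 1, 1, 0]
    print(fisher_scoring(x, y))
```

A starting value corresponds to opt.b0 in the list above. Here it is simply zero, whereas gintestpl uses a GLM pre-estimation by default.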


7.3.6 Estimating Bivariate Marginal Influence


m = intest2d (t, y, h, g, loc{, opt})
estimates a bivariate marginal influence


  library("gam")
  randomize(12345)
  t     = grid(#(-0.9,-0.9),#(0.2,0.2),#(10,10))
  n     = rows(t)
  t     = t~(uniform(n)*2-1)
  g3    = sin(2*t[,3])
  g12   = t[,1].*t[,2]^2
  y     = g3 + g12 + normal(n)*sqrt(0.5)
  h     = #(1.0, 1.0)
  g     = #(1.1, 1.1, 1.2)
  loc   = 1
  gest  = intest2d(t,y,h,g,loc)
  library("graphic")
  pic  = createdisplay(1,2)
  dat11 = grsurface(t[,1:2]~g12)
  dat12 = grsurface(t[,1:2]~gest[,1])
  gc = grcube( dat11|dat12 )
  show(pic,1,1,dat11,gc.box,gc.x,gc.y,gc.z,gc.c)
  show(pic,1,2,dat12,gc.box,gc.x,gc.y,gc.z,gc.c)
  setheadline(pic, 1, 1, "Original function")
  setheadline(pic, 1, 2, "Estimated function")
XAGgam07.xpl

The quantlet intest2d estimates the bivariate marginal influence function of the explanatory variables $ t_{1}$ and $ t_{2}$. You can choose the Nadaraya-Watson, the local linear, or the local quadratic kernel smoother. With the local linear smoother, the program also returns the first derivative functions for both directions; with the local quadratic smoother, you get the mixed derivative function. This quantlet can be used, for example, to explore the joint influence of two arbitrary explanatory variables in a multidimensional regression problem.

Input parameters:

h
scalar or $ 2\times 1$ vector, the bandwidth for the directions of interest
g
scalar or $ p \times 1$ vector, the bandwidth for the directions not of interest
loc
scalar specifying the estimation procedure:
0
--Nadaraya-Watson (local constant)
1
--local linear
2
--local quadratic

Optionally it is possible to use:

opt.tg
$ ng \times 2$ matrix for estimating on a grid
opt.shf
shf=1 to show how the process is going on (default: shf=0)

The quantlet returns

m
$ ng \times p' \times g$ matrix, the bivariate marginal integration estimate in the first column, the derivatives in the following columns

The example for this quantlet ( XAGgam07.xpl ) gives the following picture:

\includegraphics[scale=0.55]{gam_intest2d}

The original function is displayed on the left side, its estimate on the right side.


7.3.7 Estimating an Additive Model with Interaction Terms


{fh, c} = interact (t, y, h, g, loc, incl{, tg})
estimates an additive model with interaction terms

  library("gam")
  randomize(12345)
  t     = grid(#(-0.9,-0.9),#(0.2,0.2),#(10,10))
  n     = rows(t)
  t     = t~(uniform(n)*2-1)
  g1    = 2*t[,1]
  g2    = t[,2]^2 - mean(t[,2]^2)
  g3    = sin(3*t[,3])
  g12   = t[,1].*t[,2]
  y     = g1+g2+g3+g12+normal(n)*sqrt(0.5)
  h     = #(0.9, 0.9, 0.9)
  g     = #(1.0, 1.0, 1.0)
  incl  = 1~2
  f     = interact(t,y,h,g,1,incl)
  library("graphic")
  pic   = createdisplay(2,2)
  dat11 = sort(t[,2]~g2)
  datf1 = sort(t[,2]~f.fh[,2])
  dat12 = sort(t[,3]~g3)
  datf2 = sort(t[,3]~f.fh[,3])
  setmaskp(dat11,1,3,8)
  setmaskp(dat12,1,3,8)
  setmaskp(datf1,4,3,8)
  setmaskp(datf2,4,3,8)
  setmaskl(datf1,(1:rows(datf1))',4,1,1)
  setmaskl(datf2,(1:rows(datf2))',4,1,1)
  show(pic,1,1,dat11,datf1)
  show(pic,1,2,dat12,datf2)
  dat21 = grsurface(t[,1:2]~g12)
  dat22 = grsurface(t[,1:2]~f.fh[,4])
  gc = grcube( dat21|dat22 )
  show(pic,2,1,dat21,gc.box,gc.x,gc.y,gc.z,gc.c)
  show(pic,2,2,dat22,gc.box,gc.x,gc.y,gc.z,gc.c)
XAGgam08.xpl

The quantlet interact estimates the univariate functions, the bivariate interaction terms requested by the user, and the constant of the model, i.e., all functions $ f_j$ and $ f_{jk}$ of the model $ m = c+f_1+\ldots +f_d+f_{12}+\ldots+f_{(d-1)d}$; see also Subsection 7.1.2. Again the marginal integration estimator is used, and you can choose between the Nadaraya-Watson, the local linear, and the local quadratic smoother.

Input parameters:

h
scalar or $ p \times 1$ vector, the bandwidth for the directions of interest
g
scalar or $ p \times 1$ vector, the bandwidth for the directions not of interest
loc
scalar specifying the estimation procedure:
0
--Nadaraya-Watson (local constant)
1
--local linear
2
--local quadratic
incl
$ pp \times 2$ matrix giving all pairs of indices $ j,k$ for which $ f_{jk}$ shall be included

Optional parameters:

tg
$ ng \times p$ matrix to estimate on a grid (see remarks)

The quantlet returns

fh
$ ng \times (p+pp)$ matrix, the marginal integration estimates of the univariate functions and the chosen interaction terms
c
scalar, the constant of the model

The example XAGgam08.xpl gives the following picture:

\includegraphics[scale=0.55]{gam_interact}
The upper plots display the second and third additive components, with the original functions in black and their estimates in blue. The lower plots display the original interaction term on the left and its estimate on the right.

Remarks: Note that interact accepts only one-dimensional y. If you choose a grid tg, the interaction functions can only be estimated up to a constant shift.
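
For concreteness, the model fitted in the example of this subsection can be written out. With three explanatory variables and incl = 1~2, only the interaction between the first and the second direction is included. The notation follows the model formula given in this subsection, and the component functions are read off the example code. That the interaction estimate sits in column 4 of fh is likewise taken from the example (f.fh[,4]).

```latex
m(t_1,t_2,t_3) = c + f_1(t_1) + f_2(t_2) + f_3(t_3) + f_{12}(t_1,t_2),
\qquad
f_1(t_1) = 2\,t_1,\quad
f_2(t_2) = t_2^2 - \overline{t_2^2},\quad
f_3(t_3) = \sin(3\,t_3),\quad
f_{12}(t_1,t_2) = t_1 t_2 .
```

Here $ \overline{t_2^2}$ denotes the sample mean of $ t_2^2$, as computed by mean(t[,2]^2) in the example.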


7.3.8 Estimating an Additive Model Using Marginal Integration


m = fastint (t, y, h1, h2, loc{, tg})
estimates an additive model using marginal integration

  library("gam")
  randomize(1234)
  n = 100
  d = 2
  ;               generate a correlated design:
  var = 1.0
  cov = 0.4  *(matrix(d,d)-unit(d)) + unit(d)*var
  {eval, evec} = eigsm(cov)
  t = normal(n,d)
  t = t*((evec.*sqrt(eval)')*evec')
  g1    = 2*t[,1]
  g2    = t[,2]^2 -mean(t[,2]^2)
  y     = g1 + g2  + normal(n,1) * sqrt(0.5)
  h1    = 0.5
  h2    = 0.7
  loc   = 0
  gest  = fastint(t,y,h1,h2,loc)
  library("graphic")
  pic   = createdisplay(1,2)
  dat11 = t[,1]~g1
  dat12 = t[,1]~gest[,1]
  dat21 = t[,2]~g2
  dat22 = t[,2]~gest[,2]
  setmaskp(dat12,4,4,8)
  setmaskp(dat22,4,4,8)
  show(pic,1,2,dat11,dat12)
  show(pic,1,1,dat21,dat22)
XAGgam09.xpl

The quantlet fastint estimates the univariate additive components $ f_j$ under the assumption that the true model has additive structure, i.e., that the underlying model is $ m = c+f_1+\ldots+f_d$. Here, the marginal integration estimator is applied, followed by a one-step backfit. For the backfit step you can choose between the Nadaraya-Watson, the local linear, and the local quadratic smoother. With the local linear or the local quadratic smoother you consequently also get estimates for the first, or for the first and the second, derivatives. For the integration step the fully internalized smoother is used; see Subsection 7.1.2.

This estimation procedure is very fast compared to the integration procedures described above, but we recommend using it only if the number of observations is large compared to the number of covariates and if the true model is indeed additive. It also accepts higher-dimensional y variables.


Input parameters:

h1
scalar or a $ p \times 1$ vector, the bandwidth for the pilot estimation (marginal integration); it is recommended to undersmooth here
h2
scalar or a $ p \times 1$ vector, the bandwidth for the backfit step
loc
scalar specifying the estimation procedure:
0
--Nadaraya-Watson (local constant)
1
--local linear
2
--local quadratic

Optionally it is possible to use:

tg
$ ng \times pg$ matrix for estimating on a grid

The quantlet returns

m
$ ng \times (p+pp)$ matrix, the estimates of the univariate additive components and their derivatives on t or tg, respectively