Having estimated the function $m$, it is natural to ask whether the estimate $\widehat m$ is significantly different from a parametric function obtained by a parametric GLM fit.
In the simplest case, this means to consider the hypotheses

$$H_0:\ m(t) = \gamma_0 + t^\top\gamma \qquad \textrm{versus} \qquad H_1:\ m \textrm{ is an arbitrary smooth function.}$$
We will discuss two approaches here: Hastie & Tibshirani (1990) propose to use the difference of the deviances of the linear and the semiparametric model, respectively, and to approximate the degrees of freedom in the semiparametric case. The asymptotic behavior of this method is unknown, though. Härdle, Mammen & Müller (1998) derive an asymptotic normal distribution for a slightly modified test statistic.
In the following we denote the semiparametric estimates by $\widehat\beta$ and $\widehat m$, and the parametric estimates by $\widetilde\beta$ and $\widetilde m$.
A natural approach is to compare both estimates by a likelihood ratio test statistic, i.e. by the difference of the deviances of the two models,

$$LR = D(y, \widetilde\mu) - D(y, \widehat\mu),$$

where $\widetilde\mu$ and $\widehat\mu$ denote the fitted values of the parametric and the semiparametric model, respectively. In the parametric case such a statistic is compared with the quantiles of a $\chi^2$ distribution whose degrees of freedom equal the difference in the numbers of parameters. This test statistic can be used in the semiparametric case, too. However, an approximate number of degrees of freedom needs to be defined for the GPLM. The basic idea is as follows.
Recall that $D(y, \mu)$ is the deviance of the observations $y$ and the fitted values $\mu$, see (5.19).
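To make the deviance comparison concrete, here is a minimal sketch for a binary-response model; the function name `bernoulli_deviance` and the fitted probabilities are illustrative assumptions, not taken from the text:

```python
import numpy as np

def bernoulli_deviance(y, mu, eps=1e-12):
    """Deviance D(y, mu) = -2 x log-likelihood for binary observations."""
    mu = np.clip(mu, eps, 1 - eps)
    return -2.0 * np.sum(y * np.log(mu) + (1 - y) * np.log(1 - mu))

# LR compares the deviances of the parametric and semiparametric fits;
# the fitted probabilities below are made-up values for illustration.
y = np.array([0, 1, 1, 0, 1])
mu_tilde = np.array([0.3, 0.6, 0.5, 0.4, 0.7])  # parametric (GLM) fit
mu_hat = np.array([0.2, 0.7, 0.6, 0.3, 0.8])    # semiparametric (GPLM) fit
lr = bernoulli_deviance(y, mu_tilde) - bernoulli_deviance(y, mu_hat)
```

Since the semiparametric model nests the parametric one, its deviance at the maximizer can only be smaller, so the statistic is non-negative.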
Abbreviate the estimated index by $\widehat\eta_i = X_i^\top\widehat\beta + \widehat m(T_i)$ and consider the adjusted dependent variable $z_i = \widehat\eta_i + (y_i - \widehat\mu_i)/G'(\widehat\eta_i)$. If at convergence of the iterative estimation

$$\widehat\eta = \mathcal{R}\, z \qquad (7.29)$$

with a linear operator $\mathcal{R}$, then the degrees of freedom for error can be approximated by

$$df^{err} \approx n - \mathrm{tr}(2\mathcal{R} - \mathcal{R}\mathcal{R}^\top).$$

Property (7.29) holds for backfitting and the generalized Speckman estimator with the respective smoother matrices $\mathcal{R}$.
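The error degrees of freedom approximation of Hastie & Tibshirani (1990) can be sketched for a one-dimensional Gaussian-kernel smoother; `nw_smoother_matrix` and `approx_df_error` are hypothetical helper names, not the book's code:

```python
import numpy as np

def nw_smoother_matrix(t, h):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel: row i
    contains the weights used to estimate the curve at t[i]."""
    k = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return k / k.sum(axis=1, keepdims=True)

def approx_df_error(R):
    """Approximate error degrees of freedom n - tr(2R - R R^T)."""
    n = R.shape[0]
    return n - np.trace(2 * R - R @ R.T)

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 1, 100))
df_err = approx_df_error(nw_smoother_matrix(t, h=0.2))
```

A smaller bandwidth makes the smoother rougher (closer to interpolation), so it consumes more model degrees of freedom and leaves fewer for the error.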
A direct application to the profile likelihood algorithm is not possible because of the more involved estimation of the nonparametric function $\widehat m$. However, a workable approximation can be obtained by using the linear operator of the generalized Speckman estimator instead.
The direct comparison of the semiparametric estimates and the parametric estimates can be misleading because $\widehat m$ has a non-negligible smoothing bias, even under the linearity hypothesis. Hence, the key idea is to use an estimate $\widetilde{\widetilde m}$ which introduces a comparable smoothing bias to $\widetilde m$. This estimate can be obtained by applying the updating procedure for $\widehat m$ to the parametric estimate.
Note that here the second argument of $\widehat m_h(\bullet,\bullet)$ should be the parametric estimate $\widetilde\beta$ instead of $\widehat\beta$, which means to apply the smoothing step according to (7.8) to the artificial data set consisting of the observations $(Y_i, T_i)$ and the parametric index values $X_i^\top\widetilde\beta$.
Using this ``bias-adjusted'' parametric estimate $\widetilde{\widetilde m}$, one can form the modified likelihood ratio test statistic

$$\widetilde{LR} = D(y, \widetilde{\widetilde\mu}\,) - D(y, \widehat\mu),$$

where $\widetilde{\widetilde\mu}$ denotes the fitted values obtained from $\widetilde\beta$ and $\widetilde{\widetilde m}$.
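Why the parametric estimate is run through the same smoother can be illustrated with an exactly linear function: even then, kernel smoothing shifts the curve near the boundaries. This is a sketch under illustrative assumptions (Gaussian kernel, hypothetical helper name `nw_smoother_matrix`):

```python
import numpy as np

def nw_smoother_matrix(t, h):
    """Gaussian-kernel Nadaraya-Watson smoother matrix (illustrative)."""
    k = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return k / k.sum(axis=1, keepdims=True)

t = np.linspace(0, 1, 50)
R = nw_smoother_matrix(t, h=0.2)

m_tilde = 1.0 + 2.0 * t          # parametric (linear) estimate
m_tilde_tilde = R @ m_tilde      # "bias-adjusted": carries the smoothing bias

# Even for an exactly linear function the smoothed version is shifted
# near the boundaries; comparing hat-m with R @ m_tilde instead of with
# m_tilde avoids mistaking this bias for a deviation from linearity.
boundary_shift = abs(m_tilde_tilde[0] - m_tilde[0])
```

In the interior the two curves nearly coincide; the discrepancy is concentrated at the boundaries, which is exactly the smoothing bias the adjustment mimics.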
Both test statistics $LR$ and $\widetilde{LR}$ have the same asymptotic normal distribution if the profile likelihood algorithm is used. (A $\chi^2$ approximation does not hold in this case, since kernel smoother matrices are not projection operators.)
It turns out that the normal approximation does not work well. Therefore, for the calculation of quantiles, it is recommended to use a bootstrap approximation of the quantiles of the test statistic:

- Generate bootstrap responses $y_i^*$ from the model under the linearity hypothesis, i.e. with expectation $\widetilde\mu_i = G\{X_i^\top\widetilde\beta + \widetilde m(T_i)\}$ (for binary responses this means $y_i^* \sim \mathrm{Bernoulli}(\widetilde\mu_i)$).
- Compute the test statistic $\widetilde{LR}^{\,*}$ from each bootstrap sample $\{(y_i^*, X_i, T_i)\}$.
- Use the quantiles of the empirical distribution of $\widetilde{LR}^{\,*}$ as critical values for $\widetilde{LR}$.
| Test statistic | h = 0.20 | h = 0.30 | h = 0.40 |
| (1) | 0.066 | 0.048 | 0.035 |
| (2) | 0.068 | 0.047 | 0.033 |
| (3) | 0.073 | 0.062 | 0.068 |
| (4) | 0.068 | 0.048 | 0.035 |
| (5) | 0.074 | 0.060 | 0.052 |
Table 7.3 shows the results of the application of the different test statistics for different choices of the bandwidth $h$. As we have seen in the simulations, the likelihood ratio test statistic and the modified test statistic in combination with the bootstrap give very similar results. The number of bootstrap simulations has been chosen as $n_{boot}$.
Linearity is clearly rejected (at the 10% level) for all bandwidths from $h = 0.20$ to $h = 0.40$. The different behavior of the tests for different $h$ gives some indication of how $m$ possibly deviates from a linear function. The appearance of wiggles of small length does not seem to be significant for the bootstrap (small $h$). Also, the bootstrapped $\widetilde{LR}$ still rejects for large values of $h$. This is due to the comparison of the semiparametric estimator with a bias-corrected parametric one, which makes the test less dependent on the bandwidth.
Partial linear models were first considered by Green & Yandell (1985), Denby (1986), Speckman (1988) and Robinson (1988b). For a combination with spline smoothing see also Schimek (2000a), Eubank et al. (1998) and the monograph of Green & Silverman (1994).
The extension of partial linear and additive models to generalized regression models with a link function is mainly considered in Hastie & Tibshirani (1986) and their monograph Hastie & Tibshirani (1990). They employed the observation of Nelder & Wedderburn (1972) and McCullagh & Nelder (1989) that the parametric GLM can be estimated by applying a weighted least squares estimator to the adjusted dependent variable, and modified this least squares estimator in a semi-/nonparametric way. Formal asymptotic results for the GPLM using Nadaraya-Watson type smoothing were first obtained by Severini & Wong (1992) and applied to this specific model by Severini & Staniswalis (1994). An illustration of the use of the profile likelihood and its efficiency is given by Staniswalis & Thall (2001).
The theoretical ideas for testing the GPLM using a likelihood ratio test and approximate degrees of freedom go back to Buja et al. (1989) and Hastie & Tibshirani (1990). The bootstrap procedure for comparing parametric versus nonparametric functions was formally discussed in Härdle & Mammen (1993). The theoretical results of Härdle, Mammen & Müller (1998) have been empirically analyzed by Müller (2001).