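Partially linear EIV models relate a response $Y$ to predictors $(X, T)$ with mean function $X^\top\beta + g(T)$, where the regressors $X$ are measured with additive errors, that is,

$$
\begin{aligned}
Y &= X^\top\beta + g(T) + \varepsilon,\\
W &= X + U,
\end{aligned}
\tag{3.8}
$$

where $W$ is the observed version of $X$ and $U$ is the measurement error.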
Here, we only introduce the conclusions. The related proofs and discussions can be found in Liang, Härdle, and Carroll (1999).
In EIV linear regression, inconsistency caused by the measurement error can be
overcome by applying the so-called correction for attenuation.
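In our context, this suggests that we use the estimator

$$\hat\beta_n = (\tilde W^\top \tilde W - n\Sigma_{uu})^{-1}\,\tilde W^\top \tilde Y,$$

where $\Sigma_{uu}$ is the covariance matrix of the measurement error, and $\tilde W$ and $\tilde Y$ denote $W$ and $Y$ after their nonparametric regression estimates on $T$ have been subtracted.
In some cases, we assume that the model errors $\varepsilon_i$ are homoscedastic with common variance $\sigma^2$.
In this event, since $E\{Y_i - X_i^\top\beta - g(T_i)\}^2 = \sigma^2$ and $E\{Y_i - W_i^\top\beta - g(T_i)\}^2 = \sigma^2 + \beta^\top\Sigma_{uu}\beta$, we define

$$\hat\sigma_n^2 = n^{-1}\sum_{i=1}^n (\tilde Y_i - \tilde W_i^\top\hat\beta_n)^2 - \hat\beta_n^\top\Sigma_{uu}\hat\beta_n$$

as an estimator of $\sigma^2$.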
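The correction for attenuation can be illustrated with a minimal Python sketch for a single covariate. This is not the eivplmnor implementation: the Gaussian-kernel Nadaraya-Watson smoother, the function names `nw_smooth` and `eiv_plm_scalar`, the simulated data, and the values `sigma_uu = 0.09` and `h = 0.2` are all assumptions made for illustration.

```python
import math
import random


def nw_smooth(t, v, h):
    """Nadaraya-Watson estimate of E(v | t) at each t[i] (Gaussian kernel)."""
    fitted = []
    for ti in t:
        weights = [math.exp(-0.5 * ((ti - tj) / h) ** 2) for tj in t]
        total = sum(weights)
        fitted.append(sum(kw * vj for kw, vj in zip(weights, v)) / total)
    return fitted


def eiv_plm_scalar(w, t, y, sigma_uu, h):
    """Naive and attenuation-corrected slope for a scalar covariate."""
    n = len(y)
    w_fit = nw_smooth(t, w, h)
    y_fit = nw_smooth(t, y, h)
    # Residuals after removing the smooth dependence on t.
    w_res = [wi - fi for wi, fi in zip(w, w_fit)]
    y_res = [yi - fi for yi, fi in zip(y, y_fit)]
    sww = sum(r * r for r in w_res)
    swy = sum(rw * ry for rw, ry in zip(w_res, y_res))
    beta_naive = swy / sww                    # ignores measurement error
    beta = swy / (sww - n * sigma_uu)         # correction for attenuation
    rss = sum((ry - rw * beta) ** 2 for rw, ry in zip(w_res, y_res)) / n
    sigma2 = rss - beta * sigma_uu * beta     # error variance estimate
    g_hat = [yf - wf * beta for yf, wf in zip(y_fit, w_fit)]
    return beta, beta_naive, sigma2, g_hat


# Hypothetical simulated data: true slope 2, zero-mean normal measurement error.
random.seed(0)
n = 400
sigma_uu = 0.09
t = sorted(random.uniform(-1.0, 1.0) for _ in range(n))
x = [random.uniform(-1.0, 1.0) for _ in range(n)]                # latent
w = [xi + random.gauss(0.0, math.sqrt(sigma_uu)) for xi in x]    # observed
g = [0.5 * math.cos(math.pi * ti) + 0.5 * ti for ti in t]
y = [2.0 * xi + gi + random.gauss(0.0, 0.5) for xi, gi in zip(x, g)]

beta, beta_naive, sigma2, g_hat = eiv_plm_scalar(w, t, y, sigma_uu, h=0.2)
print("naive slope:    ", beta_naive)
print("corrected slope:", beta)
print("error variance: ", sigma2)
```

Subtracting $n\Sigma_{uu}$ from the residual sum of squares of $W$ shrinks the denominator, so the corrected slope is larger in absolute value than the naive least squares slope, which is biased toward zero.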
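The technique of partial replication is adopted when the measurement error covariance $\Sigma_{uu}$ is unknown and must be estimated. That is, we observe $W_{ij} = X_i + U_{ij}$, $j = 1, \ldots, m_i$.
We consider here only the usual case that $m_i \le 2$, and assume that a fraction $\delta$ of the data has such replicates.
Let $\bar W_i$ be the sample mean of the replicates.
Then a consistent, unbiased method of moments estimate for $\Sigma_{uu}$ is

$$\hat\Sigma_{uu} = \frac{\sum_{i=1}^n\sum_{j=1}^{m_i}(W_{ij}-\bar W_i)(W_{ij}-\bar W_i)^\top}{\sum_{i=1}^n (m_i - 1)}.$$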
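The method of moments estimate pools the within-subject scatter of the replicates and divides by the total degrees of freedom. A minimal Python sketch follows; the function name `moment_sigma_uu` and the toy data are hypothetical, and subjects with a single measurement are skipped because they carry no information about the error variance.

```python
def moment_sigma_uu(replicates):
    """Method-of-moments estimate of the measurement error covariance.

    replicates[i] is a list of the m_i replicate vectors for subject i,
    each a list of p floats. Subjects with m_i = 1 contribute nothing.
    """
    p = len(replicates[0][0])
    acc = [[0.0] * p for _ in range(p)]
    dof = 0
    for reps in replicates:
        m = len(reps)
        if m < 2:
            continue
        mean = [sum(r[k] for r in reps) / m for k in range(p)]
        for r in reps:
            d = [r[k] - mean[k] for k in range(p)]
            for a in range(p):
                for b in range(p):
                    acc[a][b] += d[a] * d[b]
        dof += m - 1
    if dof == 0:
        raise ValueError("at least one subject needs two or more replicates")
    return [[acc[a][b] / dof for b in range(p)] for a in range(p)]


# Toy data (p = 1): subjects 1 and 3 have two replicates, subject 2 has one.
toy = [[[1.0], [3.0]], [[2.0]], [[0.0], [1.0]]]
sigma_uu_hat = moment_sigma_uu(toy)
print(sigma_uu_hat)  # → [[1.25]]
```

In the toy data the pooled squared deviations are 2.0 and 0.5, and the two replicated subjects contribute one degree of freedom each, which gives 2.5 / 2 = 1.25.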
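The estimator (3.11) is asymptotically normal; the explicit form of its limiting covariance matrix is given in Liang, Härdle, and Carroll (1999).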
The quantlet eivplmnor estimates the parameters of the partially linear EIV model under the assumption that the conditional distribution of $Y$ given $X$ and $T$ is normal.
We show the following example:
    library("xplore")
    library("eiv")
    n = 100
    randomize(n)
    sigma = 0.0081
    b = 1|2
    p = rows(b)
    x = 2.*uniform(n,p)-1           ; latent variable
    t = sort(2.*uniform(n)-1,1)     ; observable variable
    w = x+sqrt(sigma)*uniform(n)    ; manifest variable
    m = 0.5*cos(pi.*t)+0.5*t
    y = x*b+m+normal(n)./2
    h=0.5
    sf = eivplmnor(w,t,y,sigma,h)
    b~sf.b                          ; estimates of b and g(t)
    dds = createdisplay(1,1)
    datah1=t~m
    datah2=t~sf.m
    part=grid(1,1,rows(t))'
    setmaskp(datah1,1,0,1)
    setmaskp(datah2,4,0,3)
    setmaskl(datah1,part,1,1,1)
    setmaskl(datah2,part,4,1,3)
    show(dds,1,1,datah1,datah2)
A partially linear fit for $E(y \mid x, t)$ is computed.
sf.b contains the estimated coefficients of the linear part, and sf.m contains the estimated nonparametric part evaluated at the observations t; see Figure 3.7.
There, the thin curve represents the true function and the thick curve the nonparametric estimate.
We now use the quantlet eivplmnor to analyze real data from the Framingham heart study. In this data set, the response variable $Y$ is the average blood pressure in a fixed 2-year period, $T$ is the age, and $W$ is the logarithm of the observed cholesterol level, for which there are two replicates.
For the purpose of illustration, we only use the first cholesterol measurement.
The measurement error variance is obtained in the previous analysis.
The estimate of $\beta$ is 9.438 with the standard error
.
For nonparametric fitting, we choose the bandwidth by cross-validation on the prediction of the response.
More precisely, we compute the squared prediction error for a geometric sequence
of 191 bandwidths ranging in
. The optimal bandwidth is selected
to minimize the squared error among these 191 candidates.
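The bandwidth search over a geometric sequence can be sketched in Python as follows. This is a simplified illustration, not the procedure used for the Framingham analysis: it applies leave-one-out cross-validation to a purely nonparametric Nadaraya-Watson fit of a response on one covariate, and the helper names, the simulated data, and the bandwidth range from 0.02 to 2 are hypothetical. Only the number of candidates, 191, is taken from the text.

```python
import math
import random


def geometric_grid(lo, hi, k):
    """k bandwidths from lo to hi with a constant ratio between neighbours."""
    ratio = (hi / lo) ** (1.0 / (k - 1))
    return [lo * ratio ** i for i in range(k)]


def loo_cv_error(t, y, h):
    """Mean squared leave-one-out prediction error of a Nadaraya-Watson fit."""
    n = len(t)
    y_mean = sum(y) / n
    total = 0.0
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(n):
            if j == i:
                continue  # leave observation i out of its own prediction
            kw = math.exp(-0.5 * ((t[i] - t[j]) / h) ** 2)
            num += kw * y[j]
            den += kw
        # Fall back to the overall mean if every kernel weight underflows.
        pred = num / den if den > 0.0 else y_mean
        total += (y[i] - pred) ** 2
    return total / n


# Hypothetical simulated data and bandwidth range.
random.seed(1)
n = 80
t = [random.uniform(-1.0, 1.0) for _ in range(n)]
y = [0.5 * math.cos(math.pi * ti) + 0.5 * ti + random.gauss(0.0, 0.3)
     for ti in t]

grid = geometric_grid(0.02, 2.0, 191)
errors = [loo_cv_error(t, y, h) for h in grid]
h_opt = grid[errors.index(min(errors))]
print("selected bandwidth:", h_opt)
```

A geometric sequence places candidates densely among the small bandwidths, where the cross-validation error usually changes fastest, and sparsely among the large ones.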
An analysis ignoring measurement error found some curvature in $T$; see Figure 3.8 for the estimate of $g(T)$.