In this section, we give some properties of the e.d.r. directions. The e.d.r. directions coincide with the eigenvectors of the so-called Hessian matrix of the regression function. Slightly different from the usual Hessian, the outer product of gradients simplifies the calculation and extends the ADE of Härdle and Stoker (1989) to the case of more than one direction. For ease of exposition, we always assume that the eigenvectors of a positive semi-definite matrix are arranged according to their corresponding eigenvalues in descending order.
Consider the relation between $y$ and $X$ in model (3.4). The $d$-indexing e.d.r. directions $B_0 = (\beta_1, \dots, \beta_d)$ are as defined in (3.5). Let $\nabla m(x)$ and $\nabla g(v)$ denote the gradients of the functions $m$ and $g$ with respect to their arguments. Under model (3.4), i.e. $m(x) = g(B_0^\top x)$, we have $\nabla m(x) = B_0 \nabla g(B_0^\top x)$. Therefore
$$
\Sigma \;\stackrel{\rm def}{=}\; E\{\nabla m(X)\,\nabla m(X)^\top\}
\;=\; B_0\, E\{\nabla g(B_0^\top X)\,\nabla g(B_0^\top X)^\top\}\, B_0^\top ,
$$
so the e.d.r. directions $B_0$ coincide with the first $d$ eigenvectors of $\Sigma$ (Lemma 3.1).
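As a numerical illustration of the identity $E\{\nabla m(X)\nabla m(X)^\top\} = B_0\, E\{\nabla g(B_0^\top X)\nabla g(B_0^\top X)^\top\}\, B_0^\top$, the following sketch checks, for a toy choice of $g$ and $B_0$ (both illustrative, not from the text), that the leading eigenvectors of the averaged outer product of gradients span the e.d.r. space:

```python
import numpy as np

# Toy check of E{grad m(X) grad m(X)^T} = B0 E{grad g(B0^T X) grad g(B0^T X)^T} B0^T
# for m(x) = g(B0^T x) with the illustrative link g(v1, v2) = v1^2 + sin(v2).
rng = np.random.default_rng(0)
p, d, n = 5, 2, 2000

B0, _ = np.linalg.qr(rng.standard_normal((p, d)))  # orthonormal e.d.r. directions
X = rng.standard_normal((n, p))
V = X @ B0                                         # rows are (B0^T X_i)^T

# Analytic gradient of g: grad g(v) = (2 v1, cos v2)^T, hence by the chain rule
# grad m(x) = B0 grad g(B0^T x).
grad_g = np.column_stack([2.0 * V[:, 0], np.cos(V[:, 1])])
grad_m = grad_g @ B0.T                             # n x p, rows are grad m(X_i)^T

Sigma = grad_m.T @ grad_m / n                      # sample average of grad m grad m^T
eigval, eigvec = np.linalg.eigh(Sigma)             # eigenvalues in ascending order

# Sigma has rank d, and its d leading eigenvectors span the columns of B0.
top = eigvec[:, -d:]                               # eigenvectors of the d largest eigenvalues
proj = top @ top.T                                 # projector onto their span
print(np.allclose(proj @ B0, B0, atol=1e-8))
```

Note that the identity holds for the sample average exactly (not only asymptotically), because each $\nabla m(X_i)$ lies in the column space of $B_0$.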
Lemma 3.1 provides a simple method to estimate the e.d.r. directions through the eigenvectors of $\Sigma = E\{\nabla m(X)\,\nabla m(X)^\top\}$. Härdle and Stoker (1989) noted this fact in passing but seem to have stopped short of exploiting it. Instead, they proposed the so-called ADE method, which suffers from the disadvantages stated in Section 3.1. Li (1992) proposed the principal Hessian directions (pHd) method, which estimates the Hessian matrix of the regression function. For a normally distributed design $X$, the Hessian matrix can be estimated simply by Stein's Lemma. (Cook (1998) claimed that the result can be extended to symmetric designs.) However, in time series analysis, the assumption of a symmetric design is frequently violated; see, for example, (3.28) and Figure 3.5 in the next section.

We now propose a direct estimation method as follows. Suppose that $\{(X_i, y_i),\ i = 1, \dots, n\}$ is a random sample. First, estimate the gradients $\nabla m(X_j)$, $j = 1, \dots, n$, by local polynomial smoothing. Thus, we consider the local first-order (i.e. local linear) polynomial fitting in the form of the following minimization problem:
$$
\min_{a_j,\, b_j}\; \sum_{i=1}^{n} \bigl\{ y_i - a_j - b_j^\top (X_i - X_j) \bigr\}^2 K_h(X_i - X_j),
$$
where $K_h(\cdot)$ is a kernel function with bandwidth $h$. Denoting the minimizer by $(\hat a_j, \hat b_j)$, we take $\hat b_j$ as an estimator of $\nabla m(X_j)$ and estimate $\Sigma$ by the average outer product of the estimated gradients,
$$
\hat\Sigma \;=\; \frac{1}{n} \sum_{j=1}^{n} \hat b_j \hat b_j^\top .
$$
The e.d.r. directions are then estimated by the first $d$ eigenvectors of $\hat\Sigma$.
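A minimal end-to-end sketch of this outer-product-of-gradients scheme follows. The data-generating model, the Gaussian kernel, the bandwidth constant, and the small ridge term are all illustrative assumptions, not prescriptions from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 400, 4
beta = np.array([1.0, 2.0, 0.0, 0.0])
beta /= np.linalg.norm(beta)                         # true (single) e.d.r. direction
X = rng.standard_normal((n, p))
y = np.sin(X @ beta) + 0.1 * rng.standard_normal(n)  # toy single-index model, d = 1

h = n ** (-1.0 / (p + 6))                            # illustrative bandwidth choice
Sigma_hat = np.zeros((p, p))
for j in range(n):
    D = X - X[j]                                     # X_i - X_j
    w = np.exp(-(D ** 2).sum(axis=1) / (2.0 * h ** 2))   # Gaussian kernel weights
    Z = np.column_stack([np.ones(n), D])             # local linear design matrix
    A = Z.T @ (w[:, None] * Z)
    c = Z.T @ (w * y)
    coef = np.linalg.solve(A + 1e-8 * np.eye(p + 1), c)  # tiny ridge for stability
    b_j = coef[1:]                                   # estimated gradient at X_j
    Sigma_hat += np.outer(b_j, b_j) / n              # average outer product

eigval, eigvec = np.linalg.eigh(Sigma_hat)
b_hat = eigvec[:, -1]                                # leading eigenvector
print(abs(b_hat @ beta))                             # should be close to 1
```

In practice the kernel and bandwidth would be chosen with more care; the point of the sketch is only that the leading eigenvector of $\hat\Sigma$ aligns with the true direction.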
Similar results were obtained by Härdle and Stoker (1989) for the ADE method. Note that the optimal bandwidth for the estimation of the regression function (or its derivatives) in the sense of the mean integrated squared error (MISE) is of a larger order than is appropriate here. The fastest convergence rate for the estimator of the directions can never be achieved at a bandwidth of that order; it is achieved at a smaller bandwidth, i.e. some undersmoothing is needed. This point has been noticed by many authors in other contexts; see, for example, Hall (1984), Weisberg and Welsh (1994), and Carroll et al. (1997).
Let $m(x) = E(y \mid X = x)$ and $\varepsilon = y - m(X)$. Then model (3.7) can be written as $y = m(X) + \varepsilon$ with $E(\varepsilon \mid X) = 0$. It is easy to see that
$$
\nabla m(x) = B_0\, \nabla g(B_0^\top x),
$$
so the e.d.r. directions can again be recovered from the gradients of $m$.
Similarly, we can estimate the gradient $\nabla m(X_j)$ using a nonparametric method. For example, if we use the local linear smoothing method, we can estimate the gradients by solving the following minimization problem:
$$
\min_{a_j,\, b_j}\; \sum_{i=1}^{n} \bigl\{ y_i - a_j - b_j^\top (X_i - X_j) \bigr\}^2 K_h(X_i - X_j).
$$
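This minimization is a weighted least squares problem with an explicit solution. A sketch (the function name, the Gaussian kernel, and the tiny ridge term are our illustrative choices):

```python
import numpy as np

def local_linear_gradient(X, y, x0, h):
    """Minimize sum_i {y_i - a - b^T (X_i - x0)}^2 K_h(X_i - x0) and
    return (a_hat, b_hat); a Gaussian kernel is an illustrative choice."""
    D = X - x0
    w = np.exp(-(D ** 2).sum(axis=1) / (2.0 * h ** 2))  # kernel weights K_h(X_i - x0)
    Z = np.column_stack([np.ones(len(X)), D])           # intercept + linear terms
    A = Z.T @ (w[:, None] * Z)                          # weighted normal equations
    c = Z.T @ (w * y)
    coef = np.linalg.solve(A + 1e-10 * np.eye(Z.shape[1]), c)  # tiny ridge for stability
    return coef[0], coef[1:]            # a_hat estimates m(x0); b_hat estimates grad m(x0)

# Usage: for an exactly linear y the fitted slope recovers the gradient.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 3))
y = X @ np.array([1.0, -2.0, 0.5])      # noiseless linear toy response
a_hat, b_hat = local_linear_gradient(X, y, np.zeros(3), h=0.7)
print(b_hat)                            # approx [1, -2, 0.5]
```

Because a local linear fit reproduces linear functions exactly, the noiseless linear example gives the gradient back (up to the negligible ridge perturbation), which is a convenient sanity check for any implementation.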