4.4 Behavior at the boundary

Near the boundary of the observation interval any smoothing method becomes less accurate: fewer observations can be averaged there, which affects both variance and bias. Kernel weights, for instance, become asymmetric as $x$ approaches the boundary points. This ``boundary effect'' is not present for $x$ in the interior of the observation interval, but for small to moderate sample sizes a significant proportion of the observation interval can be affected by the boundary behavior. Consider, for instance, the kernel smooth in Figure 3.2. The Gaussian kernel used there is always truncated at the boundary points, so in a strict sense the whole observation interval is influenced by boundary effects. Note, however, that this kernel is effectively zero outside a range of three standard deviations, so only a small proportion of the observations on each side is noticeably affected.

In this section I describe these boundary effects and present a simple and effective solution to the boundary problem. The solution is due to Rice (1984b) and uses the (generalized) jackknifing technique. Boundary phenomena have also been discussed by Gasser and Müller (1979) and Müller (1984b), who proposed ``boundary kernels'' for use near the boundary. In the setting of spline smoothing, Rice and Rosenblatt (1983) computed the boundary bias.

Consider the fixed design error model with kernels having support $[-1,1]$. Take the kernel estimator

\begin{displaymath}{\hat{m}}_h(x)=n^{-1}\sum_{i=1}^n K_h(x-X_i)\ Y_i,\end{displaymath}

which has expectation equal to (see Exercise 4.4.2)
\begin{displaymath}
\int_{(x-1)/h}^{x/h} K(u)\ m(x-u h)\ d u+O(n^{-1}h^{-1})
\end{displaymath} (4.4.21)

as $nh \to \infty$. In the middle of the observation interval there is no problem, since for small $h$ we have $x/h \ge 1$ and $(x-1)/h \le -1$, so the integral in 4.4.21 extends over the full support of $K$.

Now let $x=\rho h \le 1-h;$ then by a Taylor series expansion the expected value of ${\hat{m}}_h(x)$ can be approximated by

\begin{eqnarray*}
\lefteqn{m(x)\ \int_{-1}^\rho K(u)\ du-h\,m'(x)\ \int_{-1}^\rho u\,K(u)\ du
+{1 \over 2}h^2 m''(x)\ \int_{-1}^\rho u^2 K(u)\ du}\\
&=& m(x)\,\omega_K (0,\rho)-h\,m'(x)\,\omega_K (1,\rho)
+{1 \over 2}h^2 m''(x)\,\omega_K (2,\rho),
\end{eqnarray*} (4.4.22)

where $\omega_K (j,\rho)=\int_{-1}^\rho u^j K(u)\ du$ denotes a truncated moment of the kernel $K$.

Of course, if $\rho \ge 1$,

\begin{eqnarray*}
\omega_K (0,\rho)&=&1,\\
\omega_K (1,\rho)&=&0,\\
\omega_K (2,\rho)&=&d_K,
\end{eqnarray*}
and we have the well-known bias expansion for the Priestley-Chao estimator. Rice's idea is to define a kernel that depends on the relative location of $x$, expressed through the parameter $\rho$.
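
As a small numerical illustration (this sketch is not part of the original development; the Python function names quartic and omega are arbitrary, and the integrals are approximated by the trapezoidal rule), one can evaluate the truncated moments $\omega_K (j,\rho)$ for the quartic kernel 4.4.25 used later in this section and check that $\rho \ge 1$ reproduces the values above, whereas $\rho<1$ does not:

\begin{verbatim}
import numpy as np

def quartic(u):
    # quartic kernel K(u) = (15/16) (1 - u^2)^2 on [-1, 1], cf. 4.4.25
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2)**2, 0.0)

def omega(j, rho, kernel=quartic, ngrid=20001):
    # truncated moment omega_K(j, rho) = int_{-1}^{min(rho,1)} u^j K(u) du,
    # approximated by the trapezoidal rule on a fine grid
    u = np.linspace(-1.0, min(rho, 1.0), ngrid)
    f = u**j * kernel(u)
    return float(np.sum(0.5 * (f[:-1] + f[1:]) * (u[1] - u[0])))

for rho in (1.0, 0.8, 0.4, 0.2):
    print(rho, [round(omega(j, rho), 4) for j in (0, 1, 2)])
# rho = 1.0 reproduces 1, 0 and d_K = 1/7 (up to discretization error);
# for rho < 1 the zeroth moment drops below one and the first moment
# becomes negative.
\end{verbatim}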

Asymptotic unbiasedness is achieved for a kernel

\begin{displaymath}K_\rho(\cdot)=K(\cdot)/\omega_K (0,\rho).\end{displaymath}

If $x$ is away from the left boundary, that is, $\rho \ge 1$, the approximate bias is given by the third term of 4.4.22 and is of order $O(h^2)$. If $\rho<1$, the second term dominates: the bias is of order $O(h)$ at the boundary and thus of lower order in $h$ than in the center of the interval.

The generalized jackknife technique (Gray and Schucany 1972) allows one to eliminate this $O(h)$ bias term. Let ${\hat{m}}_{h,\rho}(\cdot )$ denote the kernel estimator with kernel $K_\rho$ and let

\begin{displaymath}{\hat{m}}_h^J(x)=(1-R) {\hat{m}}_{h,\rho}(x)+R {\hat{m}}_{\alpha h,\rho}(x)\end{displaymath}

be the jackknife estimator of $m(x)$, a linear combination of kernel smoothers with bandwidths $h$ and $\alpha h$. From the bias expansion 4.4.22, the leading $O(h)$ bias term of ${\hat{m}}_h^J(x)$ can be eliminated if
\begin{displaymath}
R=-{\omega_K (1,\rho)/\omega_K (0,\rho) \over
\alpha\,\omega_K (1,\rho/\alpha)/\omega_K (0,\rho/\alpha)-\omega_K (1,\rho)/\omega_K(0,\rho)}.
\end{displaymath} (4.4.23)
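
To see where 4.4.23 comes from, here is a sketch of the argument, under the convention that the kernel of the bandwidth-$\alpha h$ smoother is normalized at its own relative location $\rho/\alpha$. By 4.4.22, the $O(h)$ bias term of ${\hat{m}}_{h,\rho}(x)$ is $-h\,m'(x)\,\omega_K (1,\rho)/\omega_K (0,\rho)$, while for the smoother with bandwidth $\alpha h$ the relative location of $x$ becomes $\rho/\alpha$, so the corresponding term is $-h\,m'(x)\,\alpha\,\omega_K (1,\rho/\alpha)/\omega_K (0,\rho/\alpha)$. Setting

\begin{displaymath}
(1-R)\,{\omega_K (1,\rho) \over \omega_K (0,\rho)}
+R\,{\alpha\,\omega_K (1,\rho/\alpha) \over \omega_K (0,\rho/\alpha)}=0
\end{displaymath}

and solving for $R$ yields 4.4.23; the common factor $-h\,m'(x)$ cancels.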

This technique was also used by Bierens (1987) to reduce the bias inside the observation interval. In effect, the jackknife estimator uses the kernel function
\begin{displaymath}
K_\rho^J(u)=(1-R)K(u)-(R/\alpha)K(u/\alpha),
\end{displaymath} (4.4.24)

where $R$ and $\alpha$, and thus $K_\rho^J$, depend on $\rho$. In this sense, $K_\rho^J$ can be interpreted as a ``boundary kernel''. Rice (1984b) recommended the following choice of $\alpha$:

\begin{displaymath}\alpha=2-\rho.\end{displaymath}

As an example, take as the initial kernel the quartic kernel

\begin{displaymath}
K(u)=(15/16) (1-u^2)^2\ I(\left \vert u \right \vert \le 1).
\end{displaymath} (4.4.25)
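
With this kernel and the recommendation $\alpha=2-\rho$, the weight $R$ of 4.4.23 can be tabulated numerically, continuing the illustrative sketch above (again, this is not code from Rice (1984b); the helper omega only approximates the truncated moments):

\begin{verbatim}
def jackknife_R(rho, alpha, kernel=quartic):
    # mixing weight R of 4.4.23, with the truncated moments omega_K
    # evaluated numerically by the omega() helper above
    r1 = omega(1, rho, kernel) / omega(0, rho, kernel)
    r2 = alpha * omega(1, rho / alpha, kernel) / omega(0, rho / alpha, kernel)
    return -r1 / (r2 - r1)

for rho in (0.1, 0.2, 0.4, 0.6, 0.8):
    print(rho, round(jackknife_R(rho, 2.0 - rho), 3))
# for x in the interior (rho >= 1) the unmodified kernel is kept and
# no jackknife correction is applied
\end{verbatim}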

The numbers $\omega_K (0,\rho)$ and $\omega_K(1,\rho)$ can be computed explicitly for this kernel. Figure 4.11 shows the sequence of boundary kernels $K_\rho^J$ for $\rho=0.1,0.2,0.4,0.6,0.8$. Note that the kernels have negative side lobes. Figure 4.12 shows the nonparametric estimate of the function $m(x)=x^2$ from $n=15$ observations (Gaussian noise, $\sigma=0.05$). The bandwidth is $h=0.4$; thus $60$ percent of the observation interval is affected by boundary effects.

Figure 4.11: Modified quartic boundary kernels $K^J_\rho $ for $\rho=0.0$, 0.2, 0.4, 0.6, 0.8. The symmetric kernel is the kernel $K^J_1$. From Rice (1984b) with permission of Marcel Dekker, Inc., New York.
\includegraphics[scale=0.2]{ANR4,11.ps}

Figure 4.12: Nonparametric estimate of $m(x)=x^2$, $n=15$, $\sigma=0.05$, $h=0.4$, quartic kernel. The solid line is the true function, the dotted line is the unmodified kernel estimate. From Rice (1984b) with permission of Marcel Dekker, Inc., New York.
\includegraphics[scale=0.2]{ANR4,12.ps}
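
A small simulation in the spirit of Figure 4.12 can be pieced together from the helpers above. The following sketch is purely illustrative (it is not the XploRe macro mentioned in the exercises below): the equispaced design, the random seed and the evaluation grid are arbitrary choices, and the right boundary is treated symmetrically to the left boundary discussed in the text. It generates $n=15$ observations of $m(x)=x^2$ with Gaussian noise ($\sigma=0.05$), smooths with the quartic kernel and $h=0.4$, and, for $x$ within one bandwidth of a boundary, replaces the ordinary estimate by the jackknife combination of the smoothers with bandwidths $h$ and $\alpha h$:

\begin{verbatim}
rng = np.random.default_rng(0)
n, h, sigma = 15, 0.4, 0.05
X = (np.arange(1, n + 1) - 0.5) / n      # equispaced design on [0, 1]
Y = X**2 + sigma * rng.standard_normal(n)

def ksmooth(x, b, normalize=False):
    # Priestley-Chao type smoother  n^{-1} sum K_b(x - X_i) Y_i ;
    # with normalize=True the weights are rescaled by omega_K(0, rho),
    # i.e. the boundary-adapted kernel K_rho of the text is used
    est = np.mean(quartic((x - X) / b) / b * Y)
    if normalize:
        est /= omega(0, min(x, 1.0 - x) / b)
    return est

def msmooth(x):
    # jackknife boundary modification; the text treats the left boundary,
    # the right boundary is handled symmetrically here
    rho = min(x, 1.0 - x) / h
    if rho >= 1.0:
        return ksmooth(x, h)             # interior: no modification
    alpha = 2.0 - rho
    R = jackknife_R(rho, alpha)
    return (1.0 - R) * ksmooth(x, h, True) + R * ksmooth(x, alpha * h, True)

for x in np.linspace(0.0, 1.0, 11):
    print(round(x, 1), round(x**2, 3),
          round(ksmooth(x, h), 3), round(msmooth(x), 3))
\end{verbatim}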

Exercises
4.4.1 Compute the constants $\omega_K (0,\rho), \omega_K (1,\rho), \omega_K (2,\rho)$ from 4.4.22 for the quartic kernel. Construct an algorithm with bias correction at the boundary.

[Hint: The system XploRe (1989) contains this algorithm for the triweight kernel.]

4.4.2 Prove formula 4.4.21 by comparing

\begin{displaymath}E {\hat{m}}_h(x)=n^{-1} \sum_{i=1}^n K_h(x-X_i) m(X_i)\end{displaymath}

with

\begin{displaymath}n^{-1} \sum_{i=1}^n \int_{\Delta_i} K_h (x-u) m(u)d u,\end{displaymath}

where $\Delta_i=[X_{(i-1)}, X_{(i)})$ and $X_{(0)}=0$.