5. Choosing the smoothing parameter

                All the asymptotic results that we have just considered do not allow us to answer the important question posed by practitioners of Statistics: for fixed $n$, how should one choose $h_n$?

Collomb (1981, p. 82)

The problem of deciding how much to smooth is of great importance in nonparametric regression. Before embarking on technical solutions of the problem it is worth noting that a selection of the smoothing parameter is always related to a certain interpretation of the smooth. If the purpose of smoothing is to increase the ``signal to noise ratio'' for presentation, or to suggest a simple (parametric) model, then a slightly ``oversmoothed'' curve with a subjectively chosen smoothing parameter might be desirable. On the other hand, when the interest is purely in estimating the regression curve itself, with an emphasis on local structures, then a slightly ``undersmoothed'' curve may be appropriate.
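To make this distinction concrete, the following minimal sketch contrasts an undersmoothed and an oversmoothed fit. It assumes a Nadaraya-Watson kernel smoother with a Gaussian kernel; the simulated data and the bandwidth values are chosen purely for illustration and are not taken from the text.

```python
import numpy as np

def nadaraya_watson(x, y, grid, h):
    """Nadaraya-Watson kernel regression estimate with a Gaussian kernel of bandwidth h."""
    # Kernel weights between every grid point and every observation.
    w = np.exp(-0.5 * ((grid[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

# Simulated data: a sinusoidal regression curve observed with noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 100)
grid = np.linspace(0, 1, 200)

m_under = nadaraya_watson(x, y, grid, h=0.01)  # undersmoothed: follows local structure (and noise)
m_over = nadaraya_watson(x, y, grid, h=0.30)   # oversmoothed: simple curve suited to presentation
```

Plotting `m_under` and `m_over` against `grid` shows the two interpretations of the smooth described above: the small bandwidth emphasizes local features, the large bandwidth suggests a simple parametric shape.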

However, a good automatically selected parameter is always a useful starting (view)point. An advantage of automatic bandwidth selection for kernel smoothers is that comparisons between laboratories can be made on the basis of a standardized method. A further advantage of an automatic method lies in the application of additive models to the investigation of high-dimensional regression data. For complex iterative procedures such as projection pursuit regression (Friedman and Stuetzle 1981) or ACE (Breiman and Friedman 1985) it is vital to have a good choice of smoothing parameter for the one-dimensional smoothers that are the elementary building blocks of these procedures.

In the following sections various methods for choosing the smoothing parameter are presented. The choice is made so that some global error criterion is minimized; a concrete instance is sketched below. Section 5.2 discusses how far the automatically chosen smoothing parameters are from their optimum. It will be seen that there is indeed room for subjective choice of the bandwidth within a slowly narrowing confidence interval of the optimum. Several possibilities for adapting the smoothing parameter to the local curvature of the regression curve are presented in Section 5.3. In particular, I propose a method based on bootstrapping from estimated residuals. The supersmoother, proposed by Friedman (1984), is also presented there. The important practical question of how to compare automatically chosen bandwidths between laboratories is discussed in Section 5.4.
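As one concrete instance of minimizing a global error criterion, the sketch below selects the bandwidth by leave-one-out cross-validation for the Gaussian-kernel Nadaraya-Watson smoother of the previous sketch. The criterion and the grid search are assumptions made for illustration; the methods actually studied in this chapter are developed in the following sections.

```python
import numpy as np

def cv_score(x, y, h):
    """Leave-one-out cross-validation score for a Gaussian-kernel
    Nadaraya-Watson smoother at bandwidth h."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)              # leave the i-th observation out of its own fit
    m_loo = (w @ y) / w.sum(axis=1)       # leave-one-out fitted values
    return np.mean((y - m_loo) ** 2)      # averaged squared prediction error

def choose_bandwidth(x, y, h_grid):
    """Return the bandwidth in h_grid minimizing the cross-validation criterion."""
    scores = [cv_score(x, y, h) for h in h_grid]
    return h_grid[int(np.argmin(scores))]

# Example usage with the simulated data from the previous sketch:
# h_hat = choose_bandwidth(x, y, np.linspace(0.02, 0.5, 50))
```

The minimizer of such a criterion is the kind of automatically selected bandwidth whose distance from the optimum, and whose adaptation to local curvature, are examined in Sections 5.2 and 5.3.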