2.3 Dependence of the Histogram on the Origin

In Section 2.1.3 we already pointed out that the binwidth is not the only parameter that governs the shape and appearance of the histogram. Figure 2.5 shows four histograms for the stock returns data. We used the same binwidth $h=0.04$ for each histogram but varied the origin $x_0$ of the bin grid.

Figure 2.5: Four histograms for the stock returns data corresponding to different origins; binwidth $h=0.04$ (quantlet SPMhisdiffori)

Even though we use the same data and the same binwidth, the histograms give quite different accounts of some of the key features of the data. All four histograms indicate that the true pdf is unimodal, but only the upper right histogram suggests a symmetric pdf. Note also that the estimates of $f(0)$ differ considerably.
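The effect of the origin on the estimate of $f(0)$ can be reproduced with a few lines of code. The following is a minimal sketch and not the quantlet used for the figure: the function name `histogram_density`, the chosen origins, and the synthetic normal sample (a stand-in for the stock returns data, which are not reproduced here) are all assumptions made for illustration.

```python
import numpy as np


def histogram_density(data, x, h, x0):
    """Histogram density estimate at the point x (hypothetical helper).

    Bins are the intervals [x0 + j*h, x0 + (j+1)*h) for integer j.
    """
    data = np.asarray(data, dtype=float)
    # Integer index of the bin that contains each observation
    bin_of_data = np.floor((data - x0) / h)
    # Integer index of the bin that contains the evaluation point x
    bin_of_x = np.floor((x - x0) / h)
    # Relative frequency of the bin, divided by the binwidth
    count = np.sum(bin_of_data == bin_of_x)
    return count / (data.size * h)


# Synthetic stand-in sample (assumption: NOT the stock returns data)
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=0.05, size=200)

h = 0.04                            # same binwidth for every histogram
origins = [0.0, 0.01, 0.02, 0.03]   # illustrative origins within one binwidth

estimates = [histogram_density(sample, 0.0, h, x0) for x0 in origins]
for x0, est in zip(origins, estimates):
    print(f"x0 = {x0:.2f}: estimate of f(0) = {est:.3f}")
```

Only the origin changes between the four calls, so any differences among the printed values are due to the placement of the bin grid alone.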

This property of histograms strongly conflicts with the goal of nonparametric statistics to ``let the data speak for themselves''. Obviously, the same data speak quite differently out of the different histograms. How can we get rid of the dependence of the histogram on the choice of the origin of the bin grid? A natural remedy might be to compute histograms using the same binwidth but different origins and to average over the different histograms. We will consider this technique in the next section.
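As a rough preview of the averaging idea, the sketch below computes several histograms with the same binwidth on grids shifted by equal fractions of the binwidth and takes their pointwise mean. It is an illustration under stated assumptions rather than the formal construction of the next section: the names `histogram_density` and `averaged_histogram`, the parameter `n_shifts`, the equally spaced shifts, and the synthetic sample are all hypothetical choices.

```python
import numpy as np


def histogram_density(data, x, h, x0):
    """Histogram density estimate at each point of x (hypothetical helper).

    Bins are the intervals [x0 + j*h, x0 + (j+1)*h) for integer j.
    """
    data = np.asarray(data, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    bin_of_data = np.floor((data - x0) / h)
    bin_of_x = np.floor((x - x0) / h)
    # For every evaluation point, count the observations sharing its bin
    counts = np.sum(bin_of_data[None, :] == bin_of_x[:, None], axis=1)
    return counts / (data.size * h)


def averaged_histogram(data, x, h, n_shifts):
    """Pointwise mean of n_shifts histograms with origins l*h/n_shifts."""
    shifts = [l * h / n_shifts for l in range(n_shifts)]
    estimates = [histogram_density(data, x, h, x0) for x0 in shifts]
    return np.mean(estimates, axis=0)


# Synthetic stand-in sample (assumption: NOT the stock returns data)
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=0.05, size=200)

grid = np.linspace(-0.1, 0.1, 5)
print(averaged_histogram(sample, grid, h=0.04, n_shifts=8))
```

Because the shifts cover one full binwidth in equal steps, no single grid placement dominates the averaged estimate.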