9.3 Estimation of the Historical SPD

While the previous section was dedicated to finding a proxy for $ f^*$, the density used by investors to price options, this section approximates the historical density $ g^*$ of the underlying for date $ t=T$ using all the information available at date $ t=0$. Of course, if the process governing the underlying asset dynamics were common knowledge and if agents had perfect foresight, then by no-arbitrage arguments both SPDs should be equal. As Ait-Sahalia, Wang and Yared (2000) point out, the density extracted from the observed data on the underlying is not comparable to the density implied by observed option data without assumptions on investors' preferences. As in Härdle and Tsybakov (1995), they apply an estimation method which uses the observed asset prices to infer the time series SPD indirectly. First, we explain the estimation method for the SPD of the underlying. Second, we apply it to DAX data.


9.3.1 The Estimation Method

Assume that the underlying $ S$ follows an Itô diffusion process driven by a Brownian motion $ W$:

$\displaystyle dS_t = \mu (S_t)\,dt + \sigma (S_t)\,dW_t.$ (9.1)

Ait-Sahalia, Wang and Yared (2000) rely on Girsanov's characterization of the change of measure from the actual density to the SPD. It says the diffusion function of the asset's dynamics is identical under both the risk neutral and the actual measure and only the drift function needs to be adjusted, leading to the risk neutral asset dynamics:
$\displaystyle dS_t^* = (r_{t,\tau}-\delta_{t,\tau})S_t^*\,dt + \sigma (S_t^*)\,dW_t^*.$ (9.2)

Let $ g_t(S_t,S_T,\tau,r_{t,\tau},\delta_{t,\tau})$ denote the conditional density of $ S_T$ given $ S_t$ generated by the dynamics defined in equation (9.1), and let $ g_t^*(S_t,S_T,\tau,r_{t,\tau},\delta_{t,\tau})$ denote the conditional density generated by equation (9.2). Then $ f^*$ can only be compared to the risk neutral density $ g^*$ and not to $ g$.

A crucial feature of this method is that the diffusion functions are identical under both the actual and the risk neutral dynamics (which follows from Girsanov's theorem). Therefore, it is not necessary to observe the risk neutral path of the DAX index $ \{S_t^*\}$. The function $ \sigma(\bullet)$ is estimated using $ N^{*}$ observed index values $ \{S_t\}$ and applying Florens-Zmirou's (1993) (FZ) nonparametric version of the minimum contrast estimators:

$\displaystyle \hat{\sigma}_{FZ}(S) = \frac{\sum_{i=1}^{N^{*}-1} K_{FZ}\left(\frac{S_i-S}{h_{FZ}}\right) N^{*}\{S_{(i+1)/N^{*}} - S_{i/N^{*}}\}^2}{\sum_{i=1}^{N^{*}} K_{FZ}\left(\frac{S_i-S}{h_{FZ}}\right)},$ (9.3)

where $ K_{FZ}(\bullet)$ is a kernel function and $ h_{FZ}$ is a bandwidth parameter such that

$ (N^*h_{FZ})^{-1} \ln(N^*) \rightarrow 0$ and $ N^*h_{FZ}^4 \rightarrow 0$

as $ N^* \rightarrow \infty$. Without imposing restrictions on the drift function, $ \hat{\sigma}_{FZ}$ is an unbiased estimator of $ \sigma$ in the model specified in equation (9.2). Since the DAX index is a performance index ( $ \delta_{t,\tau}=0$), the risk neutral drift rate of equation (9.2) is equal to $ r_{t,\tau}$.
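As a rough illustration of equation (9.3), the following sketch computes the kernel-weighted average of scaled squared increments on a grid of index levels. The function names, the Gaussian choice for $ K_{FZ}$ and the convention that the sample spans one unit of time (so that $ \Delta t = 1/N^*$) are assumptions made for this example only. Because the numerator averages squared increments, the output is on the scale of the squared diffusion function, which is how Section 9.3.2 refers to it ( $ \sigma^2(\bullet)$).

```python
import numpy as np

def gaussian_kernel(x):
    """Standard normal density, used here as an assumed choice for K_FZ."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def fz_diffusion_estimate(path, grid, bandwidth):
    """Kernel-weighted average of N* times squared increments, as in (9.3).

    path      : observed index values S_{1/N*}, ..., S_{N*/N*}
    grid      : index levels S at which the estimate is evaluated
    bandwidth : h_FZ
    """
    path = np.asarray(path, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n_obs = path.size
    # N* {S_{(i+1)/N*} - S_{i/N*}}^2 for i = 1, ..., N*-1
    scaled_sq_increments = n_obs * np.diff(path) ** 2
    # weights[j, i] = K((S_i - grid_j) / h)
    weights = gaussian_kernel((path[None, :] - grid[:, None]) / bandwidth)
    numerator = weights[:, :-1] @ scaled_sq_increments   # sum over i = 1..N*-1
    denominator = weights.sum(axis=1)                    # sum over i = 1..N*
    return numerator / denominator
```

For a path with constant diffusion coefficient $ c$, the estimate should be close to $ c^2$ at index levels that the path visits often; it becomes unreliable where few observations fall within a bandwidth of the evaluation point.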

Once $ \sigma(\bullet)$ is estimated, the time series SPD $ g^*$ can be computed by Monte Carlo integration. Applying the Milstein scheme (Kloeden, Platen and Schurz (1994)), we simulate $ M=10,000$ paths of the diffusion process:

$\displaystyle dS_t^* = r_{t,\tau}S_t^*\,dt + \hat{\sigma}_{FZ}(S_t^*)\,dW_t^*$ (9.4)

for a time period of $ 3$ months, with starting value $ S_{t=0}$ equal to the DAX index value at the beginning of the period. We collect the endpoints at $ T$ of these simulated paths $ \{S_{T,m}:m=1,\hdots,M\}$ and annualize the index log-returns. Then $ g^*$ is obtained by means of a nonparametric kernel density estimation of the continuously compounded log-returns $ u$:

$\displaystyle \hat{p}_t^*(u) = \frac{1}{Mh_{MC}}\sum_{m=1}^{M}K_{MC}\Big(\frac{u_m-u}{h_{MC}}\Big)$ (9.5)

where $ u_m$ is the log-return at the end of the $ m$th path, $ K_{MC}(\bullet)$ is a kernel function and $ h_{MC}$ is a bandwidth parameter. The equation

$ \textrm{P}(S_T \leq S) = \textrm{P}(u \leq \ln(S/S_t)) = \int_{-\infty}^{\ln(S/S_t)} p_t^*(u)\,du$

with $ u=\ln(S_T/S_t)$ relates this density estimator to the SPD $ g^*$:

$ g_t^*(S) = \frac{\partial}{\partial S}\textrm{P}(S_T \leq S) = \frac{p_t^*(\ln(S/S_t))}{S}.$

This method results in a nonparametric estimator $ \hat{g}^*$ which is $ \sqrt{N^*}$-consistent as $ M \rightarrow \infty$ even though $ \hat{\sigma}_{FZ}$ converges at a slower rate (Ait-Sahalia, Wang and Yared (2000)).
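
The two steps above, the kernel density estimate (9.5) and the change of variables from log-returns to index levels, can be sketched as follows. This is a minimal illustration, not the XploRe implementation: the function names are hypothetical, a Gaussian kernel is assumed for $ K_{MC}$, and plain (non-annualized) log-returns $ u=\ln(S_T/S_t)$ are used, as in the displayed relation.

```python
import numpy as np

def kde_gaussian(samples, points, bandwidth):
    """Gaussian kernel density estimate of the samples, evaluated at points (9.5)."""
    samples = np.asarray(samples, dtype=float)
    points = np.asarray(points, dtype=float)
    z = (samples[None, :] - points[:, None]) / bandwidth
    kernel_sum = np.exp(-0.5 * z**2).sum(axis=1)
    return kernel_sum / (samples.size * bandwidth * np.sqrt(2.0 * np.pi))

def spd_from_log_returns(log_returns, s_grid, s_t, bandwidth):
    """Density over index levels: g*(S) = p*(ln(S / S_t)) / S."""
    s_grid = np.asarray(s_grid, dtype=float)
    u = np.log(s_grid / s_t)
    return kde_gaussian(log_returns, u, bandwidth) / s_grid
```

A quick sanity check of the change of variables is that the resulting density integrates to approximately one over a grid of index levels that covers the simulated endpoints.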

In the absence of arbitrage, the futures price is the expected future value of the spot price under the risk neutral measure. Therefore the time series distribution is translated such that its mean matches the implied futures price. Then the bandwidth $ h_{MC}$ is chosen to best match the variance of the IBT implied distribution. In order to avoid over- or undersmoothing of $ g^*$, $ h_{MC}$ is constrained to be within $ 0.5$ to $ 5$ times the optimal bandwidth implied by Silverman's rule of thumb. This procedure allows us to focus the density comparison on the skewness and kurtosis of the two densities.
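
The mean and variance adjustment can be sketched as below, under assumptions that are not taken from the text. First, the simple Gaussian-reference form of Silverman's rule, $ 1.06\,\hat{s}\,M^{-1/5}$, is used; the exact variant behind the optimal bandwidth is not stated here. Second, instead of a numerical search, the sketch exploits a property of the Gaussian kernel: the variance of the kernel density estimate equals the sample variance plus $ h_{MC}^2$, so the variance-matching bandwidth has a closed form that is then clipped to the admissible range. All function names are hypothetical.

```python
import numpy as np

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth 1.06 * s * M^(-1/5) (assumed variant)."""
    samples = np.asarray(samples, dtype=float)
    return 1.06 * samples.std(ddof=1) * samples.size ** (-0.2)

def match_variance_bandwidth(samples, target_var, lower=0.5, upper=5.0):
    """Bandwidth whose Gaussian KDE variance is closest to target_var.

    For a Gaussian kernel, var(KDE) = sample variance + h^2, so the
    unconstrained solution is sqrt(target_var - sample variance); it is
    clipped to [lower, upper] times the rule-of-thumb bandwidth.
    """
    samples = np.asarray(samples, dtype=float)
    h_rot = silverman_bandwidth(samples)
    excess = max(target_var - samples.var(), 0.0)
    return float(np.clip(np.sqrt(excess), lower * h_rot, upper * h_rot))

def translate_to_mean(samples, target_mean):
    """Shift the sample so that its mean equals target_mean (e.g. the futures price)."""
    samples = np.asarray(samples, dtype=float)
    return samples + (target_mean - samples.mean())
```

Translating the sample leaves its variance, skewness and kurtosis unchanged, which is why the comparison can afterwards concentrate on the higher moments.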


9.3.2 Application to DAX Data

Using the DAX index data from MD*BASE we estimate the diffusion function $ \sigma^2(\bullet)$ from equation (9.2) by means of past index prices and simulate (forward) $ M=10,000$ paths to obtain the time series density, $ g^*$.

Figure 9.2: Mean and variance adjusted estimated time series density on Friday, April $ 18$, $ 1997$. Simulated with $ M=10,000$ paths, $ S_0=3328.41$, $ r=3.23$ and $ \tau =88/360$.

To be more precise, we explain the methodology for the first period in more detail. First, note that Friday, April $ 18$, $ 1997$, is the $ 3$rd Friday of April $ 1997$. Thus, on Monday, April $ 21$, $ 1997$, we use $ 3$ months of DAX index prices from Monday, January $ 20$, $ 1997$, to Friday, April $ 18$, $ 1997$, to estimate $ \sigma^2$. Then, on the same Monday, we start the $ 3$ months `forward' Monte Carlo simulation. The bandwidth $ h_{FZ}$ is determined by cross validation using the XploRe quantlet regxbwcrit, which selects the optimal bandwidth from a range of bandwidths by using the resubstitution estimator with the penalty function 'Generalized Cross Validation'.

Knowing the diffusion function, it is now possible to simulate the index evolution by Monte Carlo. The Milstein scheme applied to equation (9.2) is given by:

$\displaystyle S_{i/N^{**}} = S_{(i-1)/N^{**}} + r S_{(i-1)/N^{**}}\Delta t + \sigma(S_{(i-1)/N^{**}})\Delta W_{i/N^{**}}$
$\displaystyle \qquad + \frac{1}{2} \sigma(S_{(i-1)/N^{**}})\frac{\partial \sigma}{\partial S}(S_{(i-1)/N^{**}}) \Big( (\Delta W_{i/N^{**}})^2 - \Delta t \Big),$

where we set the drift equal to $ r$, which is extracted from MD*BASE and corresponds to the time to maturity used in the simulation, and $ N^{**}$ is the number of days to maturity. The first derivative of $ \sigma(\bullet)$ is approximated by:

$\displaystyle \frac{\partial \sigma}{\partial S} (S_{(i-1)/N^{**}}) = \frac{\sigma(S_{(i-1)/N^{**}}) - \sigma(S_{(i-1)/N^{**}} - \Delta S)}{\Delta S},$

where $ \Delta S$ is $ 1/2$ of the width of the bingrid on which the diffusion function is estimated. Finally, the estimated diffusion function is linearly extrapolated at both ends of the bingrid to accommodate potential outliers.
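
The simulation step can be sketched as follows. This is an illustrative implementation under stated assumptions, not the XploRe code: the function names are hypothetical, the diffusion function is supplied as values on a grid (if the estimate is on the $ \sigma^2$ scale, its square root has to be taken first), and the interest rate is passed as a decimal fraction per year.

```python
import numpy as np

def linear_sigma_interpolant(grid, values):
    """Piecewise-linear sigma(S) on the grid, linearly extrapolated at both ends."""
    grid = np.asarray(grid, dtype=float)
    values = np.asarray(values, dtype=float)
    left_slope = (values[1] - values[0]) / (grid[1] - grid[0])
    right_slope = (values[-1] - values[-2]) / (grid[-1] - grid[-2])

    def sigma(s):
        s = np.asarray(s, dtype=float)
        out = np.interp(s, grid, values)  # np.interp clamps outside the grid
        out = np.where(s < grid[0], values[0] + left_slope * (s - grid[0]), out)
        out = np.where(s > grid[-1], values[-1] + right_slope * (s - grid[-1]), out)
        return out

    return sigma

def milstein_terminal_values(s0, r, tau, n_steps, n_paths, sigma, delta_s, seed=0):
    """Simulate dS = r S dt + sigma(S) dW with the Milstein scheme; return S_T."""
    rng = np.random.default_rng(seed)
    dt = tau / n_steps
    s = np.full(n_paths, float(s0))
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)   # Brownian increment
        sig = sigma(s)
        dsig = (sig - sigma(s - delta_s)) / delta_s       # backward difference
        s = s + r * s * dt + sig * dw + 0.5 * sig * dsig * (dw**2 - dt)
    return s
```

The same Brownian increment enters both the diffusion term and the Milstein correction term within each step. With a diffusion function proportional to $ S$ the scheme should reproduce the mean and the log-return volatility of a geometric Brownian motion, which gives a simple check before plugging in the estimated function.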

With these ingredients we start the simulation with index value $ S_0=3328.41$ (Monday, April $ 21$, $ 1997$), time to maturity $ \tau =88/360$ and $ r=3.23$. The expiration date is Friday, July $ 18$, $ 1997$. From these simulated index values we calculate annualized log-returns, which we take as input of the nonparametric density estimation (see equation (9.5)). The XploRe quantlet denxest estimates the time series density by means of the Gaussian kernel function:

$\displaystyle K(u) = \frac{1}{\sqrt{2\pi}}\exp{\Big(-\frac{1}{2}u^2\Big)}.$

The bandwidth $ h_{MC}$ is computed by the XploRe quantlet denrot, which applies Silverman's rule of thumb.

First of all, we calculate the optimal bandwidth $ h_{MC}$ given the vector of $ 10,000$ simulated index values. Then we search for the bandwidth $ h'_{MC}$ for which the variance of $ g^*$ is closest to the variance of $ f^*$, subject to $ h'_{MC}$ staying within $ 0.5$ to $ 5$ times $ h_{MC}$. We stop the search if var($ g^*$) is within $ 5\%$ of var($ f^*$). Next, we translate $ g^*$ such that its mean matches the futures price $ F$. Finally, we transform this density over DAX index values $ S_T$ into a density $ g^{*\prime}$ over log-returns $ u_T$. Since

$ \textrm{P}(S_T < x) = \textrm{P}\Big(\ln{\big(\frac{S_T}{S_t}\big)} < \ln{\big(\frac{x}{S_t}\big)}\Big) = \textrm{P}(u_T < u)$

where $ x=S_te^u$, we have

$\displaystyle \textrm{P}(S_T \in [x,x+\Delta x]) = \textrm{P}(u_T \in [u,u+\Delta u])$

and

$\displaystyle \textrm{P}(S_T \in [x,x+\Delta x]) \approx g^*(x)\Delta x$
$\displaystyle \textrm{P}(u_T \in [u,u+\Delta u]) \approx g^{*\prime}(u)\Delta u.$

Therefore, we have as well (see Härdle and Simar (2002) for density transformation techniques)

$ g^{*\prime}(u) \approx \frac{g^*(S_te^u)\Delta (S_te^u)}{\Delta u} \approx g^*(S_te^u)S_te^u.$

To simplify notation, we denote both densities by $ g^*$. Figure 9.2 displays the resulting time series density over log-returns on Friday, April $ 18$, $ 1997$. Proceeding in the same way for all $ 30$ periods beginning in April $ 1997$ and ending in September $ 1999$, we obtain the time series of the $ 3$ month `forward' skewness and kurtosis values of $ g^*$ shown in Figures 9.3 and 9.4. The figures reveal that the time series distribution is systematically slightly negatively skewed, with skewness very close to zero. As far as kurtosis is concerned, Figure 9.4 shows that it is systematically smaller than, but nevertheless very close to, $ 3$. Additionally, all time series density plots looked like the one shown in Figure 9.2.
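
One possible way to obtain skewness and kurtosis of a kernel-smoothed density directly from the simulated log-returns is sketched below. It assumes a Gaussian kernel: the kernel density estimate is then the distribution of a draw from the sample plus independent Gaussian noise with standard deviation $ h_{MC}$, so the variance grows by $ h_{MC}^2$ while the third and fourth cumulants are those of the sample. The function name is hypothetical, and the text does not state how the reported moments were actually computed.

```python
import numpy as np

def kde_skewness_kurtosis(samples, bandwidth):
    """Skewness and kurtosis of a Gaussian KDE built on the samples."""
    x = np.asarray(samples, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered**2)
    m3 = np.mean(centered**3)          # third cumulant of the sample
    m4 = np.mean(centered**4)
    kappa4 = m4 - 3.0 * m2**2          # fourth cumulant of the sample
    var = m2 + bandwidth**2            # Gaussian noise adds h^2 to the variance
    skewness = m3 / var**1.5
    kurtosis = 3.0 + kappa4 / var**2
    return skewness, kurtosis
```

With a bandwidth of zero the function returns the ordinary sample skewness and kurtosis; a larger bandwidth pulls skewness towards $ 0$ and kurtosis towards $ 3$.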