12.6 Goodness-of-Fit Statistic

To extend the empirical likelihood ratio statistic to a global measure of Goodness-of-Fit, we choose $ k_n$ equally spaced lattice points $ t_1, t_2, \cdots,t_{k_n}$ in $ [0,1]$ with $ t_1=0$, $ t_{k_n}=1$ and $ t_i < t_j$ for $ 1\le i < j \le k_n$. We let $ k_n \to \infty$ and $ k_n/n \to 0$ as $ n\to \infty$. This essentially divides $ [0,1]$ into $ k_n$ small bins of size $ k_n^{-1}$. A simple choice is $ k_n = [1/(2h)]$, where $ [a]$ denotes the integer part of $ a$. As justified later, this choice ensures asymptotic independence among the $ \ell\{\tilde{m}_{\hat{\theta}}(t_j)\}$ at different $ t_j$. Bins of different sizes can be adopted in situations where there are areas of low design density; this corresponds to the use of different bandwidth values in adaptive kernel smoothing. The main results of this chapter are not affected by unequal bins. For ease of presentation, we treat bins of equal size.
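As a small illustration, the following Python sketch constructs such a lattice from a given bandwidth $ h$ using the choice $ k_n=[1/(2h)]$; the function name lattice_points and the value of $ h$ are ours, chosen only for demonstration.

import numpy as np

# Minimal sketch (illustrative only): build the equally spaced lattice
# t_1, ..., t_{k_n} on [0, 1] for a given bandwidth h, using the simple
# choice k_n = [1/(2h)] described above.
def lattice_points(h):
    k_n = int(np.floor(1.0 / (2.0 * h)))        # number of lattice points
    return np.linspace(0.0, 1.0, k_n)           # t_1 = 0, ..., t_{k_n} = 1

t_grid = lattice_points(h=0.05)                 # 10 lattice points when h = 0.05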

As $ \ell\{\tilde{m}_{\hat{\theta}}(t_j)\}$ measures the Goodness-of-Fit at a fixed $ t_j$, an empirical likelihood based statistic that measures the global Goodness-of-Fit is defined as

$\displaystyle \ell_n(\tilde{m}_{\hat{\theta}}) \stackrel{\mathrm{def}}{=}\sum^{k_n}_{j=1} \ell\{\tilde{m}_{\hat{\theta}}(t_j)\}.$
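In computational terms the global statistic is simply a sum of the pointwise statistics over the lattice. The sketch below assumes a routine ell_pointwise that returns the pointwise log empirical likelihood ratio of the previous section; that routine is a placeholder, not something defined here.

# Hedged sketch: ell_pointwise(t) is assumed to return the pointwise log
# empirical likelihood ratio l{m_tilde_theta_hat(t)} of the previous section;
# the global Goodness-of-Fit statistic l_n is its sum over the lattice points.
def ell_n(ell_pointwise, t_grid):
    return sum(ell_pointwise(tj) for tj in t_grid)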

The following theorem was proven by Chen et al. (2001).

THEOREM 12.2   Under the assumptions (i) - (vi),

$\displaystyle k_n^{-1}\ell_n(\tilde{m}_{\hat{\theta}}) = (nh) \int { \lbrace \hat{m}(x) - \tilde{m}_{\hat{\theta}}(x)\rbrace^2 \over V(x)}\, dx + {\mathcal{O}}_p\{k_n^{-1} \log^2(n) + h \log^2(n) \}$ (12.20)

where $ V(x) \stackrel{\mathrm{def}}{=}\lim_{h \rightarrow 0} V(x,h)$.

Härdle and Mammen (1993) proposed the $ L_2$ distance

$\displaystyle T_n =n h^{1/2} \int \lbrace \hat{m}(x) - \tilde{m}_{\hat{\theta}}(x)\rbrace^2 \pi(x) dx$

as a measure of Goodness-of-Fit, where $ \pi(x)$ is a given weight function. Theorem 12.2 indicates that the leading term of $ k_n^{-1}\ell_n(\tilde{m}_{\hat{\theta}})$ is $ h^{1/2} T_n$ with $ \pi(x)=V^{-1}(x)$. The differences between the two test statistics are that (a) the empirical likelihood test statistic studentizes automatically via its internal optimization carried out in the background, so that there is no need to explicitly estimate $ V(x)$; and (b) the empirical likelihood statistic is able to capture other features, such as skewness and kurtosis, exhibited in the data without resorting to bootstrap resampling, which involves more technical details when the data are dependent. If we choose $ k_n = [1/(2h)]$ as prescribed, then the remainder term in (12.20) becomes $ {\mathcal{O}}_p\{h\log^2(n)\}$.
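For comparison, $ T_n$ can be approximated directly once both fits have been evaluated on a fine grid. The following sketch uses the trapezoidal rule and defaults to the weight $ \pi(x)\equiv 1$; all argument names are our own.

import numpy as np

# Sketch of the Haerdle-Mammen L2 statistic T_n: m_hat_vals and m_tilde_vals
# are the nonparametric and the smoothed parametric fits evaluated on `grid`,
# n is the sample size and h the bandwidth; pi_vals is the weight function.
def T_n(m_hat_vals, m_tilde_vals, grid, n, h, pi_vals=None):
    if pi_vals is None:
        pi_vals = np.ones_like(grid)                    # pi(x) = 1 by default
    integrand = (m_hat_vals - m_tilde_vals) ** 2 * pi_vals
    return n * np.sqrt(h) * np.trapz(integrand, grid)   # n h^{1/2} * integral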

We will now discuss the asymptotic distribution of the test statistic $ \ell_n(\tilde{m}_{\hat{\theta}})$. Theorem 12.3 was proven by Chen et al. (2001).

THEOREM 12.3   Suppose that assumptions (i) - (vi) hold. Then

$\displaystyle k_n^{-1} \ell_n(\tilde{m}_{\hat{\theta}})
\stackrel{ {\cal{L}} }\to \int_0^1 {\cal{N}}^2(s) ds$

where $ {\cal{N}}$ is a Gaussian process on $ [0,1]$ with mean

$\displaystyle \textrm{E}\{{\cal{N}}(s)\}= h^{1/4} \Delta_n(s)/\sqrt{V(s)}$

and covariance

$\displaystyle \Omega(s,t) = \mathop{\hbox{Cov}}\{ {\cal{N}}(s),{\cal{N}}(t)\} = \sqrt{ f(s)\,\sigma^2(s) \over f(t)\,\sigma^2(t) }\; { W_{0}^{(2)}(s,t) \over \sqrt{W_{0}^{(2)}(s,s)\, W_{0}^{(2)}(t,t)} }$

where
$\displaystyle W_{0}^{(2)}(s,t) = \int_0^1 h^{-1} K\{ (s-y)/h\} K\{ (t-y)/h\} dy.$     (12.21)

As $ K$ is a kernel with compact support on $ [-1,1]$, when both $ s$ and $ t$ are in $ S_{I,h}$ (the interior part of $ [0,1]$), we get from (12.21), with the substitution $ u = (s-y)/h$,

$\displaystyle W_{0}^{(2)}(s,t) = \int_{\frac{s-1}{h}}^{\frac{s}{h}} K(u)\, K\{u-(s-t)/h\}\, du = \int_{-\infty}^{\infty} K(u)\, K\{u-(s-t)/h\}\, du = K^{(2)}\left({s-t\over h}\right)$ (12.22)

where $ K^{(2)}$ is the convolution of $ K$ with itself. The compact support of $ K$ also means that $ W_{0}^{(2)}(s,t) =0$ if $ \vert s-t\vert > 2h$, which implies $ \Omega(s,t) = 0$ if $ \vert s-t\vert > 2h$. Hence $ {\cal{N}}(s)$ and $ {\cal{N}}(t)$ are independent if $ \vert s-t\vert > 2h$. As

$\displaystyle f(s)\, \sigma^2(s) = f(t)\,\sigma^2(t) + {\mathcal{O}}(h)$

when $ \vert s-t\vert\le 2 h$, we get

$\displaystyle \Omega(s,t) = { W_{0}^{(2)}(s,t) \over \sqrt{W_{0}^{(2)}(s,s) W_{0}^{(2)}(t,t)} } + {\mathcal{O}}(h).$ (12.23)

So, the leading order of the covariance function is free of $ \sigma^2$ and $ f$, i.e. $ \Omega(s,t)$ is completely known.
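The kernel functionals appearing in (12.21) and (12.22) are straightforward to evaluate numerically. The sketch below computes $ W_{0}^{(2)}(s,t)$ by the trapezoidal rule for the quartic kernel; the kernel choice and grid size are illustrative assumptions. In the interior it agrees with $ K^{(2)}\{(s-t)/h\}$ and vanishes once $ \vert s-t\vert > 2h$, which is the source of the asymptotic independence exploited by the lattice spacing.

import numpy as np

# Illustrative sketch: W_0^(2)(s, t) from (12.21), evaluated by the trapezoidal
# rule for the quartic (biweight) kernel; the kernel choice is an assumption.
def K(u):
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u ** 2) ** 2, 0.0)

def W0_2(s, t, h, m=2001):
    y = np.linspace(0.0, 1.0, m)
    return np.trapz(K((s - y) / h) * K((t - y) / h) / h, y)

# For interior s, t this agrees with the convolution K^(2){(s - t)/h} and is
# zero once |s - t| > 2h.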

Let

$\displaystyle {\cal{N}}_0(s) = {\cal{N}}(s) - \frac{h^{1/4} \Delta_n(s)}{\sqrt{V(s)}}.$ (12.24)

Then $ {\cal{N}}_0(s)$ is a Gaussian process with zero mean and covariance $ \Omega$. The boundedness of $ K$ implies that $ W_0^{(2)}$ is bounded, and hence $ \int_0^1 \Omega(t,t)\,dt < \infty$. We will now study the expectation and variance of $ \int_0^1 {\cal{N}}^2(s)\, ds$. Let $ T \stackrel{\mathrm{def}}{=}\int_0^1 {\cal{N}}^2(s)\, ds = T_1 + T_2 + T_3$ where
$\displaystyle T_1 = \int_0^1 {\cal{N}}_0^2(s)\, ds,$
$\displaystyle T_2 = 2 h^{1/4} \int_0^1 V^{-1/2}(s) \Delta_n(s) {\cal{N}}_0(s)\, ds \quad \hbox{and}$
$\displaystyle T_3 = h^{1/2} \int_0^1 V^{-1}(s) \Delta_n^2(s)\, ds.$

From some basic results on stochastic integrals, Lemma 12.2 and (12.23), it follows that
$\displaystyle \textrm{E}(T_1) = \int_0^1 \Omega(s,s)\, ds = 1 \quad \hbox{and}$
$\displaystyle \textrm{Var}(T_1) = \textrm{E}(T_1^2) - 1$ (12.25)
$\displaystyle = \int_0^1 \int_0^1 \textrm{E}\{ {\cal{N}}_0^2(s)\, {\cal{N}}_0^2(t) \}\, ds\, dt - 1$ (12.26)
$\displaystyle = 2 \int_0^1 \int_0^1 \Omega^2(s,t)\, ds\, dt$
$\displaystyle = 2 \int_0^1 \int_0^1 \lbrace W_{0}^{(2)}(s,t)\rbrace^2 \{ W_{0}^{(2)}(s,s)\, W_{0}^{(2)}(t,t)\}^{-1}\, ds\, dt \, \{ 1 + {\mathcal{O}}(h^2)\},$

where the third equality uses the normal fourth-moment identity $ \textrm{E}\{ {\cal{N}}_0^2(s) {\cal{N}}_0^2(t) \} = \Omega(s,s)\Omega(t,t) + 2\Omega^2(s,t)$ together with $ \Omega(s,s)=1$.

From (12.22) and the fact that the size of the region $ [0,1] \setminus S_{I,h}$ is $ {\mathcal{O}}(h)$, we have
$\displaystyle \int_0^1 \int_0^1 \lbrace W_{0}^{(2)}(s,t)\rbrace^2 \{ W_{0}^{(2)}(s,s)\, W_{0}^{(2)}(t,t)\}^{-1}\, ds\, dt$
$\displaystyle = \{K^{(2)}(0)\}^{-2} \int_0^1 \int_0^1 [ K^{(2)}\{(s-t)/h\}]^2\, ds\, dt \, \{ 1 + {\scriptstyle \mathcal{O}}(1)\}$
$\displaystyle = h\, K^{(4)}(0)\, \{K^{(2)}(0)\}^{-2} + {\scriptstyle \mathcal{O}}(h)$

where $ K^{(4)}$ denotes the four-fold convolution of $ K$, so that $ K^{(4)}(0)=\int \{K^{(2)}(v)\}^2\, dv$. Therefore,

$\displaystyle \textrm{Var}(T_1) = 2 h K^{(4)}(0) \{K^{(2)}(0)\}^{-2} + {\scriptstyle \mathcal{O}}(h^{2}).$
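These two moments can be checked by a small simulation: draw the Gaussian process $ {\cal{N}}_0$ with the leading-order covariance $ \Omega(s,t) = K^{(2)}\{(s-t)/h\}/K^{(2)}(0)$ on a fine grid and compare the sample mean and variance of $ \int_0^1 {\cal{N}}_0^2(s)\, ds$ with $ 1$ and $ 2hK^{(4)}(0)\{K^{(2)}(0)\}^{-2}$. The sketch below does this for the Epanechnikov kernel; kernel, grid size and the number of replications are arbitrary choices, not taken from the chapter.

import numpy as np
from scipy.linalg import toeplitz

# Monte Carlo sketch (illustration only): simulate N_0 with the leading-order
# covariance Omega(s, t) = K^(2){(s - t)/h} / K^(2)(0) on a grid and check the
# moments of T_1 = int_0^1 N_0^2(s) ds derived above.
rng = np.random.default_rng(0)
h, m, reps = 0.05, 400, 2000
s = np.linspace(0.0, 1.0, m)

def K(u):                                         # Epanechnikov kernel
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def K2(v):                                        # K^(2)(v) = int K(u) K(u - v) du
    u = np.linspace(-1.0, 1.0, 2001)
    return np.trapz(K(u) * K(u - v), u)

k2_row = np.array([K2(d / h) for d in s - s[0]])  # K^(2) at all grid distances
Omega = toeplitz(k2_row) / K2(0.0)                # leading-order covariance matrix
w, U = np.linalg.eigh(Omega)
A = U * np.sqrt(np.clip(w, 0.0, None))            # matrix square root, A A' = Omega

T1 = np.array([np.trapz((A @ rng.standard_normal(m)) ** 2, s) for _ in range(reps)])

v = np.linspace(-2.0, 2.0, 801)
K4_0 = np.trapz(np.array([K2(x) ** 2 for x in v]), v)   # K^(4)(0) = int {K^(2)(v)}^2 dv
print(T1.mean(), T1.var(), 2.0 * h * K4_0 / K2(0.0) ** 2)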

It is obvious that $ \textrm{E}(T_2) =0$ and
$\displaystyle \textrm{Var}(T_2) = 4 h^{1/2} \int\!\!\int V^{-1/2}(s)\, \Delta_n(s)\, \Omega(s,t)\, V^{-1/2}(t)\, \Delta_n(t)\, ds\, dt.$

As $ \Delta_n$ and $ V^{-1}$ are bounded in $ [0,1]$, there exists a constant $ C_1$ such that

$\displaystyle \textrm{Var}(T_2) \le C_1 h^{1/2} \int \int \Omega(s,t) ds dt.$

Furthermore, we know from the discussion above that

$\displaystyle \int\!\!\int \Omega(s,t)\, ds\, dt = \int\!\!\int \frac{ W_{0}^{(2)}(s,t)}{ \sqrt{W_{0}^{(2)}(s,s)\, W_{0}^{(2)}(t,t)}}\, ds\, dt + {\mathcal{O}}(h)$
$\displaystyle = \int\!\!\int_{t-2h}^{t+2h} \frac{ W_{0}^{(2)}(s,t)}{ K^{(2)}(0) }\, ds\, dt + {\mathcal{O}}(h) \le \frac{4 C_1'}{K^{(2)}(0)}\, h + C_1''\, h$

for some constants $ C_1'$ and $ C_1''$, and thus there exists a constant $ C_2$ such that

$\displaystyle \textrm{Var}(T_2) \le C_2 h^{\frac{3}{2}}. $

As $ T_3$ is non-random, we have

$\displaystyle \textrm{E}(T) = 1 + h^{1/2} \int_0^1 V^{-1}(s) \Delta_n^2(s)\, ds \quad \hbox{and}$ (12.27)
$\displaystyle \textrm{Var}(T) = 2 h K^{(4)}(0) \{K^{(2)}(0)\}^{-2} + {\scriptstyle \mathcal{O}}(h).$ (12.28)

Equations (12.27) and (12.28), together with Theorem 12.3, give the asymptotic expectation and variance of the test statistic $ k_n^{-1}\ell_n(\tilde{m}_{\hat{\theta}})$.
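In practice these moments suggest studentizing $ k_n^{-1}\ell_n(\tilde{m}_{\hat{\theta}})$ by its null ($ \Delta_n = 0$) mean $ 1$ and the leading variance term. The sketch below does exactly that; the function name and interface are ours, and this need not coincide with any normalization used elsewhere in the chapter.

import numpy as np

# Hedged sketch: studentize the Goodness-of-Fit statistic using the asymptotic
# null mean 1 from (12.27) (with Delta_n = 0) and the leading variance term
# 2 h K^(4)(0) {K^(2)(0)}^(-2) from (12.28).  K2_0 and K4_0 are the kernel
# constants K^(2)(0) and K^(4)(0); large values indicate lack of fit.
def studentized_statistic(ell_n_value, k_n, h, K2_0, K4_0):
    mean_null = 1.0
    var_lead = 2.0 * h * K4_0 / K2_0 ** 2
    return (ell_n_value / k_n - mean_null) / np.sqrt(var_lead)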