3.9 Appendix. Assumptions and Remarks

Note that the observations of $ X$ should be centered before they are analyzed. For ease of exposition, we therefore assume that $ X$ is centered. Our proofs require the following conditions. (In all our theorems, weaker conditions can be adopted at the expense of much lengthier proofs.)

(C1)
$ \{ (X_i, y_i)\} $ is a stationary (with the same distribution as $ (X, y ) $) and absolutely regular sequence, i.e.
$\displaystyle \beta(k) = \sup_{i\ge 1} E\Big\{ \sup_{A \in {\cal F}_{i+k}^\infty } \vert P(A\vert{\cal F}_1^i) - P(A)\vert\Big\} \to 0 \quad \textrm{as}\quad k \to \infty,$

where $ {\cal F}_i^k $ denotes the $ \sigma$-field generated by $ \{ (X_\ell, y_\ell): i\le \ell \le k \} $. Further, $ \beta(k) $ decreases at a geometric rate.
(C2)
$ \textrm{E}\vert y\vert^k < \infty $ for all $ k > 0 $.
(C2$ '$)
$ \textrm{E}\Vert X\Vert^k < \infty $ for all $ k > 0 $.
(C3)
The density function $ f$ of $ X$ has a bounded fourth derivative and is bounded away from 0 in a neighborhood $ {{\cal D}}$ of 0.
(C3$ '$)
The density function $ f_y $ of $ y$ has a bounded derivative and is bounded away from 0 on a compact support.
(C4)
The conditional densities $ f_{X\vert y}(x\vert y) $ of $ X$ given $ y$ and $ f_{(X_0, X_l)\vert(y_0, y_l)} $ of $ (X_0, X_l) $ given $ (y_0, y_l) $ are bounded for all $ l \ge 1 $.
(C5)
$ {\sl g}$ has bounded, continuous $ (r+2) $th derivatives.
(C5$ '$)
$ \textrm{E}(X\vert y) $ and $ \textrm{E}(XX^{\top }\vert y) $ have bounded, continuous third derivatives.
(C6)
$ K(\cdot) $ is a spherically symmetric density function with a bounded derivative. All the moments of $ K(\cdot) $ exist.

Condition (C1) is imposed only to simplify the proofs; it can be weakened to $ \beta(k) = O(k^{-\iota}) $ for some $ \iota > 0 $. Many time series models, including the autoregressive single-index model (Xia and Li, 1999), satisfy assumption (C1). Assumption (C2) is also made to simplify the proofs; see, for example, Härdle, Hall, and Ichimura (1993). The existence of moments up to some finite order is sufficient. (C3) is needed for the uniform consistency of the kernel smoothing methods. Assumption (C4) is needed for kernel estimation with dependent data. Assumption (C5) is made to meet the continuity requirement for kernel smoothing. The kernel assumption (C6) is satisfied by most commonly used kernel functions. For ease of exposition, we further assume $ \int UU^{\top }K(U) dU = I $.
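
Two of the conventions used here, the centering of $ X$ and the normalization $ \int UU^{\top }K(U) dU = I $, are easy to check numerically. The following is a minimal sketch rather than part of the original exposition. It assumes a standard Gaussian kernel in two dimensions as one example of a kernel satisfying (C6); the function names `center`, `gaussian_kernel` and `kernel_second_moment`, the grid size and the integration range are all illustrative choices.

```python
import numpy as np


def center(X):
    """Subtract the sample mean from each column of X (an n x p array)."""
    X = np.asarray(X, dtype=float)
    return X - X.mean(axis=0)


def gaussian_kernel(U):
    """Standard Gaussian density on R^p, evaluated at the rows of U.

    Assumed here as an example of a spherically symmetric density
    with a bounded derivative and finite moments of all orders.
    """
    U = np.atleast_2d(U)
    p = U.shape[1]
    return np.exp(-0.5 * np.sum(U**2, axis=1)) / (2.0 * np.pi) ** (p / 2.0)


def kernel_second_moment(kernel, p=2, half_width=8.0, m=401):
    """Approximate the integral of U U^T K(U) dU by a Riemann sum.

    The sum runs over a regular grid on [-half_width, half_width]^p,
    so it is only practical for small p.
    """
    axis = np.linspace(-half_width, half_width, m)
    h = axis[1] - axis[0]                      # grid spacing
    grids = np.meshgrid(*([axis] * p), indexing="ij")
    U = np.stack([g.ravel() for g in grids], axis=1)   # shape (m**p, p)
    w = kernel(U) * h**p                       # kernel value times cell volume
    return (U * w[:, None]).T @ U              # sum of w_j * U_j U_j^T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(loc=3.0, size=(200, 2))     # synthetic, uncentered data
    Xc = center(X)
    print("column means after centering:", Xc.mean(axis=0))

    M = kernel_second_moment(gaussian_kernel)
    print("approximate second-moment matrix of K:")
    print(np.round(M, 4))
```

For a kernel whose second-moment matrix is $ cI $ with $ c \ne 1 $, the same check can be applied after rescaling the argument of the kernel by $ \sqrt{c} $, which restores the normalization assumed above.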