Next: References
Up: 10. Semiparametric Models
Previous: 10.3 The Proportional Hazards
10.4 A Binary Response Model
The general binary response model has the form

$$Y = I(X'\beta + U > 0), \tag{10.20}$$

where $I(\cdot)$ is the indicator function and $U$ is an unobserved
random variable. If the distribution of $U$ is unknown but depends on
$X$ only through the index $X'\beta$, then (10.20) is a single-index
model, and $\beta$ can be estimated by the methods described in
Sect. 10.2.1. An alternative model that is non-nested
with single-index models can be obtained by assuming that
$\mathrm{median}(U \mid X = x) = 0$ for all $x$. This assumption places
only weak restrictions on the relation between $X$ and the
distribution of $U$. Among other things, it accommodates fairly
general types of heteroskedasticity of unknown form, including random
coefficients. Under median centering, the inferential problem is to
estimate $\beta$. The response function, $P(Y = 1 \mid X = x)$, is not
identified without making assumptions about the distribution of $U$
that are stronger than those needed to identify and
estimate $\beta$. Without such assumptions, the only restriction on
$P(Y = 1 \mid X = x)$ under median centering is that

$$P(Y = 1 \mid X = x) \begin{cases} > 0.5 & \text{if } x'\beta > 0, \\ = 0.5 & \text{if } x'\beta = 0, \\ < 0.5 & \text{if } x'\beta < 0. \end{cases}$$
Manski (1975, 1985) proposed the first estimator of $\beta$ under
median centering. Let the data be the simple random sample
$\{Y_i, X_i : i = 1, \ldots, n\}$. The estimator is called the maximum
score estimator and is

$$b_n = \arg\max_{\|b\| = 1} \; n^{-1} \sum_{i=1}^{n} (2Y_i - 1)\, I(X_i'b \ge 0), \tag{10.21}$$

where $\|b\|$ denotes the Euclidean norm of the vector $b$. The
restriction $\|b\| = 1$ is a scale normalization. Scale normalization
is needed for identification because (10.20) identifies $\beta$ only
up to scale. Manski (1975, 1985) gave conditions under which $b_n$
consistently estimates $\beta$. The rate of convergence of $b_n$ and
its asymptotic distribution were derived by Cavanagh (1987) and Kim
and Pollard (1990). They showed that the rate of convergence in
probability of $b_n$ to $\beta$ is $n^{-1/3}$ and that
$n^{1/3}(b_n - \beta)$ converges in distribution to the maximum of
a complicated multidimensional stochastic process. The complexity of
the limiting distribution of the maximum score estimator limits its
usefulness for statistical inference. Delgado, Rodríguez-Póo and Wolf
(2001) proposed using subsampling methods to form confidence intervals
for $\beta$.
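The maximization in (10.21) can be illustrated with a small numerical sketch. The code below is only a minimal illustration, not Manski's computational procedure: it assumes two regressors, so that the unit-norm constraint can be handled by a grid search over angles on the unit circle. The function name `maximum_score`, the grid size, and the simulated data-generating process are all hypothetical choices made for this example.

```python
import numpy as np


def maximum_score(y, x, n_angles=720):
    """Grid-search maximum score estimate for a two-regressor model.

    Candidates b = (cos t, sin t) lie on the unit circle, so the scale
    normalization ||b|| = 1 holds by construction. The objective is the
    sample score n^{-1} * sum_i (2*Y_i - 1) * I(X_i'b >= 0).
    """
    angles = np.linspace(-np.pi, np.pi, n_angles, endpoint=False)
    candidates = np.column_stack([np.cos(angles), np.sin(angles)])
    index = x @ candidates.T              # (n, n_angles) values of X_i'b
    signs = 2 * y - 1                     # +1 if Y_i = 1, -1 if Y_i = 0
    scores = (signs[:, None] * (index >= 0)).mean(axis=0)
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])


# Simulated data (illustrative assumption): errors are symmetric about
# zero given X, so median(U | X) = 0, but their scale depends on X.
rng = np.random.default_rng(0)
n = 2000
beta = np.array([0.6, 0.8])               # true direction, ||beta|| = 1
x = rng.normal(size=(n, 2))
u = rng.normal(size=n) * (0.5 + np.abs(x[:, 0]))
y = (x @ beta + u > 0).astype(int)

b_hat, score = maximum_score(y, x)
print("estimate:", b_hat, "score:", score)
```

Because the objective is a step function of $b$, many nearby candidates can attain the same score, and the grid search simply returns the first maximizer it finds. With more than two regressors a grid over the unit sphere quickly becomes impractical, and a different search strategy would be needed.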
The maximum score estimator has a slow rate of convergence and
a complicated asymptotic distribution because it is obtained by
maximizing a step function. Horowitz (1992) proposed replacing the
indicator function in (10.21) by a smooth function. The
resulting estimator of $\beta$ is called the smoothed maximum
score estimator.
Specifically, let $K$ be a smooth function, possibly
but not necessarily a distribution function, that satisfies
$K(-\infty) = 0$ and $K(\infty) = 1$. Let
$\{h_n : n = 1, 2, \ldots\}$ be a sequence
of strictly positive constants (bandwidths) that satisfies
$h_n \to 0$ as $n \to \infty$. The smoothed maximum score estimator,
$b_{ns}$, is

$$b_{ns} = \arg\max_{b \in B} \; n^{-1} \sum_{i=1}^{n} (2Y_i - 1)\, K\!\left(\frac{X_i'b}{h_n}\right),$$

where $B$ is a compact parameter set that satisfies the scale
normalization $|b_1| = 1$. Horowitz (1992) shows that under
assumptions that are stronger than those of Manski (1975, 1985) but
still quite weak, $n^{r}(b_{ns} - \beta)$ is asymptotically normal,
where $2/5 \le r < 1/2$ and the exact value of $r$ depends on the
smoothness of the distribution of $X'\beta$ and of
$P(Y = 1 \mid X = x)$. Moreover, the smoothed maximum score estimator
has the fastest possible rate of convergence under its assumptions
(Horowitz 1993b). Monte Carlo evidence suggests that the asymptotic
normal approximation can be inaccurate with samples of practical
size. However, Horowitz (2002) shows that the bootstrap, which is
implemented by sampling the data randomly with replacement, provides
asymptotic refinements for tests of hypotheses about $\beta$ and
produces low errors in rejection probability (ERPs) for these tests.
Thus, the bootstrap provides a practical way to carry out inference
with the smoothed maximum score estimator.
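As a rough illustration of the smoothed objective, the sketch below replaces the indicator function with the logistic distribution function, which is one smooth function that tends to 0 at $-\infty$ and to 1 at $+\infty$. Several choices here are assumptions made only for the example and are not taken from Horowitz (1992): the logistic choice of the smoothing function, the bandwidth $n^{-1/5}$, the restriction to two regressors with the first coefficient fixed at $+1$ (sign treated as known), the grid search, and the simulated data.

```python
import numpy as np


def smooth_k(v):
    """Logistic CDF, written with tanh for numerical stability."""
    return 0.5 * (1.0 + np.tanh(0.5 * v))


def smoothed_maximum_score(y, x, h, grid):
    """Grid-search smoothed maximum score for b = (1, theta).

    The first coefficient is fixed at 1 (scale normalization), and theta
    ranges over `grid`, which plays the role of a compact parameter set.
    """
    b = np.column_stack([np.ones_like(grid), grid])   # (m, 2) candidates
    index = x @ b.T                                   # (n, m) values of X_i'b
    signs = 2 * y - 1
    objective = (signs[:, None] * smooth_k(index / h)).mean(axis=0)
    best = int(np.argmax(objective))
    return float(grid[best]), float(objective[best])


# Simulated data (illustrative assumption): median-zero errors whose
# scale depends on the first regressor.
rng = np.random.default_rng(1)
n = 4000
beta = np.array([1.0, 0.5])                # first coefficient normalized to 1
x = rng.normal(size=(n, 2))
u = rng.normal(size=n) * (0.25 + 0.5 * np.abs(x[:, 0]))
y = (x @ beta + u > 0).astype(int)

h = n ** (-1 / 5)                          # hypothetical bandwidth choice
grid = np.linspace(-3.0, 3.0, 601)
theta_hat, value = smoothed_maximum_score(y, x, h, grid)
print("theta_hat:", theta_hat, "objective:", value)
```

Because the smoothed objective is differentiable in the free coefficient, a gradient-based optimizer could replace the grid search; the grid is used here only to keep the example dependency-free and easy to check.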
Horowitz (1993c) used the smoothed
maximum score method to estimate the parameters of a model of the
choice between automobile and transit for work trips in the
Washington, D.C., area. The explanatory variables are defined in
Table 10.2. Scale normalization is achieved by setting
the coefficient of DCOST equal to . The data consist of 842
observations sampled randomly from the Washington, D.C., area
transportation study. Each record contains information about a single
trip to work, including the chosen mode (automobile or transit) and
the values of the explanatory variables. Column 2 of
Table 10.2 shows the smoothed maximum score estimates of
the model's parameters. Column 3 shows the half-widths of nominal
symmetrical confidence intervals based on the asymptotic
normal approximation (half width equals times the standard
error of the estimate). Column 4 shows half-widths obtained from the
bootstrap. The bootstrap confidence intervals are - times
wider than the intervals based on the asymptotic normal
approximation. The bootstrap confidence interval for the coefficient
of DOVTT contains 0, but the confidence interval based on the
asymptotic normal approximation does not. Therefore, the hypothesis
that the coefficient of DOVTT is zero is not rejected at the
level based on the bootstrap but is rejected based on the asymptotic
normal approximation.
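The resampling step behind bootstrap half-widths of this kind can be sketched as follows. This is a deliberately simplified illustration on simulated data, not a reproduction of Table 10.2 or of the procedure in Horowitz (2002): it forms an unstudentized symmetric percentile interval, so it should not be expected to deliver the asymptotic refinements discussed above. The 90% level, the number of bootstrap replications, the bandwidth, the logistic smoothing function, and the helper name `sms_estimate` are all assumptions made for this example.

```python
import numpy as np


def sms_estimate(y, x, h, grid):
    """Smoothed maximum score estimate of theta in b = (1, theta).

    Uses the logistic CDF as the smoothing function and a grid search.
    """
    b = np.column_stack([np.ones_like(grid), grid])
    k = 0.5 * (1.0 + np.tanh(0.5 * (x @ b.T) / h))
    objective = ((2 * y - 1)[:, None] * k).mean(axis=0)
    return float(grid[np.argmax(objective)])


# Simulated data (illustrative assumption, not the transportation data).
rng = np.random.default_rng(2)
n = 1000
beta = np.array([1.0, 0.5])
x = rng.normal(size=(n, 2))
u = rng.normal(size=n) * (0.25 + 0.5 * np.abs(x[:, 0]))
y = (x @ beta + u > 0).astype(int)

h = n ** (-1 / 5)                          # hypothetical bandwidth choice
grid = np.linspace(-2.0, 3.0, 501)
theta_hat = sms_estimate(y, x, h, grid)

# Bootstrap: resample (Y_i, X_i) pairs with replacement and re-estimate.
n_boot = 200
deviations = np.empty(n_boot)
for r in range(n_boot):
    idx = rng.integers(0, n, size=n)
    deviations[r] = abs(sms_estimate(y[idx], x[idx], h, grid) - theta_hat)

# Half-width of a symmetric interval: theta_hat +/- half_width.
level = 0.90                               # illustrative level only
half_width = float(np.quantile(deviations, level))
print("theta_hat:", theta_hat, "bootstrap half-width:", half_width)
```

A hypothesis that the coefficient equals zero would then be rejected at the chosen level only if zero lies outside the interval from `theta_hat - half_width` to `theta_hat + half_width`.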
Acknowledgements. Research supported in part by NSF Grant SES-9910925.