10.3 The Proportional Hazards Model with Unobserved Heterogeneity
Let $T$ denote a duration such as that of a spell of employment or
unemployment. Let $F(t \mid x) = P(T \le t \mid X = x)$, where $X$ is a
vector of covariates. Let $f(t \mid x)$ denote the corresponding
conditional probability density function. The conditional hazard
function is defined as
$$\lambda(t \mid x) = \frac{f(t \mid x)}{1 - F(t \mid x)} .$$
This section is concerned with an approach to modeling
$\lambda(t \mid x)$ that is based on the proportional hazards model of
Cox (1972).
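
As a small numerical illustration of the hazard definition, the sketch below evaluates $f(t)/[1 - F(t)]$ for a Weibull duration distribution with unit scale, for which the hazard is known in closed form to be $k t^{k-1}$. The Weibull choice, the shape value `k`, and the function names are assumptions made only for this example; no covariates are involved.

```python
import math

def hazard(pdf, cdf, t):
    """Hazard f(t) / (1 - F(t)) at duration t."""
    return pdf(t) / (1.0 - cdf(t))

# Assumed illustration: Weibull durations with shape k and unit scale.
def weibull_pdf(t, k):
    return k * t ** (k - 1) * math.exp(-t ** k)

def weibull_cdf(t, k):
    return 1.0 - math.exp(-t ** k)

k = 1.5
for t in (0.5, 1.0, 2.0):
    h = hazard(lambda s: weibull_pdf(s, k), lambda s: weibull_cdf(s, k), t)
    # The last two printed values should agree: numerical vs. closed-form hazard.
    print(t, h, k * t ** (k - 1))
```

With `k = 1` the distribution is exponential and the hazard is constant, which is the usual benchmark of no duration dependence.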
The proportional hazards model is widely used for the analysis of
duration data. Its form is
$$\lambda(t \mid x) = \lambda_0(t)\, e^{-x'\beta} , \tag{10.16}$$
where $\beta$ is a vector of constant parameters that is conformable
with $X$ and $\lambda_0$ is a non-negative function that is called
the baseline hazard function. The essential characteristic of
(10.16) that distinguishes it from other models is that
$\lambda(t \mid x)$ is the product of a function of $t$ alone and
a function of $x$ alone. Cox (1972) developed a partial likelihood
estimator of $\beta$ and a nonparametric estimator of
$\lambda_0$. Tsiatis (1981) derived the asymptotic properties of these
estimators.
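
The partial likelihood idea can be sketched in a few lines. The example below is a minimal illustration, not Cox's or Tsiatis's full procedure: it assumes a scalar covariate, no ties, no censoring, a unit baseline hazard in the simulated data, and the sign convention in which the hazard is $\lambda_0(t) e^{-x\beta}$. It maximizes the partial log-likelihood by a crude grid search. All names (`partial_loglik`, `beta_true`, `grid`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta_true = 2000, 1.0

# Simulated data (assumption): scalar covariate and lambda_0(t) = 1, so that
# T given X = x is exponential with rate exp(-x * beta_true).
x = rng.normal(size=n)
t = rng.exponential(size=n) * np.exp(x * beta_true)

def partial_loglik(b, t, x):
    """Partial log-likelihood for hazard lambda_0(t) * exp(-x * b); no ties, no censoring."""
    order = np.argsort(t)              # ascending durations
    eta = -x[order] * b                # log relative risk of each observation
    # Risk set of the i-th shortest duration: observations i, i+1, ..., n-1.
    risk_sums = np.cumsum(np.exp(eta)[::-1])[::-1]
    return np.sum(eta - np.log(risk_sums))

# Grid search over candidate values of b (a Newton step would be used in practice).
grid = np.linspace(0.0, 2.0, 201)
b_hat = grid[np.argmax([partial_loglik(b, t, x) for b in grid])]
print(b_hat)
```

The baseline hazard never enters `partial_loglik`, which is what allows $\beta$ to be estimated without specifying $\lambda_0$.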
In the proportional hazards model with unobserved heterogeneity, the
hazard function is conditioned on the covariates $X$ and an unobserved
random variable $U$ that is assumed to be independent of $X$. The form
of the model is
$$\lambda(t \mid x, u) = \lambda_0(t)\, e^{-(x'\beta + u)} , \tag{10.17}$$
where $\lambda(\cdot \mid x, u)$ is the hazard conditional on $X = x$
and $U = u$. In a model of the duration of employment, $U$
might represent unobserved attributes of an individual (possibly
ability) that affect employment duration. A variety of estimators
of $\lambda_0$ and $\beta$ have been proposed under the assumption
that $\lambda_0$ or the distribution of $U$ or both are known up
to a finite-dimensional parameter. See, for example, Lancaster
(1979), Heckman and Singer (1984a), Meyer (1990), Nielsen
et al. (1992), and Murphy (1994, 1995).
However, $\lambda_0$ and the distribution of $U$ are
nonparametrically identified (Elbers and
Ridder 1982, Heckman and Singer 1984b), which suggests that they
can be estimated nonparametrically.
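Horowitz (1999) describes a nonparametric estimator of $\lambda_0$
and the density of $U$ in model (10.17). His estimator is
based on expressing (10.17) as a type of transformation
model. To do this, define the integrated baseline hazard function,
$\Lambda_0$, by
$$\Lambda_0(t) = \int_0^t \lambda_0(\tau)\, d\tau .$$
Then it is not difficult to show that (10.17) is equivalent
to the transformation model
$$\log \Lambda_0(T) = X'\beta + U + \varepsilon , \tag{10.18}$$
where $\varepsilon$ is a random variable that is independent of $X$
and $U$ and has the CDF $F_\varepsilon(y) = 1 - \exp(-e^{y})$.
Now define $\sigma = |\beta_1|$, where $\beta_1$ is the first
component of $\beta$ and is assumed to be non-zero. Then
$\beta/\sigma$ and $H = \sigma^{-1} \log \Lambda_0$ can be estimated
by using the methods of Sect. 10.2.4. Denote the
resulting estimators of $\beta/\sigma$ and $H$ by $\alpha_n$ and
$H_n$. If $\sigma$ were known, then $\beta$ and $\Lambda_0$ could
be estimated by $\beta_n = \sigma \alpha_n$ and
$\Lambda_{n0} = \exp(\sigma H_n)$. The baseline hazard function
$\lambda_0$ could be estimated by differentiating $\Lambda_{n0}$.
Thus, it is necessary
only to find an estimator of the scale parameter $\sigma$.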
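
The equivalence between the heterogeneity model and the transformation model can be checked by simulation. The sketch below is only an illustration under assumed inputs: $\Lambda_0(t) = t^2$, normally distributed covariates and heterogeneity, and a hypothetical coefficient vector `beta`. It draws durations whose conditional hazard is $\lambda_0(t) e^{-(x'\beta + u)}$ by inverse-transform sampling, using the fact that $\Lambda_0(T) e^{-(x'\beta + u)}$ is then unit exponential. It then compares the empirical CDF of $\log \Lambda_0(T) - X'\beta - U$ with $1 - \exp(-e^{y})$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = np.array([1.0, -0.5])            # hypothetical coefficients

x = rng.normal(size=(n, 2))
u = rng.normal(scale=0.5, size=n)       # unobserved heterogeneity, independent of x

# Assumed baseline: Lambda_0(t) = t**2, i.e. lambda_0(t) = 2t.
# Given (x, u), Lambda_0(T) * exp(-(x'beta + u)) is unit exponential.
e = rng.exponential(size=n)
t = np.sqrt(e * np.exp(x @ beta + u))

# Residual of the transformation model: log Lambda_0(T) - X'beta - U.
eps = np.log(t ** 2) - x @ beta - u

# Empirical CDF of the residual next to the extreme-value CDF 1 - exp(-exp(y)).
for y in (-2.0, -1.0, 0.0, 1.0):
    print(y, np.mean(eps <= y), 1.0 - np.exp(-np.exp(y)))
```

Dividing the transformation model through by $\sigma = |\beta_1|$ gives a model whose first slope coefficient has absolute value one, which is the normalization under which $\beta/\sigma$ and $H$ are estimated.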
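To do this, define $Z = X'\beta/\sigma$, and let $G(\cdot \mid z)$
denote the CDF of $T$ conditional on $Z = z$. It can be shown that
$$G(t \mid z) = 1 - \int \exp\left[-\Lambda_0(t)\, e^{-(\sigma z + u)}\right] dF(u) ,$$
where $F$ is the CDF of $U$. Let $p$ denote the probability density
function of $Z$. Define $G_z(t \mid z) = \partial G(t \mid z)/\partial z$
and
$$\sigma(t) = -\frac{\int G_z(t \mid z)\, p(z)^2\, dz}{\int G(t \mid z)\, p(z)^2\, dz} .$$
Then it can be shown using l'Hospital's rule that if
$\Lambda_0(t) > 0$ for all $t > 0$, then
$$\sigma = \lim_{t \to 0} \sigma(t) .$$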
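
A heuristic sketch of why the ratio of integrals recovers the scale parameter may be helpful. It assumes that $Z = X'\beta/\sigma$, so that $X'\beta = \sigma Z$, that $E(e^{-U})$ is finite, and that limits and integrals can be interchanged; it is an expansion for small $\Lambda_0(t)$ rather than the formal argument. Starting from

$$G(t \mid z) = 1 - \int \exp\left[-\Lambda_0(t)\, e^{-(\sigma z + u)}\right] dF(u) ,$$

a first-order expansion of the exponential as $t \to 0$ gives

$$G(t \mid z) = \Lambda_0(t)\, e^{-\sigma z}\, E\left(e^{-U}\right) + O\left[\Lambda_0(t)^2\right], \qquad
\frac{\partial G(t \mid z)}{\partial z} = -\sigma\, \Lambda_0(t)\, e^{-\sigma z}\, E\left(e^{-U}\right) + O\left[\Lambda_0(t)^2\right].$$

Multiplying each expression by $p(z)^2$ and integrating over $z$, the common factor $\Lambda_0(t)\, E(e^{-U}) \int e^{-\sigma z} p(z)^2\, dz$ cancels in the ratio, so that

$$-\frac{\int [\partial G(t \mid z)/\partial z]\, p(z)^2\, dz}{\int G(t \mid z)\, p(z)^2\, dz} = \sigma + O\left[\Lambda_0(t)\right] \quad \text{as } t \to 0 .$$

The remainder term is of order $\Lambda_0(t)$, which is why an estimator evaluated at a single small $t$ carries a bias that has to be controlled.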
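To estimate $\sigma$, let $p_n$, $G_{nz}$, and $G_n$ be kernel
estimators of $p$, $G_z$, and $G$, respectively, that are based on
a simple random sample of $(T, X)$. Define
$$\sigma_n(t) = -\frac{\int G_{nz}(t \mid z)\, p_n(z)^2\, dz}{\int G_n(t \mid z)\, p_n(z)^2\, dz} .$$
Let $c$, $d$, and $\delta$ be constants satisfying
$0 < c < \infty$,
$1/5 < d < 1/4$, and
$\delta < 1$, with $\delta$ also bounded from below as specified in
Horowitz (1999). Let
$\{t_{n1}\}$ and $\{t_{n2}\}$
be sequences of positive numbers such that
$\Lambda_0(t_{n1}) = c\, n^{-d}$ and
$\Lambda_0(t_{n2}) = c\, n^{-\delta d}$. Then $\sigma$
is estimated consistently by
$$\sigma_n = \frac{\sigma_n(t_{n1}) - n^{-d(1-\delta)}\, \sigma_n(t_{n2})}{1 - n^{-d(1-\delta)}} .$$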
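
The role of the two evaluation points can be illustrated at the population level, without any kernel estimation. The sketch below is an assumption-laden toy calculation, not the estimator itself: it takes $Z \sim N(0,1)$, a two-point distribution for $U$, a true scale of one, and illustrative values of `n`, `c`, `d`, and `delta` that have not been checked against the formal conditions. It computes $\sigma(t) = -\int G_z p^2 / \int G p^2$ exactly as a function of $\Lambda_0(t)$ and then forms the combination $[\sigma(t_{n1}) - n^{-d(1-\delta)} \sigma(t_{n2})] / [1 - n^{-d(1-\delta)}]$, which cancels the bias term that is linear in $\Lambda_0(t)$.

```python
import numpy as np

sigma = 1.0                                  # true scale parameter (assumed)
u_vals = np.array([-0.5, 0.5])               # assumed two-point distribution of U
u_prob = np.array([0.5, 0.5])

z = np.linspace(-8.0, 8.0, 8001)
dz = z[1] - z[0]
p = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)   # assumed N(0, 1) density of Z

def sigma_of(Lam):
    """Population sigma(t), written as a function of Lam = Lambda_0(t)."""
    a = np.exp(-(sigma * z[:, None] + u_vals[None, :]))      # e^{-(sigma z + u)}
    G = 1.0 - (np.exp(-Lam * a) * u_prob).sum(axis=1)         # CDF of T given Z = z
    Gz = -sigma * (Lam * a * np.exp(-Lam * a) * u_prob).sum(axis=1)  # dG/dz
    # Riemann-sum approximations of the two integrals against p(z)^2.
    return -np.sum(Gz * p ** 2) * dz / (np.sum(G * p ** 2) * dz)

n, c, d, delta = 10_000, 0.1, 0.22, 0.8      # illustrative values only
Lam1, Lam2 = c * n ** (-d), c * n ** (-delta * d)
w = n ** (-d * (1.0 - delta))                # equals Lam1 / Lam2

s1, s2 = sigma_of(Lam1), sigma_of(Lam2)
sigma_extrap = (s1 - w * s2) / (1.0 - w)     # first-order bias cancels
print(s1, s2, sigma_extrap)
```

In an application $\Lambda_0$ is unknown, so the evaluation points cannot be set this way; the code works on the $\Lambda_0$ scale only to make the bias cancellation visible.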
Horowitz (1999) gives conditions under which
$n^{(1-d)/2}(\sigma_n - \sigma)$
is asymptotically normally distributed with a mean of
zero. By choosing $d$ to be close to $1/5$, the rate of convergence in
probability of $\sigma_n$ to $\sigma$ can be made arbitrarily close
to $n^{-2/5}$, which is the fastest possible rate (Ishwaran 1996). It
follows from an application of the delta method that the estimators of
$\beta$, $\Lambda_0$, and $\lambda_0$ that are given by
$\beta_n = \sigma_n \alpha_n$,
$\Lambda_{n0} = \exp(\sigma_n H_n)$, and
$\lambda_{n0} = d\Lambda_{n0}/dt$ are also asymptotically
normally distributed with means of zero and $n^{-(1-d)/2}$ rates of
convergence. The probability density function of $U$ can be estimated
consistently by solving the deconvolution problem
$W_n = U + \varepsilon$, where
$W_n = \log \Lambda_{n0}(T) - X'\beta_n$. Because the
distribution of $\varepsilon$ is "supersmooth," the resulting
rate of convergence of the estimator of the density of $U$ is
$(\log n)^{-m}$, where $m$ is the number of times that the density is
differentiable. This is the fastest possible rate. Horowitz (1999)
also shows how to obtain data-based values for $t_{n1}$ and $t_{n2}$
and extends the estimation method to models with censoring.
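
Once a scale estimate is available, the remaining quantities follow by plug-in. The sketch below only illustrates the algebra of that step. The first-stage outputs are stand-ins built from an assumed $\Lambda_0(t) = t^2$ and an assumed scale of 2, not real estimates, and the names `alpha_n`, `H_n`, and `sigma_n` are hypothetical placeholders for the objects described above. The baseline hazard is obtained by numerical differentiation with `numpy.gradient`.

```python
import numpy as np

# Hypothetical first-stage output: alpha_n stands in for an estimate of
# beta / sigma (first component has absolute value one), and H_n stands in
# for an estimate of H = log(Lambda_0) / sigma on a grid of durations.
t_grid = np.linspace(0.1, 3.0, 291)
sigma_true = 2.0
alpha_n = np.array([1.0, -0.25])
H_n = np.log(t_grid ** 2) / sigma_true

sigma_n = 2.0                                 # stands in for the scale estimate

beta_n = sigma_n * alpha_n                    # coefficient vector
Lambda_n0 = np.exp(sigma_n * H_n)             # integrated baseline hazard
lambda_n0 = np.gradient(Lambda_n0, t_grid)    # baseline hazard by differentiation

print(beta_n, Lambda_n0[:3], lambda_n0[:3])
```

With noisy first-stage estimates, direct numerical differentiation would amplify noise, so some smoothing of the integrated hazard would normally precede this step.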
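If panel data on $(T, X)$ are available, then $\Lambda_0$ can be
estimated with a $n^{-1/2}$ rate of convergence, and the assumption of
independence of $U$ from $X$ can be dropped. Suppose that each
individual in a random sample of individuals is observed for exactly
two spells. Let $(T_j, X_j : j = 1, 2)$ denote the values of $(T, X)$ in
the two spells. Define $Z_j = X_j'\beta$. Then the joint survivor
function of $T_1$ and $T_2$ conditional on $Z_1 = z_1$ and $Z_2 = z_2$ is
$$S(t_1, t_2 \mid z_1, z_2) \equiv P(T_1 > t_1, T_2 > t_2 \mid Z_1 = z_1, Z_2 = z_2)
= \int \exp\left[-\Lambda_0(t_1)\, e^{-(z_1 + u)} - \Lambda_0(t_2)\, e^{-(z_2 + u)}\right] dP(u \mid Z_1 = z_1, Z_2 = z_2) .$$
Honoré (1993) showed that
$$R(t_1, t_2 \mid z_1, z_2) \equiv \frac{\partial S(t_1, t_2 \mid z_1, z_2)/\partial t_1}{\partial S(t_1, t_2 \mid z_1, z_2)/\partial t_2}
= \frac{\lambda_0(t_1)}{\lambda_0(t_2)} \exp(z_2 - z_1) .$$
Adopt the scale normalization
$$\int_{S_T} \frac{w_t(\tau)}{\lambda_0(\tau)}\, d\tau = 1 ,$$
where $w_t$ is a non-negative weight function and $S_T$ is its
support. Then
$$\lambda_0(t) = \int_{S_T} w_t(\tau) \exp(z_1 - z_2)\, R(t, \tau \mid z_1, z_2)\, d\tau .$$
Now for a weight function $\omega_z$ with support $S_Z$ that
integrates to one, define
$$w(\tau, z_1, z_2) = w_t(\tau)\, \omega_z(z_1)\, \omega_z(z_2) .$$
Then,
$$\lambda_0(t) = \int_{S_T} d\tau \int_{S_Z} dz_1 \int_{S_Z} dz_2\; w(\tau, z_1, z_2) \exp(z_1 - z_2)\, R(t, \tau \mid z_1, z_2) . \tag{10.19}$$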
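
The identity that underlies the panel-data estimator can be verified numerically for a single pair $(z_1, z_2)$. The sketch below assumes the sign convention in which the hazard is $\lambda_0(t) e^{-(z + u)}$, so that the ratio of partial derivatives of the joint survivor function equals $[\lambda_0(t_1)/\lambda_0(t_2)] \exp(z_2 - z_1)$. It further assumes $\Lambda_0(t) = t^{1.5}$, a two-point heterogeneity distribution, and a weight function that is constant on $[0.5, 1.5]$. Because this is a check rather than an estimator, the weight is scaled using the true $\lambda_0$ so that $\int w_t/\lambda_0 = 1$; all names are hypothetical.

```python
import numpy as np

u_vals = np.array([-0.5, 0.5])     # assumed two-point heterogeneity distribution
u_prob = np.array([0.3, 0.7])
z1, z2 = 0.4, -0.2                 # fixed index values for the two spells

def Lambda0(t):
    return t ** 1.5                # assumed integrated baseline hazard

def lambda0(t):
    return 1.5 * t ** 0.5          # its derivative, the baseline hazard

def S(t1, t2):
    """Joint survivor function given Z1 = z1, Z2 = z2."""
    expo = (-Lambda0(t1) * np.exp(-(z1 + u_vals))
            - Lambda0(t2) * np.exp(-(z2 + u_vals)))
    return np.sum(u_prob * np.exp(expo))

def R(t1, t2, h=1e-5):
    """Ratio of partial derivatives of S, by central differences."""
    d1 = (S(t1 + h, t2) - S(t1 - h, t2)) / (2 * h)
    d2 = (S(t1, t2 + h) - S(t1, t2 - h)) / (2 * h)
    return d1 / d2

# Weight function: constant kappa on [a, b], scaled so that int w_t / lambda0 = 1.
a, b = 0.5, 1.5
tau = np.linspace(a, b, 2001)
dtau = tau[1] - tau[0]

def integrate(y):
    return np.sum((y[1:] + y[:-1]) / 2.0) * dtau    # trapezoid rule

kappa = 1.0 / integrate(1.0 / lambda0(tau))

t = 1.0
integrand = np.array([kappa * np.exp(z1 - z2) * R(t, s) for s in tau])
lambda0_rec = integrate(integrand)
print(lambda0_rec, lambda0(t))     # recovered vs. true baseline hazard at t
```

The heterogeneity distribution drops out of `R` entirely, which is why independence of $U$ and $X$ is not needed in the panel setting.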
The baseline hazard function can now be estimated by replacing
$R$ with an estimator, $R_n$, in (10.19). This can be done by
replacing $Z_j$ with $X_j'b_n$, where $b_n$ is a consistent estimator
of $\beta$ such as a marginal likelihood estimator (Chamberlain 1985,
Kalbfleisch and Prentice 1980, Lancaster 2000, Ridder and Tunali
1999), and replacing $S$ with a kernel estimator of the joint survivor
function conditional on $X_1'b_n = z_1$ and $X_2'b_n = z_2$. The
resulting estimator of $\lambda_0$ is
$$\lambda_{n0}(t) = \int_{S_T} d\tau \int_{S_Z} dz_1 \int_{S_Z} dz_2\; w(\tau, z_1, z_2) \exp(z_1 - z_2)\, R_n(t, \tau \mid z_1, z_2) .$$
The integrated baseline hazard function is estimated by
$$\Lambda_{n0}(t) = \int_0^t \lambda_{n0}(\tau)\, d\tau .$$
Horowitz and Lee (2004) give conditions under which
$n^{1/2}(\Lambda_{n0} - \Lambda_0)$ converges weakly to a tight,
mean-zero Gaussian process. The estimated baseline hazard function
$\lambda_{n0}$ converges at the rate $n^{-q/(2q+1)}$, where $q$ is the
number of times that $\lambda_0$ is continuously differentiable.
Horowitz and Lee (2004) also show how to estimate a censored version
of the model.