10.3 The Proportional Hazards Model with Unobserved Heterogeneity
Let $T$ denote a duration such as that of a spell of employment or
unemployment. Let $F(t\vert x)=P(T\le t\vert X=x)$, where $X$ is a
vector of covariates. Let $f(t\vert x)$ denote the corresponding
conditional probability density function. The conditional hazard
function is defined as
$\displaystyle \lambda (t\vert x)=\frac{f(t\vert x)}{1-F(t\vert x)}\;.$
This section is concerned with an approach to modeling
$\lambda (t\vert x)$ that is based on the proportional hazards model
of Cox (1972).
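As a quick numerical illustration of the hazard as the density divided by the survivor function $1-F$, the following Python sketch evaluates that ratio directly. The Weibull distribution with shape `k` and unit scale is an assumption made only because its hazard has the closed form $k t^{k-1}$; it does not come from the text, and all function names are hypothetical.

```python
import math

def hazard(pdf, cdf, t):
    """Hazard at t: density divided by the survivor function 1 - F(t)."""
    survivor = 1.0 - cdf(t)
    if survivor <= 0.0:
        raise ValueError("survivor function is zero at t; hazard undefined")
    return pdf(t) / survivor

# Assumed example distribution: Weibull with shape k and unit scale.
def weibull_pdf(t, k=1.5):
    return k * t ** (k - 1) * math.exp(-t ** k)

def weibull_cdf(t, k=1.5):
    return 1.0 - math.exp(-t ** k)

def weibull_hazard_closed_form(t, k=1.5):
    # Known closed form of the Weibull hazard, used only as a cross-check.
    return k * t ** (k - 1)

for t in (0.5, 1.0, 2.0):
    print(t, hazard(weibull_pdf, weibull_cdf, t), weibull_hazard_closed_form(t))
```

The two printed columns should agree up to rounding error, which is all the sketch is meant to show.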
The proportional hazards model is widely used for the analysis of
duration data. Its form is
$\displaystyle \lambda (t\vert x)=\lambda _0 (t)\mathrm{e}^{-{x}'\beta }\;,$ (10.16)
where $\beta$ is a vector of constant parameters that is conformable
with $X$ and $\lambda _0$ is a non-negative function that is called
the baseline hazard function. The essential characteristic of
(10.16) that distinguishes it from other models is that
$\lambda (t\vert x)$ is the product of a function of $t$ alone and
a function of $x$ alone. Cox (1972) developed a partial likelihood
estimator of $\beta$ and a nonparametric estimator of $\lambda _0$.
Tsiatis (1981) derived the asymptotic properties of these
estimators.
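The partial likelihood idea can be illustrated with a minimal Python sketch. It is not Cox's or any library's implementation: it assumes a scalar covariate, no tied durations, simulated data without censoring, and a constant baseline hazard equal to one. The golden-section search is a stand-in optimizer chosen for brevity, and all function names are hypothetical. The sign convention follows (10.16), so each risk-set weight is $\mathrm{e}^{-x\beta}$.

```python
import math
import random

def log_partial_likelihood(beta, times, xs, events):
    """Cox log partial likelihood for a scalar covariate under the sign
    convention hazard = baseline(t) * exp(-x * beta). Assumes no ties."""
    order = sorted(range(len(times)), key=lambda i: times[i], reverse=True)
    risk_sum = 0.0   # running sum of exp(-x_j * beta) over the risk set
    total = 0.0
    for i in order:  # longest duration first, so the risk set only grows
        risk_sum += math.exp(-xs[i] * beta)
        if events[i]:
            total += -xs[i] * beta - math.log(risk_sum)
    return total

def maximize_concave(f, lo=-5.0, hi=5.0, tol=1e-6):
    """Golden-section search for the maximizer of a concave f on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:   # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = f(d)
        else:         # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = f(c)
    return 0.5 * (a + b)

random.seed(0)
beta_true = 0.7   # assumed value for the simulation
n = 400
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
# With baseline hazard 1, T given x is exponential with rate exp(-x * beta).
times = [random.expovariate(math.exp(-x * beta_true)) for x in xs]
events = [True] * n   # no censoring in this illustration
beta_hat = maximize_concave(
    lambda b: log_partial_likelihood(b, times, xs, events))
print(beta_hat)
```

Note that the baseline hazard never enters `log_partial_likelihood`; only the ordering of the durations matters, which is what allows $\beta$ to be estimated without specifying $\lambda _0$.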
In the proportional hazards model with unobserved heterogeneity, the
hazard function is conditioned on the covariates $X$ and an unobserved
random variable $U$ that is assumed to be independent of $X$. The form
of the model is
$\displaystyle \lambda (t\vert x,u)=\lambda _0 (t)\mathrm{e}^{-({\beta }'x+u)}\;,$ (10.17)
where $\lambda (\cdot \vert x,u)$ is the hazard conditional on $X=x$
and $U=u$. In a model of the duration of employment, $U$
might represent unobserved attributes of an individual (possibly
ability) that affect employment duration. A variety of estimators
of $\lambda _0$ and $\beta$ have been proposed under the assumption
that $\lambda _0$ or the distribution of $U$ or both are known up
to a finite-dimensional parameter. See, for example, Lancaster
(1979), Heckman and Singer (1984a), Meyer (1990), Nielsen
et al. (1992), and Murphy (1994, 1995).
However, $\lambda _0$ and the
distribution of $U$ are nonparametrically identified (Elbers and
Ridder 1982, Heckman and Singer 1984b), which suggests that they
can be estimated nonparametrically.
Horowitz (1999) describes a nonparametric estimator of $\lambda _0$
and the density of $U$ in model (10.17). His estimator is
based on expressing (10.17) as a type of transformation
model. To do this, define the integrated baseline hazard function,
$\Lambda _0$, by
$\displaystyle \Lambda _0 (t)=\int_0^t {\lambda _0 (\tau )\,\mathrm{d}\tau }\;.$
Then it is not difficult to show that (10.17) is equivalent
to the transformation model
$\displaystyle \log \Lambda _0 (T)={X}'\beta +U+\varepsilon \;,$ (10.18)
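where $\varepsilon$ is a random variable that is independent of $X$
and $U$ and has the CDF
$F_\varepsilon (y)=1-\exp (-\mathrm{e}^y)$. Now define
$\sigma =\vert \beta _1 \vert$, where $\beta _1$ is the first
component of $\beta$ and is assumed to be non-zero. Then
$\beta /\sigma$ and $H=\sigma ^{-1}\log \Lambda _0$ can be estimated by
using the methods of Sect. 10.2.4. Denote the
resulting estimators of $\beta /\sigma$ and $H$ by $\alpha _n$
and $H_n$. If $\sigma$ were known, then $\beta$ and $\Lambda _0$ could
be estimated by $b_n =\sigma \alpha _n$ and
$\Lambda _{n0} =\exp (\sigma H_n )$. The baseline hazard function
$\lambda _0$ could be estimated by differentiating $\Lambda _{n0}$.
Thus, it is necessary only to find an estimator of the scale
parameter $\sigma$.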
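The equivalence between the hazard model with unobserved heterogeneity and the transformation model (10.18) can be checked by simulation. The sketch below is only an illustration under stated assumptions: a scalar covariate, a Weibull integrated baseline hazard $\Lambda _0 (t)=t^k$, and normal distributions for the covariate and the heterogeneity term, none of which come from the text. Durations are drawn by inverting the survivor function $\exp [-\Lambda _0 (t)\mathrm{e}^{-(\beta x+u)}]$, and the residual $\log \Lambda _0 (T)-x\beta -u$ is compared with the CDF $1-\exp (-\mathrm{e}^y)$ of the logarithm of a unit exponential variable.

```python
import math
import random

def simulate_duration(x, u, beta, k, rng):
    """Draw T with hazard lambda_0(t) * exp(-(beta*x + u)) for the assumed
    Weibull baseline Lambda_0(t) = t**k, by inverting the survivor function
    S(t | x, u) = exp(-Lambda_0(t) * exp(-(beta*x + u)))."""
    minus_log_v = rng.expovariate(1.0)                  # -log V, V uniform
    big_lambda = minus_log_v * math.exp(beta * x + u)   # value of Lambda_0(T)
    return big_lambda ** (1.0 / k)

def F_eps(y):
    """CDF of the log of a unit exponential random variable."""
    return 1.0 - math.exp(-math.exp(y))

rng = random.Random(1)
beta, k, n = 0.7, 1.5, 20000   # assumed values for the illustration
residuals = []
for _ in range(n):
    x = rng.gauss(0.0, 1.0)    # assumed covariate distribution
    u = rng.gauss(0.0, 0.5)    # assumed heterogeneity distribution
    t = simulate_duration(x, u, beta, k, rng)
    # log Lambda_0(T) - x*beta - u, with log Lambda_0(t) = k * log(t)
    residuals.append(k * math.log(t) - beta * x - u)

for y in (-2.0, -1.0, 0.0, 1.0):
    empirical = sum(r <= y for r in residuals) / n
    print(y, round(empirical, 3), round(F_eps(y), 3))
```

The empirical and theoretical columns should be close for every `y`, whatever distribution is assumed for the heterogeneity term, because the residual depends only on the unit exponential draw.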
To do this, define
, and let
denote
the CDF of
conditional on
. It can be shown that
where
is the CDF of
. Let
denote the probability density
function of
. Define
and
Then it can be shown using l'Hospital's rule that if
for all
, then
To estimate
, let
,
and
be kernel
estimators of
,
and
, respectively, that are based on
a simple random sample of
. Define
Let
,
, and
be constants satisfying
,
, and
. Let
and
be sequences of positive numbers such that
and
. Then
is estimated consistently by
Horowitz (1999) gives conditions under which
is asymptotically normally distributed with a mean of
zero. By choosing
to be close to
, the rate of convergence in
probability of
to
can be made arbitrarily close
to
, which is the fastest possible rate (Ishwaran 1996). It
follows from an application of the delta method that the estimators of
,
, and
that are given by
,
, and
are also asymptotically
normally distributed with means of zero and
rates of
convergence. The probability density function of $U$ can be estimated
consistently by solving the deconvolution problem
, where
. Because the
distribution of $\varepsilon$ is "supersmooth," the resulting
rate of convergence of the estimator of the density of $U$
is
, where
is the number of times that the density is
differentiable. This is the fastest possible rate. Horowitz (1999)
also shows how to obtain data-based values for
and
and extends the estimation method to models with censoring.
If panel data on $(T,X)$ are available, then $\Lambda _0$ can be
estimated with a $n^{-1/2}$
rate of convergence, and the assumption of
independence of $U$ from $X$ can be dropped. Suppose that each
individual in a random sample of individuals is observed for exactly
two spells. Let
denote the values of
in
the two spells. Define
. Then the joint survivor
function of
and
conditional on
and
is
Honoré (1993) showed that
Adopt the scale normalization
where
is a non-negative weight function and
is its
support. Then
Now for a weight function
with support
, define
Then,
![$\displaystyle \lambda _0 (t)=\int\limits_{S_T } {{\text{d}}\tau } \int\limits_{...
...,z_2 \right)\exp \left(z_2 -z_1 \right)R\left(t,\tau \vert z_1 ,z_2 \right)}\;.$](img6055.gif) |
(10.19) |
The baseline hazard function can now be estimated by replacing
with an estimator,
, in (10.19). This can be done by
replacing
with
, where
is a consistent estimator
of
such as a marginal likelihood estimator (Chamberlain 1985,
Kalbfleisch and Prentice 1980, Lancaster 2000, Ridder and Tunali
1999), and replacing
with a kernel estimator of the joint survivor
function conditional on
and
. The
resulting estimator of
is
The integrated baseline hazard function is estimated by
Horowitz and Lee (2004) give conditions under which
converges weakly to a tight, mean-zero Gaussian
process. The estimated baseline hazard function
converges at the rate
, where
is the number of
times that
is continuously differentiable. Horowitz and
Lee (2004) also show how to estimate a censored version of the model.