5.3 The Cox Proportional Hazards Model


{ll, ll1, ll2} = hazregll (data, beta)
calculates the value of the partial log-likelihood function and of its first two derivatives
{betahat, betak, ck} = hazbeta (data {,maxit})
estimates the regression coefficients for the Cox proportional hazards model
{bcumhaz, bsurv} = hazcoxb (data)
estimates the cumulative baseline hazard and the baseline survival function
surv = hazsurv (data, z)
estimates the conditional survival function
{ttest, val, df, pval} = haztest (data, index)
performs the likelihood ratio test, Wald's test and the score test

The semiparametric Cox proportional hazards model is the most commonly used model in hazard regression. In this model, the conditional hazard function, given the covariate value $ z$, is assumed to be of the form

$\displaystyle \lambda(t\vert z) = \lambda_{0}(t) \exp\{ \beta^Tz\},$

where $ \beta=(\beta_1, \ldots, \beta_p)^T\,$ is the vector of regression coefficients, and $ \lambda_{0}(t)$ denotes the baseline hazard function. No particular shape is assumed for the baseline hazard; it is estimated nonparametrically. The contributions of covariates to the hazard are multiplicative. An accessible introduction to the Cox model is given, for example, in Klein and Moeschberger (1997).

The quantlib hazreg provides quantlets for estimating the regression coefficients, $ \beta$, standard deviations of these estimates, and estimates for the cumulative baseline hazard and the conditional survival function. Our calculations are based on standard partial likelihood methods in the proportional hazards model. Additionally, we provide three commonly used tests for the hypothesis that one or more of the $ \beta$'s are zero. These tests are useful for model choice procedures.


5.3.1 Estimating the Regression Coefficients

Let us assume that there are no ties between the event times. In this case, the partial likelihood function is given by

$\displaystyle L(\beta)=\prod_{i=1}^{n}\left\{\frac{\exp\left( \beta^Tz_i\right)} {\sum_{j\in R(t_i)}\exp\left( \beta^Tz_j\right)}\right\}^{\delta_i},$ (5.4)

where $ R(t_i) = \{j\! :\, t_j \ge t_i\}$ denotes the risk set at time $ t_i$. Note that only event times contribute their own factor to the partial likelihood. However, both censored and uncensored observations appear in the denominator, where the sum over the risk set includes all individuals who are still at risk immediately prior to $ t_i$.

Let $ \hat \beta$ denote the maximum (partial) likelihood estimate of $ \beta$, obtained by maximizing the partial log-likelihood function, $ l(\beta)=\ln L(\beta)$. From (5.4), it follows immediately that

$\displaystyle l(\beta)=\sum_{i=1}^{n}\delta_i\left( \beta^Tz_{i}\right) -\sum_{i=1}^{n}\delta_i\, \ln \left\{ \sum_{j\in R(t_i)}\exp\left( \beta^Tz_{j}\right)\right\}.$ (5.5)
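
The structure of (5.5) is easy to reproduce outside XploRe. The following Python sketch (purely illustrative, not part of the hazreg quantlib) evaluates the partial log-likelihood for a single covariate ($ p=1$) on a small artificial data set with sorted times and no ties:

```python
import math

def cox_loglik(beta, t, delta, z):
    # Partial log-likelihood (5.5) for a single covariate (p = 1).
    # Assumes t is sorted ascending and has no ties; R(t_i) = {j : t_j >= t_i}.
    n = len(t)
    ll = 0.0
    for i in range(n):
        if delta[i] == 1:
            risk = sum(math.exp(beta * z[j]) for j in range(n) if t[j] >= t[i])
            ll += beta * z[i] - math.log(risk)
    return ll

t     = [1.0, 2.0, 3.0, 4.0]   # observed times (artificial example)
delta = [1, 1, 0, 1]           # censoring indicators
z     = [0.5, -0.5, 1.0, 0.0]  # covariate values
print(round(cox_loglik(0.0, t, delta, z), 4))  # -2.4849
```

At $ \beta=0$ all weights equal one, so each event contributes minus the log of its risk-set size; for this toy data the value is $ -\ln 12 \approx -2.4849$.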

The first derivative of $ l(\beta)$ with respect to $ \beta$ is called the vector of efficient scores, and is given by

$\displaystyle U(\beta)=\frac{dl}{d\beta}= {\mathbf \delta}^TZ - \sum_{i=1}^{n}\delta_i\, \frac{\sum_{j\in R(t_i)}\exp\left( \beta^Tz_{j}\right) Z_{(j,\,\cdot)}}{\sum_{j\in R(t_i)}\exp\left( \beta^Tz_{j}\right)},$ (5.6)

where $ \delta = (\delta_1, \ldots, \delta_n)^T$ denotes the vector of censoring indicators, and $ Z$ is the $ (n\times p)$-matrix of covariate values, with the $ j$-th row containing the covariate values of the $ j$-th individual, $ Z_{(j,\,\cdot)} = z_j^T$.
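
For a single covariate, (5.6) says that each event contributes $ z_i$ minus the $ \exp(\beta z)$-weighted mean of $ z$ over its risk set. A Python sketch (illustrative only, same artificial data as above):

```python
import math

def cox_score(beta, t, delta, z):
    # Efficient score U(beta) from (5.6) for a single covariate: each event
    # contributes z_i minus the exp(beta*z)-weighted mean of z over R(t_i).
    n = len(t)
    u = 0.0
    for i in range(n):
        if delta[i] == 1:
            w = [math.exp(beta * z[j]) if t[j] >= t[i] else 0.0
                 for j in range(n)]
            u += z[i] - sum(wj * zj for wj, zj in zip(w, z)) / sum(w)
    return u

t     = [1.0, 2.0, 3.0, 4.0]
delta = [1, 1, 0, 1]
z     = [0.5, -0.5, 1.0, 0.0]
print(round(cox_score(0.0, t, delta, z), 4))  # -0.4167
```

At $ \beta=0$ the weighted means reduce to ordinary risk-set averages, giving $ U(0) = -5/12$ here.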

For the case of ties in the event times, there are several ways to define a partial likelihood function. Currently, we use formula (5.4) for data both with and without ties. Each event time contributes one factor to the likelihood function; for tied events, all events in the tie appear with the same denominator. This approach was suggested by Breslow (1974), and is implemented in most statistical packages. When there are few ties, this approximation to the partial likelihood works rather well; see Klein and Moeschberger (1997), p. 238.

The information matrix $ \mathbf{I}(\beta)$ is given by the negative of the second derivative of $ l(\beta)$. Let  $ \mathbf{1}_{R(i)} \in I\!\!R^{n}$ denote the indicator vector of the risk set $ R(t_{i})$; that is, the $ j$-th element of $ \mathbf{1}_{R(i)}$ is $ 1$ when $ t_{j} \ge t_{i}$, and $ 0$ otherwise. Then, the information matrix takes the form

$\displaystyle \mathbf{I}(\beta) = - \frac{d^{2}l}{d\beta^{2}} = \sum_{i=1}^{n} \,\frac{\delta_i}{ w_i(\beta)^{2}} \ \bar Z(i)^T \left[ w_i(\beta)\, Diag\left\{\exp(Z\beta) \right\} - \exp(Z\beta) \exp(Z\beta)^T \right] \bar Z(i),$ (5.7)

where the $ w_i(\beta) = {\mathbf 1}_{R(i)}^T\exp\left(Z\beta\right)$ are scalars; for any vector $ \nu$, $ Diag\{\nu\}$ denotes the diagonal matrix with the main diagonal $ \nu$, and $ \exp(\nu)$ is defined elementwise; and $ \bar Z(i) = Diag\left\{{\mathbf 1}_{R(i)}\right\}Z$. The matrices $ \bar Z(i)$ are modifications of the design matrix $ Z$, setting the rows of $ \bar Z(i)$ to zero when the corresponding observation is not in the risk set for time $ t_{i}$. Note that the index $ i$ runs through all $ n$ observations. When ties are present, each of the tied event times appears once, with the same risk set, and contributes the same term to the information matrix.
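
For a single covariate, (5.7) reduces to a sum of weighted variances of $ z$ over the risk sets. A Python sketch (illustrative only, same artificial data as before):

```python
import math

def cox_info(beta, t, delta, z):
    # Information I(beta) from (5.7) for a single covariate: each event
    # contributes the exp(beta*z)-weighted variance of z over its risk set.
    n = len(t)
    info = 0.0
    for i in range(n):
        if delta[i] == 1:
            w = [math.exp(beta * z[j]) if t[j] >= t[i] else 0.0
                 for j in range(n)]
            wsum = sum(w)
            m1 = sum(wj * zj for wj, zj in zip(w, z)) / wsum
            m2 = sum(wj * zj * zj for wj, zj in zip(w, z)) / wsum
            info += m2 - m1 * m1
    return info

t     = [1.0, 2.0, 3.0, 4.0]
delta = [1, 1, 0, 1]
z     = [0.5, -0.5, 1.0, 0.0]
print(round(cox_info(0.0, t, delta, z), 4))  # 0.7014
```

Since each summand is a variance, $ \mathbf{I}(\beta)$ is non-negative, reflecting the concavity of the partial log-likelihood.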

For large samples, the maximum likelihood estimate $ \hat \beta$ is known to follow an asymptotic $ p$-variate normal distribution,

$\displaystyle {\mathbf I}(\beta)^{1/2} \left\{\hat\beta - \beta \right\} {\longrightarrow}_{n\rightarrow\infty} \ N(0, I_{p}).$

The inverse of the information matrix, $ {\mathbf I}^{-1}(\hat\beta),\,$ is a consistent estimate of the covariance matrix of $ \hat \beta$. It may be used to construct confidence intervals for the components of $ \beta$.

The quantlet hazregll computes the partial log-likelihood function, its first derivative (efficient scores), and the negative of the second derivative (information matrix). The first and second derivatives of the log-likelihood function (5.5) are later used to obtain $ \hat \beta$, as well as for computing test statistics for local tests on $ \beta$. The syntax of hazregll is given below:

   {ll,ll1,ll2} = hazregll(data,beta)

Input:

data
$ n \times (p+4)$ matrix, the sorted data matrix obtained as output data of the quantlet hazdat ;
beta
$ p \times 1$ vector, the regression coefficient vector $ \beta$.

Output:

ll
scalar, the log-likelihood function at parameter value $ \beta$;
ll1
$ p \times 1$ vector, the first derivatives of the log-likelihood function at parameter value $ \beta$;
ll2
$ p \times p$ matrix, the negative Hessian matrix of the log-likelihood function at parameter value $ \beta$ (information matrix).

Example 5. The simulated data in the file haz01.dat were generated from a proportional hazards model with $ p=2$ covariates, the conditional hazard function $ \lambda(t\vert z) = \exp(\beta^T z) $, and $ \beta = (1, 2)^T$. The baseline hazard is constant, $ \lambda_0(t) = 1$. We use the quantlet hazregll to calculate the partial log-likelihood function, the efficient scores and the information matrix at the true parameter value, $ \beta = (1, 2)^T$.

  library("hazreg")
  dat=read("haz01.dat")  
  t = dat[,1]                        ; observed times                      
  delta = dat[,2]                    ; censoring indicator                       
  z = dat[,3:4]                      ; covariates                         
  {data,ties} = hazdat(t,delta, z)   ; preparing data
  beta = 1|2    
  {ll,ll1,ll2} = hazregll(data,beta)
haz05.xpl

The calculations yield ll = -34.679 for the value of the log-likelihood function, $ \texttt{ll1} = (0.014323, 0.88238)^T$ for the first derivatives, and

$\displaystyle \textrm{\tt ll2} =
\left(\begin{array}{cc}
1.3696 & -0.43704\\
-0.43704 & 0.8285
\end{array}\right)$

for the information matrix.


The quantlet hazbeta calculates the maximum likelihood estimate $ \hat \beta$ by solving the nonlinear equation system $ U(\beta)= 0$, defined in (5.6). We use a Newton-Raphson algorithm with the stopping criterion

$\displaystyle C(\hat\beta_{s})=\frac{\vert\hat\beta_{s}-\hat\beta_{s-1}\vert} {\vert l(\hat\beta_{s-1})\vert};$

the iteration terminates when $ C(\hat\beta_{s})$ falls below a given tolerance, or when the maximum number of iterations is reached.
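
The iteration itself is the standard Newton step $ \hat\beta_{s} = \hat\beta_{s-1} + \mathbf{I}(\hat\beta_{s-1})^{-1} U(\hat\beta_{s-1})$. A self-contained Python sketch for a single covariate on artificial data (the actual hazbeta implementation may differ in details):

```python
import math

def score_info(beta, t, delta, z):
    # Score U(beta) and information I(beta) for a single covariate.
    n = len(t)
    u, info = 0.0, 0.0
    for i in range(n):
        if delta[i] == 1:
            w = [math.exp(beta * z[j]) if t[j] >= t[i] else 0.0
                 for j in range(n)]
            wsum = sum(w)
            m1 = sum(wj * zj for wj, zj in zip(w, z)) / wsum
            m2 = sum(wj * zj * zj for wj, zj in zip(w, z)) / wsum
            u += z[i] - m1
            info += m2 - m1 * m1
    return u, info

def loglik(beta, t, delta, z):
    # Partial log-likelihood (5.5), needed for the stopping criterion.
    n = len(t)
    return sum(beta * z[i]
               - math.log(sum(math.exp(beta * z[j])
                              for j in range(n) if t[j] >= t[i]))
               for i in range(n) if delta[i] == 1)

def newton_raphson(t, delta, z, maxit=40, tol=1e-10):
    # Newton-Raphson iteration beta_s = beta_{s-1} + U/I, stopped when the
    # criterion C from the text falls below tol (or maxit is reached).
    beta = 0.0
    for _ in range(maxit):
        u, info = score_info(beta, t, delta, z)
        beta_new = beta + u / info
        c = abs(beta_new - beta) / abs(loglik(beta, t, delta, z))
        beta = beta_new
        if c < tol:
            break
    return beta

t     = [1.0, 2.0, 3.0, 4.0]   # artificial sorted times
delta = [1, 1, 0, 1]           # censoring indicators
z     = [0.5, -0.5, 1.0, 0.0]  # covariate values
betahat = newton_raphson(t, delta, z)
print(round(betahat, 4))
```

At convergence the score vanishes, which can be used as a check on the result.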

The syntax of hazbeta is given below:

  {betahat, betak, ck} = hazbeta(data {,maxit})

Input:

data
$ n \times (p+4)$ matrix, the sorted data matrix obtained as output data of hazdat ;
maxit
scalar, maximum number of iterations for the Newton-Raphson procedure; default is $ 40$.

Output:

betahat
$ p \times 1$ vector, estimate of the regression parameter $ \beta$;
betak
$ \textrm{\tt maxit} \times p$ matrix, iterated parameter values through the Newton-Raphson procedure;
ck
$ \textrm{\tt maxit} \times 1$ vector, values of the convergence criterion at each iteration of the Newton-Raphson procedure.

Example 6. In this example, we compute $ \hat \beta$ for the data in haz01.dat , and estimate the covariance matrix of $ \hat \beta$ by $ \mathbf{I}^{-1}(\hat\beta)$. We use the quantlets hazbeta and hazregll . The data were generated from a proportional hazards model with $ \beta = (1, 2)^T$. Details are given in Examples 4 and 5.

  library("hazreg")
  dat=read("haz01.dat")  
  t = dat[,1]                      ; observed times                      
  delta = dat[,2]                  ; censoring indicator                    
  z = dat[,3:4]                    ; covariates                
  {data,ties} = hazdat(t,delta, z) ; preparing data
  {betahat,betak,ck} = hazbeta(data) 
  {ll, ll1, ll2} = hazregll(data, betahat)
  sigma = inv(ll2)                 ; covariance matrix estimate
haz06.xpl

The calculation results in $ \texttt{betahat} =(1.4599, 3.3415)^T$, with the estimated covariance matrix

$\displaystyle \textrm{\tt sigma} =
\left(\begin{array}{cc}
1.019 & 0.55392\\
0.55392 & 1.5847
\end{array}\right).$

Both components $ \beta_1=1$ and $ \beta_2=2$ are within their respective asymptotic 95% confidence intervals that may be constructed with betahat and the square root of the diagonal elements of sigma .
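
This check can be reproduced directly from the printed values; a quick Python calculation (the numbers are copied from the output above, and 1.959964 is the 97.5% standard normal quantile):

```python
import math

# 95% confidence intervals from betahat and the diagonal of sigma
# (values copied from the output of Example 6 above).
betahat = [1.4599, 3.3415]
sigma_diag = [1.019, 1.5847]
z975 = 1.959964  # 97.5% quantile of N(0,1)

cis = []
for b, v in zip(betahat, sigma_diag):
    se = math.sqrt(v)  # standard error = sqrt of diagonal element
    cis.append((b - z975 * se, b + z975 * se))
print([(round(lo, 3), round(hi, 3)) for lo, hi in cis])
```

Both intervals indeed cover the true values $ \beta_1=1$ and $ \beta_2=2$.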


5.3.2 Estimating the Hazard and Survival Functions

We estimate the cumulative baseline hazard function, $ \Lambda_{0}(t)= \int_{0}^{t}\lambda_{0}(s)\, ds$, by

$\displaystyle \hat \Lambda_{0}(t) = \sum_{i:\,t_{i}\le t} \ \frac{\delta_{i}} {\sum_{j\in R(t_{i})} \exp ( \hat\beta^Tz_{j} )}\ .$

The estimate $ \hat \Lambda_0$ is a right-continuous step function, with jumps at the event times. The index $ i$ runs through all observations, $ i=1, \ldots, n$. In the case of tied events, each event in a tie contributes its own term to the sum; this term is the same for all events in that tie. The estimate $ \hat \Lambda_0$ can be derived through a profile likelihood approach, see Klein and Moeschberger (1997), pages 260 and 237, and Johansen (1983).
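
A Python sketch of this estimator for a single covariate (illustrative only; note that for $ \hat\beta = 0$ it reduces to the well-known Nelson-Aalen estimator):

```python
import math

def breslow_cumhaz(t, delta, z, betahat):
    # Breslow estimate of the cumulative baseline hazard for a single
    # covariate, evaluated at each observed time (t sorted ascending).
    n = len(t)
    cumhaz, h = [], 0.0
    for i in range(n):
        if delta[i] == 1:
            h += 1.0 / sum(math.exp(betahat * z[j])
                           for j in range(n) if t[j] >= t[i])
        cumhaz.append(h)  # step function: constant between event times
    return cumhaz

t     = [1.0, 2.0, 3.0, 4.0]   # artificial sorted times
delta = [1, 1, 0, 1]           # censoring indicators
z     = [0.5, -0.5, 1.0, 0.0]  # covariate values
ch = breslow_cumhaz(t, delta, z, 0.0)  # betahat = 0 only for illustration
print([round(v, 4) for v in ch])  # [0.25, 0.5833, 0.5833, 1.5833]
```

The censored observation at $ t=3$ adds no jump, and the single subject at risk at $ t=4$ contributes a jump of size one.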

We estimate the baseline survival function, $ S_{0}(t) = \exp\left\{ -\Lambda_{0}(t)\right\}$, by

$\displaystyle \hat S_{0}(t) = \exp\left\{ -\hat \Lambda_{0}(t)\right\}.$

In the Cox proportional hazards model, the survival function $ S(t\vert z)$ of an individual with covariate values $ z$ is given by

$\displaystyle S(t\vert z) = S_{0}(t)^{\exp\left(\beta^Tz\right) }.$ (5.8)

Consequently, we estimate the conditional survival function by substituting estimates for $ S_0(t)$ and $ \beta$,

$\displaystyle \hat S(t\vert z) = \exp\left\{ -\hat \Lambda_{0}(t) \right\}^{\exp\left( \hat \beta^Tz\right) }.$ (5.9)

Note that the estimates $ \hat\Lambda_0(t)$, $ \hat S_{0}(t)$ and $ \hat S(t\vert z)$ are all step functions, with jumps at the event times. All three estimates are non-negative, $ \hat\Lambda_0(t)$ is monotonically increasing, and the survival function estimates are monotonically decreasing.
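
Formula (5.9) itself is a one-liner once $ \hat\Lambda_0(t)$ and $ \hat\beta$ are available; in Python, with hypothetical input values for illustration:

```python
import math

def cond_survival(cumhaz0, betahat, z0):
    # Formula (5.9) with a single covariate: the estimated conditional
    # survival is the baseline survival raised to the power exp(betahat*z0).
    return math.exp(-cumhaz0) ** math.exp(betahat * z0)

# Hypothetical values: Lambda0hat(t) = 0.25 at some time t, betahat = 0.5.
print(cond_survival(0.25, 0.5, 0.0))  # z = 0 recovers the baseline survival
print(cond_survival(0.25, 0.5, 1.0) < cond_survival(0.25, 0.5, 0.0))  # True
```

With a positive coefficient, a larger covariate value increases the hazard and therefore lowers the survival curve, as the second line illustrates.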

The quantlet hazcoxb provides the estimates $ \hat\Lambda_0(t)$ and $ \hat S_{0}(t)$. It has the following syntax:

  {bcumhaz, bsurv} = hazcoxb(data)

Input:

data
$ n \times (p+4)$ matrix, the sorted data matrix given by the output data of hazdat .
Output:

bcumhaz
$ n \times 2$ matrix, with rows $ \left(t_{i}, \hat\Lambda_{0}(t_{i})\right)$, sorted in the same order as the $ t_i$ in data;
bsurv
$ n \times 2$ matrix, with rows $ \left(t_{i}, \hat S_{0}(t_{i})\right)$, sorted in the same order as the $ t_i$ in data.

Example 7. In this example, we calculate and plot estimates of the cumulative baseline hazard $ \Lambda_0(t)$ and of the corresponding survival function $ S_0(t)$, for the simulated data in haz01.dat . The estimates are calculated using the quantlet hazcoxb . Plotting of the step functions is supported by steps4plot . The resulting plots are displayed in Figures 5.2 and 5.3. The data in haz01.dat were generated from a proportional hazards model with $ \Lambda _0(t) =t$ and $ S_0(t) = \exp(-t)$; details are given in Examples 4 and 5.

  library("hazreg") 
  dat=read("haz01.dat")  
  t = dat[,1]                            ; observed times                      
  delta = dat[,2]                        ; censoring indicator                       
  z = dat[,3:4]                          ; covariates     
  {data,ties} = hazdat(t,delta, z)       ; preparing data
  {bcumhaz,bsurv} = hazcoxb(data)        ; compute estimates

  setsize(600,400)                       ; initiating graph
  plot1=createdisplay(1,1)               ; initiating graph
  plot2=createdisplay(1,1)
  n = rows(data)                         ; sample size
  pm = (#(1,n+2)'+ (0:n))|(#(2*n+2,3*n+3)'+ (0:n))
                                 ; points to be connected
  cn = matrix(2*n+2)             ; color_num, controls colors
  ar = matrix(2*n+2)             ; art, controls line types
  th = matrix(2*n+2)             ; thick, controls line thickness 

  bsurvline = steps4plot(bsurv, 0~1)     
                                 ; points for step function plot
  setmaskl(bsurvline, pm, cn, ar, th)    ; lines connected
  setmaskp(bsurvline, 4, 0, 8)           ; points controlled

  bcumhazline = steps4plot(bcumhaz, 0~0) 
                                 ; points for step function plot
  setmaskl(bcumhazline, pm, cn, ar, th)
  setmaskp(bcumhazline, 4, 0, 8)

  show(plot1, 1, 1, bcumhazline)      	; plot baseline hazard
  setgopt(plot1, 1, 1, "title","Cumulative Baseline Hazard")
  setgopt(plot1, 1, 1, "xlabel","Time")
  setgopt(plot1, 1, 1, "ylabel","Cumulative Hazard")
  setgopt(plot1, 1, 1, "ymajor", 0.5)
  print (plot1,"hazbcumhaztest.ps")   
          
  show(plot2,1, 1, bsurvline)            ; plot baseline survival
  setgopt(plot2, 1, 1, "title","Baseline Survival Function")
  setgopt(plot2, 1, 1, "xlabel","Time")
  setgopt(plot2, 1, 1, "ylabel","Survival  Function")
  setgopt(plot2, 1, 1,  "ymajor", 0.2, "ylim", (0|1.01))
  print (plot2,"hazbsurvtest.ps")
haz07.xpl

Figure 5.2: Estimate of the cumulative baseline hazard in the proportional hazards model. Data were generated in a model with $ \Lambda _0(t) =t$.
\includegraphics[scale=0.55]{hazbcumhaztest}

Figure 5.3: Estimate of the baseline survival function in the proportional hazards model. Data were generated in a model with $ S_0(t) = \exp(-t)$.
\includegraphics[scale=0.55]{hazbsurvtest}

The quantlet hazsurv provides an estimate of the conditional survival function, $ \hat S(t\vert z)$, as defined in formula (5.9). It has the following syntax:

   surv = hazsurv(data,z)

Input:

data
$ n \times (p+4)$ matrix, the sorted data matrix given by the output data of hazdat ;
z
$ p \times 1$ vector, the covariate values.
Output:
surv
$ n \times 2$ matrix; the first column contains the sorted $ t_i$, the second column the estimated conditional survival function $ \hat S(t_i\vert\textrm{\tt z})$.

Example 8. We calculate and plot the estimate $ \hat S(t\vert z)$ of the conditional survival function for $ z=(0.1, -0.3)^T$, using the simulated data in haz01.dat . The resulting graph is displayed in Figure 5.4.

  library("hazreg") 
  dat=read("haz01.dat")  
  t = dat[,1]                      ; observed times                      
  delta = dat[,2]                  ; censoring indicator                       
  z = dat[,3:4]                    ; covariates                 
  {data,ties} = hazdat(t,delta, z) ; preparing data
  z1 = 0.1|-0.3                    ; covariate values
  surv = hazsurv(data, z1)            
                     ; estimate conditional survival function
 
  setsize(600, 400)                ; initiating graph 
  plot1=createdisplay(1,1)         ; initiating graph 
  n = rows(data)
  pm = (#(1,n+2)'+ (0:n))|(#(2*n+2,3*n+3)'+ (0:n))
                                   ; points to be connected
  cn = matrix(2*n+2)               ; color_num, controls colors
  ar = matrix(2*n+2)               ; art, controls line types
  th = matrix(2*n+2)              
                     ; thick, controls line thickness

  survline = steps4plot(surv, 0~1)    
                     ; points for step function plot
  setmaskl(survline, pm, cn , ar, th) ; lines connected
  setmaskp(survline, 4, 0, 8)         ; points controlled
  setsize(600,400) 
 
  show(plot1, 1, 1, survline)               
  setgopt(plot1, 1, 1, "title","Conditional Survival Function")
  setgopt(plot1, 1, 1, "xlabel","Time")
  setgopt(plot1, 1, 1, "ylabel","Survival Function")
  setgopt(plot1, 1, 1, "ylim", (0|1.01), "ymajor", 0.2)
  print (plot1,"hazsurvtest.ps")
haz08.xpl

Figure 5.4: Estimated conditional survival function based on the data in haz01.dat , for $ z=(0.1, -0.3)^T$.
\includegraphics[scale=0.55]{hazsurvtest}


5.3.3 Hypothesis Testing

The quantlib hazreg offers three tests for hypotheses about regression parameters: the likelihood ratio test, Wald's test and the score test. Assume that $ \beta = (\beta_{1}^T, \beta_{2}^T)^T$, where the $ q$-dimensional subvector $ \beta_{1}$ consists of the regression coefficients of interest, and the $ (p-q)$-dimensional subvector $ \beta_{2}$ contains the remaining parameters. We are testing the hypotheses $ H_{0}\!\!:\, \beta_{1}=\mathbf{0} \ $ versus $ H_{a}\!\!:\, \beta_{1}\neq \mathbf{0}$, in the presence of the remaining $ (p-q)$ unknown regression coefficients; $ \mathbf{0}$ denotes the $ q$-dimensional zero vector. This type of test is often used in model choice procedures, to check whether a given model can be improved by including certain additional covariates or covariate combinations.


5.3.3.1 Likelihood Ratio Test

The test statistic for the likelihood ratio test is given by

$\displaystyle T_{LR} = 2 l(\hat\beta) - 2 l(\hat\beta_{0}),$

where $ \hat\beta_{0}=({\mathbf 0}^T,\hat\beta_{2}^T)^T$, and $ {\mathbf 0}$ and $ \hat\beta_{2}$ are the $ q$-dimensional zero vector and the conditional maximum likelihood estimate for $ \beta_{2}$, given $ \beta_{1}={\mathbf 0}$, respectively. The estimate $ \hat\beta_{2}$ is obtained by substituting the fixed null hypothesis value, $ \beta_{1}={\mathbf 0}$, for the corresponding $ \beta$'s in the partial log-likelihood function (5.5).

Under the null hypothesis, the asymptotic distribution of $ T_{LR}$ is $ \chi^{2}_{q}$. We calculate $ p$-values as tail probabilities of the $ \chi^{2}_{q}$ distribution, $ P(\chi^{2}_{q} \ge T_{LR})$.
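
For $ q = 1$, this tail probability can be computed without any statistical library, because a $ \chi^{2}_{1}$ variable is the square of a standard normal. A Python sketch:

```python
import math

def chi2_pvalue_df1(x):
    # Tail probability P(chi^2_1 >= x): a chi^2_1 variable is the square
    # of a standard normal, so
    # P(chi^2_1 >= x) = P(|N(0,1)| >= sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

# Applied to the LR statistic 7.56687 reported later in Example 9 (q = 1):
print(chi2_pvalue_df1(7.56687))
```

The result agrees with the $ p$-value 0.00595 printed in the summary table of Example 9.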

5.3.3.2 Wald Test

Let $ \hat \beta = (\hat\beta_{1}^T, \hat\beta_{2}^T)^T$ denote the usual maximum partial likelihood estimate of the full parameter vector $ \beta = (\beta_{1}^T, \beta_{2}^T)^T$. Now, let us partition the inverse of the information matrix $ \mathbf{I}(\beta)$ into

\begin{displaymath}
\mathbf{I}^{-1}(\beta) = \left(
\begin{array}{cc}
\mathbf{I}^{11} & \mathbf{I}^{12} \\
\mathbf{I}^{21} & \mathbf{I}^{22}
\end{array}\right),
\end{displaymath}

where $ \mathbf{I}^{11}$ denotes the $ q\times q\,$ submatrix corresponding to $ \beta_{1}$. The information matrix is defined in (5.7). The test statistic for the Wald test is given by

$\displaystyle T_{W}= \hat\beta_{1}^T\ \left[\mathbf{I}^{11}(\hat\beta)\right]^{-1}\,\hat\beta_{1}.$

Under the null hypothesis, the distribution of $ T_{W}$ converges to $ \chi^{2}_{q}$.
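
For a single tested coefficient ($ q = 1$), the matrix expression collapses to a scalar: $ T_W = \hat\beta_1^2 / \sigma_{11}$, where $ \sigma_{11}$ is the corresponding diagonal element of $ \mathbf{I}^{-1}(\hat\beta)$. Applying this to the rounded values printed in Example 6 approximately reproduces the Wald statistic reported later in Example 9:

```python
# Values copied from the printed outputs above: betahat_2 and the (2,2)
# element of sigma = I^{-1}(betahat) from Example 6.
beta2_hat = 3.3415
sigma22 = 1.5847
t_wald = beta2_hat ** 2 / sigma22
print(round(t_wald, 3))  # 7.046
```

The small discrepancy to the value 7.04612 in Example 9 is due to the rounding of the printed inputs.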

5.3.3.3 Score Test

Let $ U_{1}(\beta)$ denote the subvector of the first $ q$ elements of the score function $ U(\beta)$, defined in (5.6). The test statistic for the score test is

$\displaystyle T_{SC} = U_{1}(\hat \beta_{0})^T\,\mathbf{I}^{11}(\hat \beta_{0})\ U_{1}(\hat \beta_{0}),$

where $ \hat \beta_{0}=(\mathbf{0}^T,\hat\beta_{2}^T)^T$ is the maximum likelihood estimate for $ \beta$ under the null hypothesis, introduced in Section 5.3.3. Again, the large sample distribution of the test statistic under the null hypothesis is $ \chi^{2}_{q}$.
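
In the special case $ q = p$, the null hypothesis fixes all coefficients at zero, so $ \hat\beta_0 = \mathbf{0}$ and no iteration is required at all; for a single covariate the statistic is simply $ U(0)^2 / \mathbf{I}(0)$. A self-contained Python sketch on artificial data:

```python
def score_test_beta0(t, delta, z):
    # Score test of H0: beta = 0 with a single covariate and q = p: the
    # restricted estimate is beta = 0, all exp(beta*z_j) weights equal 1,
    # and T_SC = U(0)^2 / I(0) -- no Newton-Raphson iteration needed.
    n = len(t)
    u, info = 0.0, 0.0
    for i in range(n):
        if delta[i] == 1:
            risk = [j for j in range(n) if t[j] >= t[i]]
            m1 = sum(z[j] for j in risk) / len(risk)
            m2 = sum(z[j] ** 2 for j in risk) / len(risk)
            u += z[i] - m1        # score contribution
            info += m2 - m1 * m1  # information contribution
    return u * u / info

t     = [1.0, 2.0, 3.0, 4.0]   # artificial sorted times
delta = [1, 1, 0, 1]
z     = [0.5, -0.5, 1.0, 0.0]
print(round(score_test_beta0(t, delta, z), 4))  # 0.2475
```

Not needing the unrestricted estimate is the practical appeal of the score test: it can be computed before any model has been fitted.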

For more details on hypothesis testing in the Cox model, see Klein and Moeschberger (1997), Section 8.4.

5.3.3.4 Implementation

Values of the three test statistics $ T_{LR}$, $ T_{W}$ and $ T_{SC}$, and the corresponding asymptotic $ p$-values are provided by the quantlet haztest . The $ p$-values are computed as tail probabilities of the $ \chi^{2}_{q}$ distribution, which is the asymptotic distribution for each of the three tests. The syntax of haztest is given below:

   {ttest, val, df, pval} = haztest(data,index)

Input:

data
$ n \times (p+4)$ matrix, the sorted data matrix given by the output data of hazdat ;
index
$ p \times 1$ vector, with index[i]=0 when $ \beta_i = 0$ is part of the null hypothesis, and index[i]=1 otherwise.
Output:
ttest
printed output, table with values of the test statistics, degrees of freedom and $ p$-values for the likelihood ratio test, Wald's test and the score test.
val
$ 3 \times 1$ vector, values of the test statistics, in the following order: likelihood ratio test, Wald's test, score test;
df
scalar, degrees of freedom of the $ \chi^2_q$ reference distribution;
pval
$ 3 \times 1$ vector, $ p$-values of the tests.

Example 9. We are testing the null hypothesis $ H_0\!\!: \beta_2=0$ for the data in haz01.dat . The quantlet haztest provides values for three test statistics and computes the corresponding $ p$-values as tail probabilities of the $ \chi^{2}_{1}$ distribution.

  library("hazreg") 
  dat=read("haz01.dat")  
  t = dat[,1]                        ; observed times                      
  delta = dat[,2]                    ; censoring indicator                       
  z = dat[,3:4]                      ; covariates              
  {data,ties} = hazdat(t,delta, z)   ; preparing data
  index = 1|0                        ; testing if the second
                                     ; coefficient is zero
  {testt, val,df,pval} = haztest(data, index)
  testt                              ; print summary table
haz09.xpl

The variable testt contains the summary table. The last line of the code prints the table into the XploRe output window:

 "--------------------------------------------------------------"
 "Cox Proportional Hazards Model"
 ""
 "Hypothesis: beta1=0 for a subvector of regression coefficients"
 ""
 "- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - "
 "                  Test statistic     DF        P-value        "
 "- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - "
 "LR Test              7.56687         1         0.00595"
 "Wald Test            7.04612         1         0.00794"
 "Score Test           4.25763         1         0.03908"
 "--------------------------------------------------------------"
 ""

Additionally, the test statistic values and the $ p$-values are stored in the variables val and pval, respectively. The data in haz01.dat were generated from a proportional hazards model with $ \beta = (1, 2)^T$. For this sample, all three tests result in small $ p$-values, providing evidence against the null hypothesis $ H_0\!\!: \beta_2=0$.


5.3.4 Example: Length of Stay in Nursing Homes

Nursing homes provide both short-term and long-term care, and patients may stay from a few days to several years. In this section, we investigate how certain characteristics of nursing home patients influence their length of stay. We use a subset of the nursing home data presented by Morris, Norton, and Zhou (1994). The original study was sponsored by the National Center for Health Services in 1980-1982, and investigated the impact of certain financial incentives on the nursing home care of Medicaid patients. Thirty-six nursing homes were randomized into a treatment or control group; nursing homes in the treatment group received financial incentives for admitting more disabled Medicare patients, for improving their health status, and for discharging the patients to their homes within 90 days. The nursing home data consist of $ n=1,601$ patients from this study. Patients were admitted to participating nursing homes between May 1, 1981, and April 30, 1982, and followed over a period of up to three years. The following characteristics were recorded: age, marital status, gender, health status. The data set is available on a floppy disk distributed with Lange et al. (1994), and at StatLib, http://www.stat.cmu.edu/datasets/csb/ .

We restrict our analysis to nursing homes in the control group, which represents standard care without additional financial incentives, and to patients who were at least 65 years old and in comparatively good health (health status = 2). Our subset consists of $ n=214$ patients. The patients were followed from admission to either discharge or death. For patients who were still in a nursing home on April 30, 1983, the length of stay is recorded as censored (25% of the observations).

Our subset of the nursing home data is provided in nursing.dat . The first column of the data file contains the length of stay in the nursing home (in days), the second column is the censoring indicator, and columns 3-5 contain the age (in years), the gender (1=male, 0=female), and the marital status (1=married, 0=unmarried), respectively. Twenty-one percent of the patients are male, and 14% are married.

In order to identify the impact of age, marital status and gender on the length of stay in a nursing home, we fit a Cox proportional hazards model with $ p=3$ covariates: agespline, gender and married. The first variable measures the age of a patient as $ \ \textrm{\tt agespline}=\min(\textrm{\tt age}, 90) - 65;\ $ this transformation was suggested in Morris, Norton, and Zhou (1994). The other two covariates are indicator variables for gender and marital status, respectively. The second column of Table 5.1 provides the sample means of the covariates and the standard deviation of agespline.


Table: Covariates in a Cox proportional hazards model fitted to the data in nursing.dat . The time-to-event is the length of stay of patients in a nursing home (in days).
Covariate    Mean (SD)      $ {\hat\beta}$ (SE)    LR Test      Wald Test    Score Test
                                                   $ p$-value   $ p$-value   $ p$-value
agespline    15.6 (7.06)    -0.052 (0.011)         0.00001      0.00001      0.00001
gender       0.210          0.037 (0.213)          0.86         0.86         0.09
married      0.140          0.040 (0.246)          0.87         0.87         0.12



The following code reads in the data and calculates estimates of the regression coefficients, $ \hat \beta$, and their covariance matrix:

  library("hazreg")
  dat=read("nursing.dat")                ; read data from file
  t = dat[,1]                            ; time = length of stay
  delta = dat[,2]                        ; censoring
  age = dat[,3]                          ; covariate AGE
  gender = dat[,4]                       ; covariate GENDER
  married = dat[,5]                      ; covariate MARRIED
  limit = matrix(rows(dat), 1)*90        ; transform AGE
  agespline = min(age~limit,2) - 65
  {data, ties} = hazdat(t, delta, agespline~gender~married)  
                                         ; prepare data
  {betahat, betak, ck}=hazbeta(data)     ; estimate beta 
  {ll, ll1, ll2} = hazregll(data,betahat) 
  sigma = inv(ll2)                       
                                  ; covariance matrix of betahat

Table 5.1 presents the estimated regression coefficients in the fitted model ($ \hat \beta$, the value of betahat) and their estimated standard deviation, SE($ \hat \beta$), obtained as square root of the diagonal elements of sigma.

Let us test, for each of the covariates, whether it contributes to the hazard in the presence of the other two variables:

  {ttest1, val1, df1, pval1} = haztest(data, (0|1|1)) 
                                            ;test for AGESPLINE 
  {ttest2, val2, df2, pval2} = haztest(data, (1|0|1)) 
                                            ;test for GENDER 
  {ttest3, val3, df3, pval3} = haztest(data, (1|1|0)) 
                                            ;test for MARRIED
XAGhaz10.xpl

The variables ttest1, ttest2 and ttest3 contain the summary tables for the covariates agespline, gender, and married, respectively. The $ p$-values are provided in Table 5.1.

The only covariate with a significant contribution to the hazard is the age. In comparison, in Morris, Norton, and Zhou (1994), a Cox model is fitted to all $ n=1,601$ observations, with additional variables that identify the health status at entry and the treatment group. There, age does not appear to be significant, while gender, marital status and poor health contribute significantly to the length of stay.

These results illustrate that caution is advised in interpreting fitted models. In our case, gender and marital status are correlated with the health status: married patients tend to enter the nursing home with more advanced health problems, and men are more likely than women to be admitted in poorer health. In our restricted data set of patients with similar, good health at entry, neither gender nor marital status is helpful for modeling the expected length of stay in the framework of proportional hazards.