# 12.3 Regression Models

Survival analysis is now a standard statistical method for lifetime data. The fundamental classical parametric distributions remain important, but regression methods are a powerful way to analyze the effects of covariates on life lengths. [6] introduced a model for the hazard function $\lambda(t; z)$ of the survival time $t$ of an individual with a possibly time-dependent covariate $z$, i.e.,

$$\lambda(t; z) = \lambda_0(t)\exp(z'\beta) \qquad (12.17)$$

where $\lambda_0(t)$ is an arbitrary and unspecified baseline hazard function and $\beta$ is the vector of regression parameters. Cox generalized (12.17) to a discrete logistic model expressing the hazard as

$$\frac{\lambda(t; z)\,dt}{1 - \lambda(t; z)\,dt} = \frac{\lambda_0(t)\,dt}{1 - \lambda_0(t)\,dt}\exp(z'\beta) \qquad (12.18)$$

[17] compared the estimators of the regression parameters in the proportional hazards model (12.17) or (12.18) obtained by the following methods: the Breslow-Peto method ([1,28]), the partial likelihood method ([6,7]) and the generalized maximum likelihood method ([15,22]).
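The defining feature of the proportional hazards model (12.17) is that the ratio of the hazards of two individuals depends only on their covariates, not on the time point or on the baseline hazard. A minimal numeric sketch; the baseline function and parameter value below are hypothetical, chosen only for illustration:

```python
import math

def hazard(t, z, beta, baseline):
    """Proportional hazards model (12.17): lambda(t; z) = lambda0(t) * exp(z * beta).
    `baseline` is an arbitrary function of t (a hypothetical choice below)."""
    return baseline(t) * math.exp(z * beta)

# Hypothetical baseline hazard and regression parameter, for illustration only.
lambda0 = lambda t: 0.5 * t
beta = 0.8

# The hazard ratio of two individuals depends only on their covariates,
# not on t or on lambda0 -- the defining property of (12.17).
r1 = hazard(1.0, 2.0, beta, lambda0) / hazard(1.0, 1.0, beta, lambda0)
r2 = hazard(7.3, 2.0, beta, lambda0) / hazard(7.3, 1.0, beta, lambda0)
print(abs(r1 - math.exp(beta)) < 1e-12, abs(r1 - r2) < 1e-12)
```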

## 12.3.1 The Score Test

In many applications it is necessary to test the significance of the estimated value, using for example the score test or the likelihood ratio test based on asymptotic results of large sample theory. First we express the three likelihood factors defined at each failure time as $L_1(\beta)$, $L_2(\beta)$ and $L_3(\beta, \lambda)$, corresponding to the Breslow-Peto, the partial likelihood and the generalized maximum likelihood methods, respectively:

$$L_1(\beta) = \frac{\exp(s'\beta)}{\left\{\sum_{i=1}^{n}\exp(z_i'\beta)\right\}^{d}} \qquad (12.19)$$

$$L_2(\beta) = \frac{\exp(s'\beta)}{\sum_{u \in S}\exp(s_u'\beta)} \qquad (12.20)$$

$$L_3(\beta, \lambda) = \prod_{i=1}^{d}\frac{\lambda\exp(z_i'\beta)}{1 + \lambda\exp(z_i'\beta)}\,\prod_{i=d+1}^{n}\frac{1}{1 + \lambda\exp(z_i'\beta)} \qquad (12.21)$$

where $z_1, \dots, z_n$ denote the covariate vectors of the $n$ individuals at risk at a failure time, $z_1, \dots, z_d$ correspond to the $d$ failures, $s = \sum_{i=1}^{d} z_i$, $s_u = \sum_{i \in u} z_i$, and $S$ denotes the set of all subsets of size $d$ from $\{1, \dots, n\}$. The overall likelihood obtained by each method is the product of these factors over all failure times. It can be shown that the first derivatives of the three log likelihoods with respect to $\beta$ (with $\lambda$ profiled out of $L_3$) have the same value, i.e.,

$$\frac{\partial \log L_j}{\partial \beta} = s - \frac{d}{n}\sum_{i=1}^{n} z_i, \qquad j = 1, 2, 3,$$

at $\beta = 0$.
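The equality of the scores at $\beta = 0$ can be checked directly. A minimal sketch for the Breslow-Peto and partial likelihoods with assumed toy data (the generalized-likelihood score additionally requires profiling $\lambda$ and is omitted here):

```python
import math
from itertools import combinations

# Toy risk set (assumed data): the first d individuals are the failures.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d, n = 2, len(z)
s = sum(z[:d])

def score_breslow(beta):
    # d/dbeta of [ s*beta - d*log(sum_i exp(z_i*beta)) ]
    w = [math.exp(zi * beta) for zi in z]
    return s - d * sum(zi * wi for zi, wi in zip(z, w)) / sum(w)

def score_partial(beta):
    # d/dbeta of [ s*beta - log(sum_{u in S} exp(s_u*beta)) ], S = all d-subsets
    subs = [sum(u) for u in combinations(z, d)]
    w = [math.exp(su * beta) for su in subs]
    return s - sum(su * wi for su, wi in zip(subs, w)) / sum(w)

# Both scores equal s - (d/n) * sum(z) at beta = 0.
u0 = s - d / n * sum(z)
print(abs(score_breslow(0.0) - u0) < 1e-12, abs(score_partial(0.0) - u0) < 1e-12)
```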

The Hessian matrices of the log likelihoods evaluated at $\beta = 0$ are, respectively,

$$-dV, \qquad -\frac{d(n-d)}{n-1}V, \qquad -\frac{d(n-d)}{n}V,$$

where $V$ is a matrix whose elements are defined by

$$v_{jk} = \frac{1}{n}\sum_{i=1}^{n}(z_{ij} - \bar z_j)(z_{ik} - \bar z_k), \qquad \bar z_j = \frac{1}{n}\sum_{i=1}^{n} z_{ij}.$$

The first two results were derived by [12]. Maximizing out $\lambda$ from $L_3$ gives the last one, which was obtained in an unpublished manuscript. Since

$$d \ \ge\ \frac{d(n-d)}{n-1} \ \ge\ \frac{d(n-d)}{n},$$

the Breslow-Peto method attributes the largest information to the data at $\beta = 0$, so its score statistic is the smallest of the three, and we conclude that the Breslow-Peto approach is the most conservative one.
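In the scalar case the partial-likelihood information at $\beta = 0$ is the variance of the subset sums $s_u$ over all $d$-subsets drawn uniformly, which gives an independent check of the middle value and of the ordering. A sketch with assumed toy data:

```python
from itertools import combinations

# Toy risk set (assumed data): n individuals at risk, d tied failures.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d, n = 2, len(z)
zbar = sum(z) / n
sigma2 = sum((zi - zbar) ** 2 for zi in z) / n  # scalar analogue of V

# Observed information (minus the Hessian) at beta = 0 for the three methods.
info_breslow = d * sigma2
info_partial = d * (n - d) / (n - 1) * sigma2
info_full    = d * (n - d) / n * sigma2

# Check the partial-likelihood value by direct enumeration: it equals the
# variance of s_u over all d-subsets, uniformly weighted.
subs = [sum(u) for u in combinations(z, d)]
m = sum(subs) / len(subs)
var_subs = sum((su - m) ** 2 for su in subs) / len(subs)

print(abs(var_subs - info_partial) < 1e-9, info_breslow >= info_partial >= info_full)
```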

## 12.3.2 Evaluation of Estimators in the Cox Model

[12] pointed out in their simulation study that when the discrete logistic model is true, the Breslow-Peto method causes downward bias compared to the partial likelihood method. This was proven in [17] for any sample when the covariate $z$ is scalar-valued, i.e.,

Theorem 2
Let $\hat\beta_1$ be the maximum likelihood estimator from the Breslow-Peto likelihood $L_1$ and $\hat\beta_2$ be that from the partial likelihood $L_2$. Suppose that the $z_i$'s are not all identical. Then both $\hat\beta_1$ and $\hat\beta_2$ are unique, if they exist, and they have a common sign and

$$|\hat\beta_1| \le |\hat\beta_2|. \qquad (12.22)$$

The equality in (12.22) holds when $\hat\beta_2$ is equal to zero or the number of ties is equal to one.
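Theorem 2 can be checked numerically on a small tied data set. The sketch below (covariates and failure pattern are hypothetical) solves the Breslow-Peto and partial-likelihood score equations for a scalar $\beta$ by bisection:

```python
import math
from itertools import combinations

# Toy risk set with d = 2 tied failures (assumed data); scalar covariates.
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
d = 2
s = z[3] + z[5]  # the two failures are the individuals with z = 3 and z = 5

def u_breslow(beta):
    w = [math.exp(zi * beta) for zi in z]
    return s - d * sum(zi * wi for zi, wi in zip(z, w)) / sum(w)

def u_partial(beta):
    subs = [sum(u) for u in combinations(z, d)]
    mx = max(su * beta for su in subs)            # stabilise the exponentials
    w = [math.exp(su * beta - mx) for su in subs]
    return s - sum(su * wi for su, wi in zip(subs, w)) / sum(w)

def root(f, lo=-10.0, hi=10.0):
    # Both score functions are strictly decreasing, so bisection is safe.
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

b1, b2 = root(u_breslow), root(u_partial)
print(b1 > 0, b2 > 0, abs(b1) <= abs(b2) + 1e-9)
```

With ties present ($d = 2$), the Breslow-Peto root is strictly smaller in magnitude, illustrating the downward bias.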

Corollary 1 ([17])
The likelihood ratio test of $\beta = 0$ against $\beta \ne 0$ is also conservative if we use the Breslow-Peto method. The statement remains valid in the multivariate case.

This theorem and corollary confirm the conservatism of the Breslow-Peto approximation in relation to Cox's discrete model ([27]).

## 12.3.3 Approximation of Partial Likelihood

[31] proposed an approximation method using the full likelihood for Cox's discrete model. Analytically the same problems appear in various fields of statistics. [30] and [11] remarked that inference based on the logistic model raises the same problems in case-control studies where data are summarized in multiple 2×2 or 2×k tables. The proportional hazards model provides a type of logistic model for the contingency table with ordered categories ([29]). As an extension of the proportional hazards model, the proportional intensity model for point processes has been employed to describe asthma attacks in relation to environmental factors ([19,31]). Although in some cases the partial likelihood becomes a conditional likelihood, for convenience we use the term partial likelihood throughout.

It is worthwhile to explore the behavior of the maximum full likelihood estimator even when the maximum partial likelihood estimator is applicable. The two estimators obviously behave similarly in a rough sense, yet they differ in detail, and identifying the differences between them should be helpful in choosing one of the two.

We use the notation of the previous section for the two likelihoods. Differentiating $\log L_2$ gives the score function

$$U_2(\beta) = s - \frac{\sum_{u \in S} s_u \exp(s_u'\beta)}{\sum_{u \in S} \exp(s_u'\beta)}.$$

Differentiating $\log L_3$ with respect to $\beta$ and $\lambda$ allows obtaining the maximum full likelihood estimator, i.e.,

$$s - \lambda\sum_{i=1}^{n}\frac{z_i\exp(z_i'\beta)}{1 + \lambda\exp(z_i'\beta)} = 0$$

and

$$\frac{d}{\lambda} - \sum_{i=1}^{n}\frac{\exp(z_i'\beta)}{1 + \lambda\exp(z_i'\beta)} = 0.$$

From the latter equation $\lambda = \lambda(\beta)$ is uniquely determined for any fixed $\beta$. Using $\lambda(\beta)$, we define

$$U_3(\beta) = s - \lambda(\beta)\sum_{i=1}^{n}\frac{z_i\exp(z_i'\beta)}{1 + \lambda(\beta)\exp(z_i'\beta)}.$$

The maximum full likelihood estimator, $\hat\beta_3$, is a root of the equation $U_3(\beta) = 0$. We denote $\lambda(\beta)$ by $\lambda$ for simplicity.

Note that the entire likelihoods are the products over all distinct failure times $t_1 < \dots < t_k$. Thus the likelihood equations in a strict sense are $\sum_j U_{2j}(\beta) = 0$ and $\sum_j U_{3j}(\beta) = 0$, where the summations extend over the failure times $t_j$. As far as we are concerned, the results for a single failure time can be straightforwardly extended to those for multiple failure times. Let us therefore focus on the likelihood equations of a single failure time and suppress the suffix $j$.

Proposition 1 ([31])
Let $U(\beta)$ be either of $U_2(\beta)$ or $U_3(\beta)$. Denote by $z_{(1)} \le \dots \le z_{(n)}$ the covariates ordered in ascending order. $U(\beta)$ accordingly has the following four properties:

1. $U(0) = s - \dfrac{d}{n}\sum_{i=1}^{n} z_i$.
2. $U'(\beta)$ is negative for any $\beta$, that is, $U(\beta)$ is strictly decreasing.
3. $\lim_{\beta \to \infty} U(\beta) = s - \sum_{i=n-d+1}^{n} z_{(i)}$.
4. $\lim_{\beta \to -\infty} U(\beta) = s - \sum_{i=1}^{d} z_{(i)}$.

Extension to the case of a vector parameter is straightforward. From Proposition 1 it follows that if either of the two estimators exists, then the other also exists and both are uniquely determined. Furthermore, the two estimators have a common sign.
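The four properties of Proposition 1 can be checked numerically for the partial-likelihood score (toy data assumed; the limits in properties 3 and 4 are evaluated at large $|\beta|$ rather than in the limit):

```python
import math
from itertools import combinations

# Toy single-failure-time data (assumed): n at risk, d tied failures.
z = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
d = 2
s = z[2] + z[4]  # failures: z = 1.0 and z = 2.0, so s = 3.0

def u2(beta):
    subs = [sum(u) for u in combinations(z, d)]
    mx = max(su * beta for su in subs)            # stabilise the exponentials
    w = [math.exp(su * beta - mx) for su in subs]
    return s - sum(su * wi for su, wi in zip(subs, w)) / sum(w)

n = len(z)
p1 = abs(u2(0.0) - (s - d / n * sum(z))) < 1e-12      # property 1
grid = [u2(-8 + 0.5 * i) for i in range(33)]
p2 = all(a > b for a, b in zip(grid, grid[1:]))       # property 2 (sampled grid)
zs = sorted(z)
p3 = abs(u2(60.0) - (s - sum(zs[-d:]))) < 1e-6        # property 3, large beta
p4 = abs(u2(-60.0) - (s - sum(zs[:d]))) < 1e-6        # property 4, large -beta
print(p1, p2, p3, p4)
```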

Theorem 3 ([31])
Suppose that the $z_i$'s are not all identical. The functions $U_2(\beta)$ and $U_3(\beta)$ then have a unique intersection at $\beta = 0$. It also holds that $U_2(\beta) < U_3(\beta)$ for $\beta > 0$. The reverse inequality is valid for $\beta < 0$.

The above theorem proves that $|\hat\beta_2| \le |\hat\beta_3|$ for the case of a single failure time.
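The comparison of the two estimators can be illustrated numerically. The sketch below (toy data assumed) profiles $\lambda$ out of the full likelihood by an inner bisection, solves $U_2(\beta) = 0$ and $U_3(\beta) = 0$, and checks that the roots share a sign with the partial-likelihood root no larger in magnitude:

```python
import math
from itertools import combinations

# Toy single-failure-time data (assumed): d tied failures out of n at risk.
z = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
d = 2
s = z[2] + z[4]  # s = 3.0

def u2(beta):  # partial-likelihood score
    subs = [sum(u) for u in combinations(z, d)]
    mx = max(su * beta for su in subs)
    w = [math.exp(su * beta - mx) for su in subs]
    return s - sum(su * wi for su, wi in zip(subs, w)) / sum(w)

def lam(beta):
    # Solve d = sum_i lam*e_i / (1 + lam*e_i); the right side is strictly
    # increasing in lam, so bisection on a log scale is safe.
    e = [math.exp(zi * beta) for zi in z]
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        lo, hi = (mid, hi) if sum(mid * ei / (1 + mid * ei) for ei in e) < d else (lo, mid)
    return math.sqrt(lo * hi)

def u3(beta):  # full-likelihood score with lambda profiled out
    e = [math.exp(zi * beta) for zi in z]
    L = lam(beta)
    return s - sum(zi * L * ei / (1 + L * ei) for zi, ei in zip(z, e))

def root(f, lo=-10.0, hi=10.0):
    # Both scores are strictly decreasing (Proposition 1), so bisection works.
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

b2, b3 = root(u2), root(u3)
print(b2 > 0, b3 > 0, abs(b2) <= abs(b3) + 1e-6)
```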

To compare the behaviors of $U_2(\beta)$ and $U_3(\beta)$ quantitatively, their power expansions near the origin are presented. Since both functions behave similarly, the quantitative difference near the origin is expected to be critical over a wide range of $\beta$. Behavior near the origin is also of practical importance for studying the estimator and the test procedure.

Proposition 2 ([31])
The power expansions of $U_2(\beta)$ and $U_3(\beta)$ near the origin up to the third order have the common form: for $j = 2, 3$,

$$U_j(\beta) = U(0) - a_j\sigma^2\beta + b_j\beta^2 + c_j\beta^3 + O(\beta^4),$$

where $a_2 = \dfrac{d(n-d)}{n-1}$, $a_3 = \dfrac{d(n-d)}{n}$ and $\sigma^2 = \dfrac{1}{n}\sum_{i=1}^{n}(z_i - \bar z)^2$; the higher-order coefficients $b_j$ and $c_j$ are given explicitly in [31], and the expansion of $U_2(\beta)$ was obtained by [5].

The function $U_2(\beta)$ has a steeper slope near the origin than $U_3(\beta)$. The relative ratio of the first-order coefficients is $(n-1)/n$, which indicates that $U_3(\beta)$ is close to $U_2(\beta)$ near the origin. Rescaling the argument accordingly, the power expansion of $U_3(n\beta/(n-1))$ is expressed by

$$U_3\!\left(\frac{n}{n-1}\beta\right) = U(0) - a_2\sigma^2\beta + \tilde b\beta^2 + \tilde c\beta^3 + O(\beta^4), \qquad (12.23)$$

where $\tilde b$ and $\tilde c$ are the coefficients of order 2 and 3 of $U_3(n\beta/(n-1))$. Although the rescaling is defined to adjust the coefficient of order 1 to that of $U_2(\beta)$, the coefficient of order 2 of the rescaled function also becomes closer to that of $U_2(\beta)$ than that of $U_3(\beta)$ is. The following approximations are finally obtained:

$$\hat\beta_2 \approx \frac{n-1}{n}\,\hat\beta_3 \qquad (12.24)$$

$$U_2(\beta) \approx U_3\!\left(\frac{n}{n-1}\beta\right) \qquad (12.25)$$

The proposed approximate estimator and test statistic are quite helpful in the case of multiple 2×2 tables when the values of both $n$ and $d$ are large ([31]).
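Consistent with the relative ratio $(n-1)/n$ of the slopes at the origin, the ratio of the two roots should be near $(n-1)/n$ when the effect is moderate. A rough numerical check on assumed toy data (single failure time, scalar covariate):

```python
import math
from itertools import combinations

# Toy data (assumed): n = 6 at risk, d = 2 tied failures with covariate sum s.
z = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
n, d = len(z), 2
s = z[2] + z[4]

def u2(beta):  # partial-likelihood score
    subs = [sum(u) for u in combinations(z, d)]
    mx = max(su * beta for su in subs)
    w = [math.exp(su * beta - mx) for su in subs]
    return s - sum(su * wi for su, wi in zip(subs, w)) / sum(w)

def u3(beta):  # full-likelihood score, with lambda profiled out numerically
    e = [math.exp(zi * beta) for zi in z]
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        lo, hi = (mid, hi) if sum(mid * ei / (1 + mid * ei) for ei in e) < d else (lo, mid)
    lam = math.sqrt(lo * hi)
    return s - sum(zi * lam * ei / (1 + lam * ei) for zi, ei in zip(z, e))

def root(f, lo=-10.0, hi=10.0):  # both scores are strictly decreasing
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

ratio = root(u2) / root(u3)
print(abs(ratio - (n - 1) / n) < 0.1)
```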
