Hello everyone,
I would like to ask a few questions about Cox regression and how to
assess its assumptions in SPSS. Perhaps someone who has carried out Cox
regression can help.
First, some background.
1) Background about the Proportional Hazards Assumption
The hazard function is
h(t) = h_0(t) * e^(beta_1 x1 + beta_2 x2 + ...)
where h_0(t) is the baseline hazard.
The survival function is S(t) = exp(-H(t)), where H(t) is the
cumulative hazard, i.e. the integral of h(u) from 0 to t.
The model is called the "proportional hazards model" because for two
patients, the ratio of their hazards will be constant for all time
points.
For example, if you had patients of the same age, all with
characteristic A present, but at different stages of disease, the ratio
of their estimated hazard rates would be constant across all time
points at e^beta, where beta is the regression coefficient for stage
(coded 0 or 1).
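
To convince myself of the constant ratio, here is a minimal Python
sketch. The Weibull baseline hazard and the value of beta are made up
for illustration (they are not from any fitted model); the point is
only that the baseline cancels out of the ratio.

```python
import math

def baseline_hazard(t, shape=1.5, scale=0.1):
    """Hypothetical Weibull baseline hazard h_0(t); any positive function would do."""
    return shape * scale * t ** (shape - 1)

def hazard(t, x, beta):
    """Cox-type hazard h(t) = h_0(t) * exp(beta * x) for one covariate x."""
    return baseline_hazard(t) * math.exp(beta * x)

beta_stage = 0.7  # assumed coefficient for stage (coded 0/1), illustration only

for t in [0.5, 1.0, 2.0, 5.0, 10.0]:
    # Ratio of hazards for stage=1 versus stage=0 at the same time point.
    ratio = hazard(t, 1, beta_stage) / hazard(t, 0, beta_stage)
    print(f"t={t:5.1f}  hazard ratio={ratio:.4f}")

print(f"exp(beta) = {math.exp(beta_stage):.4f}")
```

The hazards themselves change with t, but every ratio printed should
match exp(beta).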
According to Parmar and Machin ("Survival Analysis") we could plot the
log(-log) value of the survival function against (log of) time for the
distinct covariate patterns we are dealing with to assess whether the
proportional hazards assumption holds.
So, if we focus on 'stage', we could check that the proportional
hazards assumption holds for this variable by using the "plots" option
in SPSS to plot separate log(-log) survival curves/"lines" for pattern
1) where stage=0 and pattern 2) where stage=1. We could do similar
plots to check the proportional hazards assumption for presence and
absence of 'characteristic A'. Parallel lines indicate proportional
hazards (Parmar and Machin, p140). Note that the same baseline
function is used to generate the different lines.
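
For anyone who wants to reproduce this kind of plot outside SPSS, here
is a rough Python sketch of the calculation. Note the assumptions: it
uses a Kaplan-Meier estimate computed separately for each stage group
(not the model-based curves SPSS draws from the common baseline), and
the follow-up times and event flags are invented for illustration.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event=1, censored=0)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def log_minus_log(curve):
    """Transform S(t) to log(-log S(t)); points with S = 0 or 1 are skipped."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

# Invented data: follow-up time and event indicator for each stage group.
stage0 = ([6, 9, 13, 18, 23, 28, 31, 34, 45, 48], [1, 1, 0, 1, 1, 0, 1, 1, 0, 1])
stage1 = ([2, 4, 5, 8, 10, 12, 15, 17, 22, 25], [1, 1, 1, 0, 1, 1, 1, 0, 1, 1])

for label, (times, events) in [("stage=0", stage0), ("stage=1", stage1)]:
    print(label)
    for t, value in log_minus_log(kaplan_meier(times, events)):
        print(f"  t={t:3d}  log(-log S)={value: .3f}")
```

Plotting the two sets of points against t (or log t) gives the lines
whose vertical separation one would inspect for parallelism.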
2) Background about stratification
We can also establish if the model should be stratified in SPSS by
splitting the data into strata to generate several separate hazard
baseline functions, one for each stratum.
One set of coefficients is generated regardless of stratum. The values
of the hazard functions in all strata are calculated using the same
set of variables, e.g. if the data were stratified by 'sex', the hazard
function for those with characteristic A and characteristic B would be
generated for both males and females over all time points.
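
As I understand it, this corresponds to a model of the form
h_s(t) = h_0s(t) * e^(beta x), with a separate baseline per stratum s
but shared coefficients. The Python sketch below is only an
illustration of that form: the two Weibull baselines and the
coefficient for characteristic A are assumed values, not output from
SPSS.

```python
import math

def stratum_baseline_hazard(t, sex):
    """Hypothetical baseline hazards, one per stratum (different Weibull shapes)."""
    shape, scale = (1.5, 0.10) if sex == "male" else (0.8, 0.20)
    return shape * scale * t ** (shape - 1)

def stratified_hazard(t, sex, has_a, beta_a):
    """h_s(t) = h_0s(t) * exp(beta_a * has_a): own baseline, shared coefficient."""
    return stratum_baseline_hazard(t, sex) * math.exp(beta_a * has_a)

beta_a = 0.4  # assumed shared coefficient for characteristic A

for t in [1.0, 5.0, 10.0]:
    hr_male = stratified_hazard(t, "male", 1, beta_a) / stratified_hazard(t, "male", 0, beta_a)
    hr_female = stratified_hazard(t, "female", 1, beta_a) / stratified_hazard(t, "female", 0, beta_a)
    # Ratio between strata for the same covariate pattern: free to vary with t.
    sex_ratio = stratified_hazard(t, "male", 0, beta_a) / stratified_hazard(t, "female", 0, beta_a)
    print(f"t={t:4.1f}  HR(A) male={hr_male:.3f}  female={hr_female:.3f}  male/female={sex_ratio:.3f}")
```

Within each stratum the hazard ratio for characteristic A stays at
exp(beta), while the male/female ratio is allowed to change over time.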
Again, we examine SPSS's 'log minus log' against t plot to see if the
ratio of the hazard functions for the two patient groups is constant
over time. Parallel lines signify that this is true. If this is the
case, then the variable used to form the strata ('sex' in our example)
can be used in the model and a common baseline hazard function can be
estimated for all of the groups.
Questions:
I find the SPSS manual a little confusing as regards how to establish
whether a specific effect is constant over time (i.e. whether a
time-dependent covariate is needed). The example in the manual I am
reading finds that the data should be stratified by treatment (where
treatment takes the values 0 or 1). It then tries the model
h(t) = h_0(t) * e^(B_1*treat + B_2*treat*t_cov)
The manual says that "whenever you want to test that hazards are
proportional for different strata, you incorporate the
time-by-stratification-variable interaction. If the coefficient for
this term is significant then the hazards are not proportional." Could
anyone explain what this means please?
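
My tentative reading is that under this model the hazard ratio for
treat=1 versus treat=0 is e^(B_1 + B_2*t_cov), so it only stays
constant if B_2 = 0. The Python sketch below just evaluates that
expression; it takes t_cov to be time itself, and the coefficient
values are invented, so please correct me if I have misread the
manual.

```python
import math

def hazard_ratio(t, b1, b2):
    """Ratio of hazards, treat=1 vs treat=0, under
    h(t) = h_0(t) * exp(b1*treat + b2*treat*t), with t_cov assumed equal to t."""
    return math.exp(b1 + b2 * t)

b1 = 0.5  # assumed main effect of treatment

for b2 in (0.0, 0.05):  # no interaction versus an assumed non-zero interaction
    ratios = [hazard_ratio(t, b1, b2) for t in (0, 5, 10, 20)]
    print(f"B_2={b2:4.2f}  ratios at t=0,5,10,20:", [round(r, 3) for r in ratios])
```

With B_2 = 0 the ratio is the same at every time point; with a
non-zero B_2 it drifts with t, which is presumably why a significant
interaction coefficient is taken as evidence against proportionality.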
Also, I'd like to know whether we can assess if a time-dependent
covariate should be added by looking at plots. If so, which plots? I
would say that the plots described above (in 1) to assess the
proportional hazards assumption would be the ones to look at, as these
make use of the index, beta_x. Non-parallel lines would indicate that
the effect of a predictor depends on time. Do you agree?
Finally, some books state that the log(-log) plots should be against t;
some say against log of t. SPSS plots against t...does anyone know the
reason for the discrepancy?
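
One thing I could check numerically is that the choice of axis should
not change whether the lines are parallel, since the vertical gap
between them is the same at each time point either way. The Python
sketch below assumes a Weibull proportional hazards model with made-up
parameters, for which log(-log S(t)) = log(lam) + p*log(t) + beta*x, so
against log t the lines are also straight.

```python
import math

def lml(t, x, beta=0.7, lam=0.1, p=1.5):
    """log(-log S(t)) for an assumed Weibull PH model:
    S(t) = exp(-lam * t**p * exp(beta * x))."""
    surv = math.exp(-lam * t ** p * math.exp(beta * x))
    return math.log(-math.log(surv))

times = [1.0, 2.0, 4.0, 8.0]

# Vertical gap between the x=1 and x=0 lines at each time point.
gaps = [lml(t, 1) - lml(t, 0) for t in times]

# Slope of the x=0 line between successive points, measured against log t.
slopes = [
    (lml(b, 0) - lml(a, 0)) / (math.log(b) - math.log(a))
    for a, b in zip(times, times[1:])
]

print("gaps:", [round(g, 4) for g in gaps])
print("slopes vs log t:", [round(s, 4) for s in slopes])
```

The gaps should all equal beta whichever axis is used, and the slopes
against log t should all equal the Weibull shape p. That does not
explain why the books differ, but it suggests the parallelism check
itself is unaffected.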
Many thanks again,
All the Best,
Kim.