>>> Marie-Lorraine APPERT 1/22/2004 12:22:31 PM >>> wrote
<<<
Thanks a lot for all your answers!
>>>
You're welcome. That's what the list is for.
More comments below
<<<
Methodology:
We have censored data, so I did a univariate Kaplan-Meier analysis
and then a multivariate Cox analysis.
Y = duration of survival
X1 = factor 1
X2 = factor 2
Y = X1 + X2 + other factors
I know that I should not put both factors in the multivariate model
(because of the high correlation), but I would like to, because the
first one is already known to have an effect on survival (it is the age
of the patient, and it would not be realistic to leave it out of the
model). So my interest is to know whether the second factor has an
effect too (independently of age), or whether the effect that I see is
just a consequence of its correlation with age.
>>>
What's the second variable?
If you want to know whether the second variable has an effect
independent of age, you do indeed need to have both in the model.
The question then becomes whether the resulting model can be estimated
reliably (see the comments on collinearity below).
<<<
Someone else wrote to me that the most important thing is the
collinearity (and multicollinearity), not the correlation. But isn't
collinearity a particular case of correlation?
So do you think that there is no problem with putting both factors in a
Cox model if there is high correlation but little collinearity?
>>>
First, collinearity and multicollinearity are (usually) synonyms.
Second, collinearity is NOT a particular case of correlation. The two
are related, but you can have high collinearity with low correlation
(for example, when one predictor is close to a sum of several others,
none of which correlates strongly with it); you can also have low
collinearity with high correlation.
Since, for substantive reasons, you need to include both of these
factors (age and whatever the second one is), the key issue becomes
whether the collinearity between them makes the analysis untenable. The
correct tool for checking this is a set of collinearity diagnostics, not
a correlation matrix. This is so whether you are doing OLS regression,
Cox regression, logistic regression, or what have you.
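To make the "high collinearity with low correlation" case concrete, here is a minimal sketch in plain Python. It uses the variance inflation factor (VIF) as the diagnostic, which is only one common choice and is my assumption here; other diagnostics, such as condition indexes, exist too. The data are simulated and all names (pearson, invert, variance_inflation_factors, combo) are made up for the illustration. Ten independent predictors plus an eleventh that is nearly their sum give modest pairwise correlations but very large VIFs. A common rule of thumb treats a VIF above about 10 as a warning sign.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor: the diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(columns))]


# Simulated predictors (an assumption for illustration, not real data):
# ten independent variables, plus one that is almost exactly their sum.
rng = random.Random(0)
n = 500
base = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(10)]
combo = [sum(col[i] for col in base) + rng.gauss(0, 0.1) for i in range(n)]
columns = base + [combo]

# Largest absolute pairwise correlation among all eleven predictors.
max_corr = max(abs(pearson(columns[i], columns[j]))
               for i in range(len(columns)) for j in range(i))
vif = variance_inflation_factors(columns)

print("largest pairwise |correlation|:", round(max_corr, 2))
print("largest VIF:", round(max(vif)))
```

Looking only at the correlation matrix, nothing here would look alarming, yet the eleventh predictor is almost fully determined by the other ten.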
If the collinearity diagnostics indicate an absence of problematic
collinearity, you are home free. If they show a problem, then you have
a few options:
If you have a lot of data, you could stratify by age and run separate
analyses.
You could try ridge regression.
You could try to collect more data.
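As a rough illustration of what the ridge option does, here is a small sketch in plain Python. It is ordinary least-squares ridge with two predictors, chosen only because it has a closed form. For a Cox model the penalty would be applied to the partial likelihood instead, which this sketch does not do. The data are simulated, and the function names and the penalty value of 5.0 are arbitrary assumptions for the example.

```python
import random


def solve_2x2(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] times [x, y] = [e, f] by Cramer's rule."""
    det = a * d - b * c
    return ((e * d - b * f) / det, (a * f - e * c) / det)


def ridge_two_predictors(x1, x2, y, penalty):
    """Ridge coefficients for y ~ x1 + x2 on centered data; penalty=0 gives OLS."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(u * u for u in c1)
    s22 = sum(u * u for u in c2)
    s12 = sum(u * v for u, v in zip(c1, c2))
    s1y = sum(u * v for u, v in zip(c1, cy))
    s2y = sum(u * v for u, v in zip(c2, cy))
    # Normal equations with the penalty added to the diagonal.
    return solve_2x2(s11 + penalty, s12, s12, s22 + penalty, s1y, s2y)


# Simulated data (an assumption for illustration): x2 is nearly a copy of x1,
# and the true coefficients are 1 and 1.
rng = random.Random(1)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [a + rng.gauss(0, 0.05) for a in x1]
y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

ols = ridge_two_predictors(x1, x2, y, penalty=0.0)
ridge = ridge_two_predictors(x1, x2, y, penalty=5.0)

print("OLS coefficients:  ", ols)
print("ridge coefficients:", ridge)
```

With predictors this collinear, the unpenalized estimates tend to be unstable, while the penalized ones are pulled toward zero and toward each other. The price is some bias, and the penalty has to be chosen somehow (cross-validation is one common way).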
HTH
Peter
Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)