Dear All,
I am analysing data using logistic regression and I have an issue with missing data in the covariates.
The covariates are categorical. I have 100 events, but if I include the covariates suggested by my colleagues, only 40 events remain because patients with missing covariate values are excluded. We need a strategy for including the patients with missing data, as we are already in danger of overfitting with more than 25 covariates.
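To put rough numbers on the overfitting concern, here is a small Python sketch (not our SAS code). It assumes exactly 25 covariates, which is a lower bound, and counts each covariate as a single parameter, which understates the problem for categorical covariates with more than two levels. The function name is only illustrative.

```python
def events_per_covariate(n_events, n_covariates):
    """Crude ratio of events to covariates (one parameter per covariate assumed)."""
    return n_events / n_covariates


# Figures from the text above: 100 events overall, 40 after dropping
# patients with missing covariates, and at least 25 covariates.
all_patients = events_per_covariate(100, 25)
complete_cases = events_per_covariate(40, 25)

print(f"all patients:   {all_patients:.1f} events per covariate")
print(f"complete cases: {complete_cases:.1f} events per covariate")
```

Either way the ratio is low, and it gets noticeably worse if only the complete cases are used.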
Somebody has suggested recoding the missing values (currently . in SAS) to an artificial level of 99 and including all patients in the analysis. What is the impact of creating this artificial level? I have checked, as far as possible, that the missing values are missing at random and balanced across treatment groups.
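To make the suggestion concrete, here is a minimal Python sketch of the recoding (again not our SAS code). The patient records, the covariate names and the helper functions are all made up for illustration, and missing values are represented as None. It only shows the mechanics; it says nothing about the effect on the estimates, which is my question.

```python
MISSING_CODE = 99  # the artificial level proposed for missing values


def recode_missing(records, covariates, missing_code=MISSING_CODE):
    """Return copies of the records with missing covariate values set to missing_code."""
    recoded = []
    for record in records:
        new_record = dict(record)  # copy so the original data stay untouched
        for name in covariates:
            if new_record[name] is None:
                new_record[name] = missing_code
        recoded.append(new_record)
    return recoded


def count_usable_events(records, covariates, outcome="event"):
    """Count events among records with no missing (None) covariate values."""
    return sum(
        1
        for record in records
        if record[outcome] == 1
        and all(record[name] is not None for name in covariates)
    )


# Hypothetical toy data: None marks a missing covariate value.
patients = [
    {"event": 1, "smoker": 0, "stage": 2},
    {"event": 1, "smoker": None, "stage": 1},
    {"event": 0, "smoker": 1, "stage": None},
    {"event": 1, "smoker": 1, "stage": 3},
]
covariates = ["smoker", "stage"]

recoded = recode_missing(patients, covariates)
events_before = count_usable_events(patients, covariates)
events_after = count_usable_events(recoded, covariates)
print(events_before, events_after)
```

After recoding, every patient has a value for every covariate, so no events are lost, but 99 then enters the model as an extra category of each covariate.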
My colleagues are not in favour of multiple imputation, worst-case imputation, etc. LOCF cannot be used here.
All comments and suggestions are welcome.
Please reply to me and not to the list: [log in to unmask]
Thanks in advance,
Anna Passera