I would like to apply a logistic regression model to investigate possible
aetiological factors for a disease (B) that follows a specific event (A).
For all 1400 subjects (162 of whom have B) we have records covering at least
1 y before event A, but for many subjects we have information dating much
further back. We would like to include as much information in the model as
possible, but I have some concerns.
For example, we want to see whether disease C is associated with disease B.
The chance that a subject has a record of C is greater if we have 10 y of data
rather than just 2 y, regardless of whether the subject goes on to develop B.
For other variables, such as blood pressure and body mass index, we want to
consider only data obtained within the 5 years before event A and then use the
measurement closest to the onset of A. Again, we are more likely to have this
information if the records go back 5 y than if we only have data from the
last year.
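To make the selection rule concrete, here is a minimal Python sketch of what I
have in mind. The function name, the record layout, and the dates and values
are all made up for illustration; it is not code we actually run. A subject
with no measurement inside the window comes out as missing, which will happen
more often for subjects with short records.

```python
from datetime import date, timedelta

def closest_measurement(measurements, event_date, window_years=5):
    """Return the (date, value) pair closest before event_date that falls
    inside the look-back window, or None if there is no such record."""
    # Approximate the window length in days (365.25 days per year).
    window_start = event_date - timedelta(days=round(365.25 * window_years))
    eligible = [(d, v) for d, v in measurements
                if window_start <= d < event_date]
    if not eligible:
        return None  # no usable measurement: the value is missing
    # The latest eligible date is the one closest to the event.
    return max(eligible, key=lambda dv: dv[0])

# Hypothetical blood pressure records for one subject.
records = [
    (date(1995, 3, 1), 150),   # more than 5 y before A: ignored
    (date(1999, 6, 15), 140),
    (date(2001, 1, 10), 135),  # closest to A within the window
]
event_a = date(2001, 6, 1)
print(closest_measurement(records, event_a))
```
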
If you have been in a similar situation, or can suggest some references or
possible solutions, I would be pleased to hear from you.
Irene
--
Dr Irene Petersen
Research Statistician
Psychological Medicine
St.Bartholomews Hospital
London EC1A 7BE, UK
Phone +44 (0)207 601 8511