Dear Allstaters,
I'm quite new to longitudinal (cross-sectional time-series) analysis, and
I have a dataset at hand that requires testing using the technique. After
consulting a medical statistics book (Everitt, Modern Medical Statistics),
I've understood that there are two main approaches to such data - the
Random-effects model and the Generalized Estimating Equations method.
There's also a third method, which is not quite cross-sectional time
series: a generalized linear model with cluster-robust variance, which
can be run in Stata.
I have statistical knowledge at undergraduate level, but not quite enough
to understand the theory behind the robust variance. I also don't know
which method (random-effects or GEE) is better (a more accurate test) for
my purposes.
I've tried all three analyses and they produced slightly different
results in terms of p-values (ranging from 0.045 to 0.080). Which one is
best?
I have the following background information though:
The dataset has 212 subjects, and I'm looking for association between two
dichotomous variables - both of which are collected over 4 time-points.
The first variable (obligatory care) is positive for around 70% of the
subjects in the first time-point and decreases to around 60% in the last
time-point. The second variable (whether they take their medicine) is
positive for around 90% of the subjects, and is fairly constant over time.
There are drop-outs over time, but most subjects (around 70%) have
complete data.
Crosstabulations show that around 92% of people who have obligatory care
(1st variable) also take their medicine, and this is fairly constant over
time. For people who don't have obligatory care, it is about 85% (actually
there may be a trend, with the figure increasing from 82% to 88%).
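
To put those crosstab percentages on the scale a logistic model works on,
here is a rough sketch in Python that converts them to odds ratios. The
inputs are only the rounded percentages quoted above, not the raw counts,
and the helper names (odds, odds_ratio) are my own, so the output is a
crude unadjusted figure that ignores the repeated measures.

```python
# Rounded proportions taking their medicine, taken from the crosstabs above.
p_care = 0.92      # subjects with obligatory care
p_no_care = 0.85   # subjects without obligatory care (overall figure)


def odds(p):
    """Convert a proportion to odds."""
    return p / (1.0 - p)


def odds_ratio(p1, p0):
    """Odds ratio comparing proportion p1 against proportion p0."""
    return odds(p1) / odds(p0)


# Overall comparison, plus the two ends of the possible 82% -> 88% trend.
pooled_or = odds_ratio(p_care, p_no_care)
or_at_82 = odds_ratio(p_care, 0.82)
or_at_88 = odds_ratio(p_care, 0.88)

print(f"overall OR: {pooled_or:.2f}")
print(f"OR if no-care group is at 82%: {or_at_82:.2f}")
print(f"OR if no-care group is at 88%: {or_at_88:.2f}")
```

The overall odds ratio comes out at roughly 2, and it shrinks as the
no-care percentage rises, which is the pattern an interaction with time
would be trying to pick up.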
Running a random effects model (random intercept) with interaction between
time-points (continuous) and independent variable (obligatory care), I
found no evidence of an interaction.
Running a GEE (logistic) with unstructured correlations, I found that the
estimated correlations between the different time-points were fairly
constant at around 0.31 (ranging from 0.25 to 0.35), with no evidence of
an autoregressive correlation structure.
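
To show why I read those estimates as closer to a constant (exchangeable)
structure than an autoregressive one, here is a small Python sketch that
builds both working correlation matrices for 4 time-points. The value 0.31
is the rough average quoted above; the function names are my own, and
this is only an illustration of what each structure implies, not a formal
test.

```python
def exchangeable(n, rho):
    """Working correlation with the same rho between every pair of time-points."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """AR(1) working correlation: rho raised to the lag between time-points."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


n_times = 4   # four time-points, as in the dataset
rho = 0.31    # approximate average correlation from the unstructured GEE

exch = exchangeable(n_times, rho)
auto = ar1(n_times, rho)

# Compare the implied correlation with the first time-point at each lag.
for lag in range(1, n_times):
    print(f"lag {lag}: exchangeable={exch[0][lag]:.3f}  AR(1)={auto[0][lag]:.3f}")
```

Under AR(1) the implied correlation falls to about 0.10 at lag 2 and
about 0.03 at lag 3, whereas my unstructured estimates all stay between
0.25 and 0.35.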
Now, given such data, if I want to test for an association between the two
variables, which model should I use? And why?
I also know that the interpretation of the coefficients is different
between the random-effects model and the GEE, but I presume for the
purpose of hypothesis testing, this is irrelevant. Am I correct?
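
To spell out what I mean by the difference in interpretation, here is a
sketch of the usual approximation relating a subject-specific
(random-intercept logistic) coefficient to a population-averaged (GEE)
one: the marginal coefficient is roughly the conditional one divided by
sqrt(1 + c^2 * sigma^2), with c = 16*sqrt(3)/(15*pi). The coefficient 0.7
and the random-intercept standard deviations below are hypothetical values
chosen only for illustration; they are not estimates from my data.

```python
import math

# Constant from the standard logistic-normal approximation.
C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)


def marginal_from_conditional(beta_conditional, sigma):
    """Approximate population-averaged coefficient from a subject-specific one.

    sigma is the standard deviation of the random intercept (hypothetical here).
    """
    return beta_conditional / math.sqrt(1.0 + (C ** 2) * (sigma ** 2))


beta_conditional = 0.7   # hypothetical log odds ratio, not from the data

for sigma in (0.0, 1.0, 2.0):   # hypothetical random-intercept SDs
    beta_marginal = marginal_from_conditional(beta_conditional, sigma)
    print(f"sigma={sigma:.1f}: marginal coefficient ~ {beta_marginal:.3f}")
```

Under this approximation the marginal coefficient is pulled towards zero
as sigma grows, but a conditional coefficient of exactly zero still maps
to a marginal coefficient of zero, which is partly why I presume the
distinction matters less for testing than for estimation.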
Thank you very much for your help.
Yours,
Tim Mak