Dear Alessandro,
An interesting question. A few thoughts.
Case-control studies are a bit odd. They are almost always
done with some level of matching - this may be explicit, in
a formal matched study, and/or implicit, in the eligibility
criteria for cases and controls. In essence you construct
two biased samples, with very different sampling fractions
(close to 100% for cases, and usually much, much smaller for
controls), and matching may induce confounding between, at
least, the matching variables and the case-control status.
Analysis of CC studies makes allowance for all of this.
So, what can you say, about what populations, from such a
sample?
I can see lots of questions which are answerable, but lots
more which are not. I don't see why your question is not
answerable, but it would be essential to know all about how
the cases and controls were collected. At a minimum you will
need to include all the matching variables, explicit and
implicit, in every analysis, whether they are significant or
not. I don't personally know of any analyses similar to
yours, but the approach you suggest is sound.
However, to make inferences about any population beyond the
study participants, the issue of weighting raises its ugly
head. Each control typically represents thousands of people in
the underlying population, while each case typically represents
just one. You need to show that this does not affect your
conclusions, which might not be possible.
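To make the weighting point concrete, here is a minimal sketch
in Python (NumPy only) on simulated data. Everything in it is an
assumption made for illustration: the population size, the 2%
case prevalence, the age slopes, the unmatched 1:1 control
sampling, and the helper names wls and design. It is not your
study's design, and it assumes the control sampling fraction is
known, which in a real case-control study it often is not. The
idea is simply to fit the same linear model with and without
inverse-sampling-fraction weights and see whether the
coefficient of interest moves.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - x_i . beta)^2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(0)

# Hypothetical source population (all numbers are illustrative assumptions).
n_pop = 200_000
age = rng.uniform(20, 70, n_pop)
case = (rng.random(n_pop) < 0.02).astype(float)   # about 2% are cases
# Exercise capacity; the age slope is assumed steeper among cases, so a
# model without a case-by-age interaction is deliberately misspecified.
slope = np.where(case == 1, -1.0, -0.5)
y = 100 + slope * age - 10 * case + rng.normal(0, 5, n_pop)

# Case-control sample: every case, plus an equal number of random controls.
case_idx = np.flatnonzero(case == 1)
ctrl_pool = np.flatnonzero(case == 0)
ctrl_idx = rng.choice(ctrl_pool, size=case_idx.size, replace=False)
idx = np.concatenate([case_idx, ctrl_idx])

# Weight = 1 / sampling fraction: 1 per case, thousands-to-one style
# weights for controls (here, pool size divided by controls sampled).
w = np.where(case[idx] == 1, 1.0, ctrl_pool.size / ctrl_idx.size)

def design(i):
    """Design matrix with columns: intercept, case status, age."""
    return np.column_stack([np.ones(i.size), case[i], age[i]])

beta_pop = wls(design(np.arange(n_pop)), y, np.ones(n_pop))  # whole population
beta_unw = wls(design(idx), y[idx], np.ones(idx.size))       # sample, unweighted
beta_wtd = wls(design(idx), y[idx], w)                       # sample, weighted

print("fit            case     age")
for name, b in [("population", beta_pop),
                ("unweighted", beta_unw),
                ("weighted", beta_wtd)]:
    print(f"{name:<12}{b[1]:8.2f}{b[2]:8.2f}")
```

In this made-up setting the age slope differs between cases and
controls, so the unweighted fit describes the sample rather than
the population, while the weighted fit should land close to the
whole-population value. If the model were correctly specified
(say, with a case-by-age interaction), the two fits would be
expected to agree more closely. That kind of comparison is one
way to check whether weighting affects your conclusions.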
I think you can do something interesting, but you will need
to think hard, and do a lot of work to show that it means anything.
Best of luck!
Anthony Staines
On 12/29/10 09:13, Alessandro Marcon wrote:
> Dear all
>
>
> I wish to perform a "secondary" analysis on data collected in a
> multicase-control design (where the primary aim was to investigate
> the association between genetic determinants and respiratory
> diseases). My aim is to study the association between the
> case-control status (main independent covariate) and a continuous
> measure of exercise capacity (dependent covariate), while adjusting
> for several potential confounders (gender, age, smoking status,
> etc). I am currently using a standard linear multiple regression
> model with exercise capacity = Y and the case-control status and
> the other potential confounders as Xi.
>
> Do you think that the above statistical analysis is correct, and do
> you have any reference to support that?
>
> I have found some reference on re-using data from case-control
> studies by logistic regression (1,2), but no reference to the use of
> linear regression models.
>
>
> References:
> 1) Lee AJ, McMurchy L, Scott AJ. Re-using data from case-control
> studies. Stat Med. 1997 Jun 30;16(12):1377-89.
> 2) Nagelkerke NJ, Moses S, Plummer FA, Brunham RC, Fish D. Logistic
> regression in case-control studies: the effect of using independent
> as dependent variables. Stat Med. 1995 Apr 30;14(8):769-75.
>
>
> Best wishes to all!
> Alessandro Marcon
>
>
--
Anthony Staines, Professor of Health Systems Research,
School of Nursing, Dublin City University, Dublin 9, Ireland.
Tel:- +353 1 700 7807. Mobile:- +353 86 606 9713