Hello everyone,

I hope that someone can shed some light on the topic below.

In a case-control study, I would like to ask your opinion about the direction of the relationship between the risk factor and the outcome.

Let us consider, initially, one risk factor.

NON MATCHED STUDY

Say we have a non-matched case-control study where we enrol controls without regard to the number or characteristics of the cases. We will assume that the cases have lung cancer and the controls do not have lung cancer. Our risk factor is “does the subject smoke” (yes or no).
Case-control studies are ‘retrospective’, i.e. in this study, we sample subjects whose lung cancer status is already known and then establish how many in each of the outcome groups smoked. Thus, in my way of thinking, the odds ratio would be expressed as “the odds of *smoking* when the subject has lung cancer compared to the odds of *smoking* when the subject does not have lung cancer”.
If, in a hypothetical example, we had 50 cases and 100 controls and 40 of the cases smoked and 30 of the controls smoked, then “the odds of *smoking* when the subject has lung cancer compared to the odds of *smoking* when the subject does not have lung cancer” = [40/10]/[30/70] = 9.33.
(Admittedly, this is mathematically equivalent to “the odds of *lung cancer* when the subject smokes compared to the odds of *lung cancer* when the subject does not smoke” = [40/30]/[10/70] = 9.33.)
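
As a quick check of the arithmetic above, here is a minimal Python sketch that computes the odds ratio both ways from the hypothetical counts. The function names are purely illustrative (they are not from any library), and exact fractions are used so that the two results can be compared without rounding.

```python
from fractions import Fraction


def odds_ratio_exposure(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return Fraction(cases_exp, cases_unexp) / Fraction(controls_exp, controls_unexp)


def odds_ratio_disease(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """Odds of being a case among the exposed divided by odds among the unexposed."""
    return Fraction(cases_exp, controls_exp) / Fraction(cases_unexp, controls_unexp)


# Hypothetical counts from the example: 50 cases (40 smokers, 10 non-smokers)
# and 100 controls (30 smokers, 70 non-smokers).
counts = (40, 10, 30, 70)

or_exposure = odds_ratio_exposure(*counts)  # [40/10] / [30/70]
or_disease = odds_ratio_disease(*counts)    # [40/30] / [10/70]

print(or_exposure, or_disease, round(float(or_exposure), 2))  # 28/3 28/3 9.33
```

Both expressions reduce to the same cross-product ratio (40 x 70)/(10 x 30), which is why the two readings give the same number.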

MATCHED STUDY
Say we now consider a 1:1 matched case-control study where we enrol each control based on some characteristics of a case, i.e. we match upon potential “confounding” variables, e.g. sex and gender. We’ll say that we matched the cases and controls for sex and gender in the following hypothetical example. There were 45 pairs in which the lung cancer patient, but not the non lung cancer patient, was a smoker, and there were 24 pairs in which the non lung cancer patient, but not the lung cancer patient, was a smoker. Thus the odds ratio for these data is 45/24 = 1.875, i.e. the odds of being a smoker are 1.875 times higher for a subject with lung cancer than for a subject without lung cancer (even when we control for sex and gender).
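
The matched calculation can be sketched in the same way. This is only an illustration with a made-up function name; it uses just the two discordant pair counts from the hypothetical example, since pairs in which both or neither subject smoked do not enter the matched-pairs odds ratio.

```python
def matched_pairs_odds_ratio(case_only_exposed, control_only_exposed):
    """Ratio of the two kinds of discordant pairs in a 1:1 matched study."""
    return case_only_exposed / control_only_exposed


# Hypothetical discordant pairs from the example:
# 45 pairs where only the case smoked, 24 pairs where only the control smoked.
matched_or = matched_pairs_odds_ratio(45, 24)

print(matched_or)  # 1.875
```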

MODELLING
Now for a non-matched case-control study, we would carry out an unconditional logistic regression and for a matched case-control study we would carry out a conditional logistic regression.  Say our primary risk factor was smoking and our potential confounders were sex and gender, then (i) the sex and gender variables would appear in the unconditional logistic regression model for an adjusted estimate of “smoking” and (ii) sex and gender would *not* appear in the conditional logistic regression model as we had already matched the cases and controls as regards these two variables. 

Q1 My question is regarding the interpretation of the “odds ratios”. As I have touched upon above, for a case-control study, we already have the two-group outcome (e.g. lung cancer and no lung cancer) and we are looking to see how many in each of the two groups have the risk factor (e.g. smoking), so really the odds ratio is the “odds of the risk factor (smoking) when we have a ‘case’ (lung cancer) compared to the odds of the risk factor (smoking) when we have a ‘control’ (non lung cancer)”. However, when interpreting odds ratios for a case-control study (in an unconditional or conditional logistic regression model scenario), it *is* actually perfectly correct to interpret the odds ratio as the “odds of the case (lung cancer) when we have the risk factor (i.e. ‘smoking’) compared to the odds of the case (lung cancer) when we do not have the risk factor (i.e. ‘not smoking’)”, controlling for all other variables in the model. That is correct, isn't it? This seems to tie in with all of the examples I have looked at for unconditional or conditional logistic regression.

Q2 When we are doing a matched case-control study, the variables that we use for matching (i.e. the confounders) are those which (in a non-matched case-control study) are thought to be associated with both the risk factor of interest (smoking) and the outcome (lung cancer). In a non-matched case-control study, say the analysis reveals a relationship between smoking and lung cancer, but those with lung cancer tend to be older (compared to those without lung cancer) and a higher proportion of smokers are elderly (compared to the non smokers). It would therefore make sense in this scenario to match each case-control pair as regards age (as failure to do so would result in a biased estimate of the effect of smoking). I am correct in my thinking, aren’t I?

Many thanks for your views on the above.  I am sure that I am correct in my assumptions but I thought I'd double check.

Kind Regards,
Kim


Dr Kim Pearce PhD, CStat, Fellow HEA
Senior Statistician
Faculty of Medical Sciences Graduate School 
Room 3.14
3rd Floor 
Ridley Building 1
Newcastle University
Queen Victoria Road 
Newcastle Upon Tyne 
NE1 7RU

Tel: (0044) (0)191 208 8142
