>
> Hello
>
> If anyone could shed some light on my regression confusion I would be v
> grateful...
>
> I am investigating the interaction between a continuous variable (moderator) and
> a dichotomous variable (treatment / no treatment) on a continuous dependent
> variable using regression analyses.
>
OK.
> I am coding my dichotomous variable treatment = 1 and no treatment =0, centering
> the continuous variable by subtracting the sample mean of the variable from
> each score, and creating an interaction term by multiplying the two together.
> In the regression analyses, I am entering the dichotomous variable (0/1 coded),
> then continuous moderator variable (centered) and then the interaction of the
> two (0/1 coded x centered) in three separate steps.
>
Two things: First, it's not really necessary to centre the continuous
variable (although it doesn't hurt), and second, you don't need to do
three steps, although it won't hurt either.
> I'm not sure if this is correct, as after reading around there seem to be many
> variations on this procedure and I'm not sure how much these variations matter
> and whether they are a matter of preference or there are definitely right and
> wrong ways depending on different scenarios.
>
The variations hardly matter at all. The biggest variation is about
centring, or not. The definitive textbook on this sort of thing is by
Aiken and West, and is called Multiple Regression: Testing and
Interpreting Interactions (or something like that). They made a
moderate fuss about centring, which has been taken to heart by lots of
people (including me, in my book on regression). This can matter
sometimes, but extremely rarely - I've never found an occasion when it
mattered. I have heard on the grapevine that A & W are working on a
2nd edition of the book, and they're going to clarify that.
> Am I right to use a code of 0 and 1 for treatment / no treatment, and to think
> that which group is coded 0 and which group 1 will make no difference to the
> results?
Yes. It will change the interpretation of the results though.
> Some papers talk of 'effects coding' +1 and -1 for the groups – when and why is
> this used? How does it differ from the 0/1 coding?
>
When you do 0, 1 (dummy coding) without interactions, then the effect
becomes the difference between that group and the reference category -
you've got a sensible reference category here, it's the control group.
However, if you've got multiple groups, and no sensible reference
category, this doesn't make as much sense. In my book (again!) I used
the example of:
Primary school teachers
High school teachers
College teachers
"New" university lecturers
"Old" university lecturers
Which is the reference category that you should compare the rest
against? Well, there isn't an obvious choice - you could pick any.
So, what you do is use effect coding, which compares each one against
the mean of all of them. That's often a more sensible choice - you
just want to know who is higher than the mean, and who is lower than
the mean.
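To make the difference between the two codings concrete, here is a minimal sketch in plain Python. The three groups and their scores are made up for illustration (they are not the teacher data), and the small `ols` helper is just a hand-rolled least-squares solver so the example has no dependencies.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical scores for three made-up groups (group means 2, 5 and 8).
groups = ["A", "A", "B", "B", "C", "C"]
y = [1.0, 3.0, 4.0, 6.0, 7.0, 9.0]

# Dummy coding with A as the reference category: columns are [1, isB, isC].
X_dummy = [[1.0, float(g == "B"), float(g == "C")] for g in groups]

# Effect coding: one column each for A and B; the last group (C) gets -1 on both.
effect = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (-1.0, -1.0)}
X_effect = [[1.0, *effect[g]] for g in groups]

b_dummy = ols(X_dummy, y)    # [mean of A, B minus A, C minus A]
b_effect = ols(X_effect, y)  # [mean of the group means, A minus that, B minus that]
print("dummy :", [round(b, 6) for b in b_dummy])
print("effect:", [round(b, 6) for b in b_effect])
```

With these numbers the dummy-coded intercept is the mean of group A (2) and the other two parameters are differences from A (3 and 6). The effect-coded intercept is the mean of the three group means (5), and the parameters are each group's distance from it (-3 for A, 0 for B).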
When you extend to interactions, it works the same way. Let's say you
have a dichotomous variable (d) and a continuous variable (c), and so
you enter d, c, and dc into the regression. The parameter (B, beta,
slope, whatever you want to call it) for c becomes the slope for the
reference category - that is, it's the slope for the control group.
The parameter for dc is the difference between the slope for the
control group and the slope for the treatment group. The confidence
interval and p-value for dc relate to this difference.
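Using effect coding instead of dummy is exactly the same thing - but
instead of comparing the slopes to each other, it compares them with
the mean slope: the parameter for c is the mean of the two slopes, and
the parameter for dc is how far each group's slope is from that mean.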
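A small sketch of the d, c, dc model under both codings, in plain Python. The data are invented and noise-free (control follows y = 1 + 2c, treatment follows y = 4 + 5c) so the parameters can be read off exactly; `ols` and `fit` are hypothetical helper names, not anything from a stats package.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


c_values = [0.0, 1.0, 2.0, 3.0]  # made-up moderator scores, same in both groups


def fit(code_control, code_treatment):
    """Fit y on [1, d, c, d*c] using the given codes for the two groups."""
    X, y = [], []
    for c in c_values:
        # Control group: slope 2 (invented, noise-free).
        X.append([1.0, code_control, c, code_control * c])
        y.append(1.0 + 2.0 * c)
        # Treatment group: slope 5 (invented, noise-free).
        X.append([1.0, code_treatment, c, code_treatment * c])
        y.append(4.0 + 5.0 * c)
    return ols(X, y)


b_dummy = fit(0.0, 1.0)    # dummy coding: control = 0, treatment = 1
b_effect = fit(-1.0, 1.0)  # effect coding: control = -1, treatment = +1

# Parameter order is [intercept, d, c, dc].
print("dummy  c, dc:", round(b_dummy[2], 6), round(b_dummy[3], 6))
print("effect c, dc:", round(b_effect[2], 6), round(b_effect[3], 6))
```

Under dummy coding the c parameter is the control slope (2) and dc is the difference between the slopes (3). Under effect coding c is the mean of the two slopes (3.5) and dc is the distance of each slope from that mean (1.5).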
> Some people talk of centering the dichotomous variable as well as the continuous
> moderator variable – I don't understand this, and did not think it needed
> doing?
>
No, it doesn't. Again, it won't hurt if you do. In fact, you don't
need to pick 0, 1, you can pick any two values you like: -53 and 12.2,
or 23 and 56,454. The p-values for d and for the interaction will be
the same, and so will the fit of the model; it's the interpretation
that's different, because the parameter for c is always the slope at
the point where d equals zero. (We choose 0, 1 because we have to
multiply by these two values to interpret, and you don't get any
easier than multiplying by 0 and 1).
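If you want to check this for yourself, here is a rough sketch in plain Python that fits the same made-up data twice, once with 0/1 codes and once with -53/12.2. The data, the `fit` helper and the variable names are all invented for illustration. It compares the t-statistics (which determine the p-values) and the fitted values.

```python
import math


def fit(X, y):
    """OLS by inverting X'X; returns (coefficients, t-statistics, fitted values)."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | I]; Gauss-Jordan turns the right half into (X'X)^-1.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [float(i == j) for j in range(k)] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    inv = [row[k:] for row in A]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    fitted = [sum(bj * xj for bj, xj in zip(b, r)) for r in X]
    s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - k)
    t = [b[i] / math.sqrt(s2 * inv[i][i]) for i in range(k)]
    return b, t, fitted


# Made-up data: first four cases are control, last four are treatment.
treated = [0, 0, 0, 0, 1, 1, 1, 1]
c = [-1.5, -0.5, 0.5, 1.5, -1.5, -0.5, 0.5, 1.5]
y = [-1.7, -0.2, 1.6, 4.3, -3.6, 1.9, 6.7, 11.0]


def design(code_control, code_treatment):
    """Rows of [1, d, c, d*c] with the chosen codes for the two groups."""
    codes = [code_treatment if g else code_control for g in treated]
    return [[1.0, d, ci, d * ci] for d, ci in zip(codes, c)]


b01, t01, fitted01 = fit(design(0.0, 1.0), y)
bab, tab, fittedab = fit(design(-53.0, 12.2), y)

# Parameter order is [intercept, d, c, dc].
print("t for d :", round(t01[1], 4), round(tab[1], 4))
print("t for dc:", round(t01[3], 4), round(tab[3], 4))
print("t for c :", round(t01[2], 4), round(tab[2], 4))
```

The t-statistics for d and dc should agree across the two codings, and the fitted values should be identical. The dc parameter itself is just rescaled by the gap between the codes (65.2 here). The line for c is printed as well because c is the slope where the code for d is zero, so it describes a different point under each coding.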
> Am I right to think you should use the centered version of the continuous
> moderator variable to test for its main effect as well as using it to generate
> the interaction term?
>
Doesn't matter. You can if you want. The reason to do it is that it
alters the interpretation of the d parameter: with c centred, d is the
difference between the groups at the mean of c, rather than at c = 0.
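A quick sketch of what centring does to the d parameter, in plain Python. The data are invented and noise-free (control y = 1 + 2c, treatment y = 4 + 5c, with c running from 2 to 8), and `ols` and `fit` are hypothetical helpers written for this example only.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Made-up moderator scores, the same four values in each group (mean 5).
c_raw = [2.0, 4.0, 6.0, 8.0]
d = [0.0] * 4 + [1.0] * 4                      # 0 = control, 1 = treatment
c = c_raw + c_raw
y = [1.0 + 2.0 * ci for ci in c_raw] + [4.0 + 5.0 * ci for ci in c_raw]

c_mean = sum(c) / len(c)
c_centred = [ci - c_mean for ci in c]


def fit(moderator):
    """Fit y on [1, d, moderator, d*moderator]."""
    X = [[1.0, di, mi, di * mi] for di, mi in zip(d, moderator)]
    return ols(X, y)


b_raw = fit(c)              # d parameter = group difference at c = 0
b_centred = fit(c_centred)  # d parameter = group difference at the mean of c

# Parameter order is [intercept, d, c, dc].
print("raw    :", [round(b, 6) for b in b_raw])
print("centred:", [round(b, 6) for b in b_centred])
```

In both fits the c and dc parameters are 2 and 3. Only the intercept and the d parameter move: d is 3 with the raw moderator (the group difference at c = 0, which is outside the data here) and 18 with the centred one (the group difference at the mean of c).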
Jeremy
--
Jeremy Miles
Learning statistics blog: www.jeremymiles.co.uk/learningstats
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com