Thank you very much to all those who offered suggestions in response to my posting 'Evaluation of interactive effects when main effects are not significant'. I have not been able to reply to all of you individually but the response was very encouraging and your contributions were appreciated. A few individuals expressed an interest in having your suggestions circulated. I have therefore endeavoured to provide a summary below. I have also included my original message (with a clarification included on what I meant by significant).
I shall try to read these suggestions collectively as each has a special contribution to make to the discussion. Please feel free to get back to me if you have any further relevant comments to offer.
Best wishes
Margaret
***********************************************************************
Original message:
Dear Statisticians
When building a multivariate model to test for associations between an outcome variable and multiple factors, I tend to avoid testing for a significant interactive effect between two factors if I have already found that the main effects for the corresponding factors are not statistically significant. [I SHOULD ADD HERE THAT WHEN PROCEEDING IN THIS WAY, I USE A VERY HIGH SIGNIFICANCE LEVEL TO TEST FOR MAIN EFFECTS AT THIS SCREENING STAGE.]
Nevertheless, I am conscious that there is some lack of consensus amongst statisticians as to whether this is an appropriate approach to take. I would be interested therefore to receive individual views (together with explanations) as to whether or not the above approach is justifiable.
The above issue is strongly related to the question of what to conclude if an interactive effect proves to be significant in a multivariate analysis when neither of the main effects were significant. I would therefore very much welcome some suggestions here too!
Many thanks
Best wishes
Margaret
Response 1
===========
I could easily see a reason to test for interactive effects even when
main effects are not significant.
Take for example motor insurance and a model that tries to predict who
will have an accident.
Consider an extreme example to make a point.
factor 1: two groups of cars, one with the brake pedal at a normal
distance from the driving seat and another designed for people with
long legs.
factor 2: height - small and tall - splitting the population 50%/50%
If we further assume that cars are dished out randomly between tall and
small people, then of course there will be lots of accidents; however,
you will spot nothing when just looking at the main factors. For
example, half the small people crash due to being in tall people's cars
and vice versa.
Likewise, half of each lot of cars crash due to being driven by people
with the wrong length legs!
However, the interactive effect will of course tell everything!
Slightly exaggerated but I could imagine this in real life too.
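To make the arithmetic of this example concrete, here is a minimal Python sketch. The crash probabilities (0.1 for a driver in a well-fitting car, 0.5 for a mismatch) are invented purely for illustration, and equal numbers of drivers in each car/height combination are assumed, in line with the random allocation described above.

```python
# Hypothetical crash probabilities for each (car, driver height) combination.
# The values 0.1 (good fit) and 0.5 (poor fit) are illustrative assumptions.
crash_rate = {
    ("normal_car", "small"): 0.1,
    ("normal_car", "tall"): 0.5,
    ("long_leg_car", "small"): 0.5,
    ("long_leg_car", "tall"): 0.1,
}

def marginal_rate(factor_index, level):
    """Average crash rate for one level of one factor, assuming equal cell sizes."""
    rates = [r for key, r in crash_rate.items() if key[factor_index] == level]
    return sum(rates) / len(rates)

car_rates = {car: marginal_rate(0, car) for car in ("normal_car", "long_leg_car")}
height_rates = {h: marginal_rate(1, h) for h in ("small", "tall")}

# Interaction contrast: the height effect in normal cars minus
# the height effect in long-leg cars.
interaction = (
    (crash_rate[("normal_car", "tall")] - crash_rate[("normal_car", "small")])
    - (crash_rate[("long_leg_car", "tall")] - crash_rate[("long_leg_car", "small")])
)

print(car_rates)
print(height_rates)
print(interaction)
```

Under these assumed rates both car types and both height groups show the same marginal crash rate, so neither main effect is visible, while the interaction contrast is large.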
I'd be interested to hear what field you are using these models in.
Response 2
==========
Here's an example that convinced me...
A characteristic is measured at baseline; subjects are then randomized
to treatment or control and subsequently the characteristic is measured
again at a second timepoint. It would not be expected that there would
be a significant difference between the groups at baseline, nor between
the first and second timepoints for the control subjects. But if the
treatment has an effect on the measured characteristic, then (assuming
the sample size is large enough) it would be expected to find a
significant treatment-timepoint interaction.
Interestingly, this is a study design that is often found, but the old
adage about not testing for interactions if the main effects are not
significant still prevails.
=============================================================
Response 3
===========
A possible approach to investigate here is the method of multi-factor
dimensionality reduction, which has its own web page at
http://www.multifactordimensionalityreduction.org/
and seems to be getting increasingly popular in the genetics sector
(rightly or wrongly).
_____________________________________________________________________
Response 4
===========
As you probably know, fitting an interaction without a main effect
violates Nelder's marginality principle.
1. Nelder JA. A reformulation of linear models. Journal of the Royal
Statistical Society A 1977;140:48-77.
In commonsense terms I would say that one should not confuse importance
with significance. A main effect may be relatively small but it is
difficult to imagine it could be exactly zero if an interaction is
present.
Either leaving the main effect in will make little difference to
inference about the interaction, in which case what's the problem, or it
will, in which case it is unwise to leave it out.
You may find the attached of interest. See in particular section 2.2.
[The reference for the attached article is:
Senn, S. (2004), “Added values: controversies concerning randomization and additivity”, Statistics in Medicine, 23: 3729–3753.]
Response 5
==========
As a statistician with ample experience in psychology and medicine,
my advice is to first test interaction, and only if interaction is
absent, have a look at main effects. For instance, in comparing
psychotherapy with pharmacotherapy for treating depression, it may turn
out that averaged across severely and mildly depressed patients, there
is no therapy effect (main effect), whereas psycho is better than pharma
for mild depression and worse than pharma for severe depression. Main
effects obscure such clinically important interactions, although I
concede that they may be rare.
In unbalanced designs or with other coding schemes than -1,+1 (e.g. if
we use regression with (0,1) coding of predictors), things are even
worse: the p-values of the main effects are then arbitrary (i.e.
dependent on the coding used) as long as the interaction term is in the
model. In contrast, the p of the interaction term itself is invariant
across coding schemes as long as both main effects are included in the
model.
You may easily verify this on some dataset, by using (0,1) coding and
(-1,+1) coding and comparing p-values.
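The check suggested above can be sketched in Python without any statistics package. The sketch below fits the saturated 2x2 model to a set of invented cell means under both codings by solving the four linear equations exactly. It compares coefficients only, not p-values; the cell means and the helper names `solve` and `saturated_fit` are assumptions made for this illustration.

```python
from fractions import Fraction

# Hypothetical cell means for a 2x2 design, keyed by (level of A, level of B).
cell_means = {(0, 0): 10, (0, 1): 14, (1, 0): 13, (1, 1): 11}

def solve(matrix, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def saturated_fit(code):
    """Coefficients (intercept, A, B, AxB) of the saturated model under a coding."""
    cells = sorted(cell_means)
    design = [[1, code[a], code[b], code[a] * code[b]] for a, b in cells]
    return solve(design, [cell_means[c] for c in cells])

dummy = saturated_fit({0: 0, 1: 1})    # (0,1) coding
effect = saturated_fit({0: -1, 1: 1})  # (-1,+1) coding

print("(0,1) coding  :", dummy)
print("(-1,+1) coding:", effect)
```

With these assumed means the coefficient for A is 3 under (0,1) coding but 0 under (-1,+1) coding, so any test of that coefficient answers a different question in each case. The interaction coefficient differs between the two codings only by a constant factor of 4; rescaling a coefficient in this way rescales its standard error by the same factor, which is why its test is unaffected.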
(continued) P.S.
I realized somewhat late that your context may be different from mine.
With multiple factors in a factorial design, all of which are of
interest, I guess you may often be forced to start with main effects,
except for a few interactions based on prior theory. In that case,
testing for interactions only between those main effects which were
significant makes sense in order to prevent highly complicated models.
In our context we typically have one factor of interest, e.g.
treatment, and a series of 5 up to 10 covariates included for increasing
power or testing interaction with treatment. In that case, we usually
start with a model including all treatment by covariate effects, but not
covariate by covariate effects, and then reduce the model by deleting
non-significant interactions.
So it depends on the number of factors of interest whether starting
with main effects only is better, or starting with main and (two-way)
interactions, I guess.
Response 6
==========
I was unaware of any lack of consensus. It is quite possible for two
variables (x1, x2) to be unrelated to a third (Y), but still have their
interaction (x1*x2) be related to Y. Therefore, you cannot judge the
presence or absence of an interaction based on the model without
interactions. I have attached SAS code and results which demonstrate
this. Also, this is a multivariable analysis, not a multivariate one;
"multivariate" refers to multiple dependent variables.
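The SAS attachment mentioned above is not reproduced in this summary. As a stand-in, the following Python sketch illustrates the same point with invented, noise-free data: x1 and x2 are laid out on a grid symmetric about zero, and Y is assumed to depend only on their product. The data and the `pearson` helper are assumptions made for illustration and are not the respondent's code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: x1 and x2 each take the values -2..2 on a full grid,
# and the outcome depends only on their product (no noise, for clarity).
levels = [-2, -1, 0, 1, 2]
x1 = [a for a in levels for b in levels]
x2 = [b for a in levels for b in levels]
product = [a * b for a, b in zip(x1, x2)]
y = [3 + 2 * p for p in product]

r_x1 = pearson(x1, y)            # correlation of Y with x1 alone
r_x2 = pearson(x2, y)            # correlation of Y with x2 alone
r_product = pearson(product, y)  # correlation of Y with x1*x2

print(r_x1, r_x2, r_product)
```

In this constructed case Y is uncorrelated with x1 and with x2 taken singly, yet perfectly correlated with their product, so a model without the interaction term would show nothing.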
Response 7
==========
It is generally accepted practice not to include an interaction in a
model unless the main effects are also present. However, that is not to
say that the main effects need to be significant. If you think that an
interaction is important, then if it is significant, it should be
included in the model.
By my reasoning, if you do this, then the main effects should be
included also, whether or not they are significant.
At first sight, this might seem strange but it is just what we do with
the mean and the main effects. If we think that main effects are
important and they are also significant, then we usually include a mean
in the model irrespective of whether or not it is significantly
different from zero. The reason that we do not usually have to ponder
over this is that statistical software usually fits a mean by default,
whatever other terms are specified.
Response 8
==========
1) Consider a graphical response surface, a 2-factor, 2-level one
for simplicity, in which the effect of a at b- is positive, while
the effect of a at b+ is negative. The overall main effect of a
will be near nil - not significant. Likely enough, the same holds
for b. I have run across real data that does this, and the graphical
display is about the only way I, or the client and students, could
understand it.
2) Moral: it is possible that main effects will be non-significant
while interactions are significant.
3) Geo. Box, verbally & I think in BH^2, says that if you have a
significant interaction, then you _must_ retain the main effect terms
in the model, for mathematical reasons. I've seen a case where
dropping the main effects caused weirdness, but can't recall the
exact conditions at the moment. So I take George's advice.
4) If I were doing what you're doing, I would let the significance of
the interaction drive the choice of retention, not the other way
around.
Response 9
==========
I cannot understand why there is a lack of consensus.
Consider a 2x2 factorial (AxB) and the four treatment means
A1B1, A1B2, A2B1 and A2B2. A set of coefficients defining the contrasts
of means equivalent to
o the main effect for factor A would be (-1 -1 1 1)
o the main effect for factor B would be (-1 1 -1 1) and
o the AxB interaction would be (1 -1 -1 1)
So if A1B1 = A2B2 = x1, say, and A1B2 = A2B1 = x2, say, then
o the main effect for A is A2B1 + A2B2 - A1B1 - A1B2 = x2 + x1 - x1 - x2 = 0
o the main effect for B is A1B2 + A2B2 - A1B1 - A2B1 = x2 + x1 - x1 - x2 = 0
o the interaction effect is A1B1 + A2B2 - A1B2 - A2B1 = x1 + x1 - x2 - x2 = 2 * (x1 - x2)
A simple plot of this example may help illustrate the point.
So you can't assume that a null main effects model is sufficient.
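The contrast arithmetic above can be checked numerically. This is a minimal Python sketch in which x1 = 5 and x2 = 3 are arbitrary illustrative values (any two distinct numbers would do) and the cell means are listed in the order A1B1, A1B2, A2B1, A2B2.

```python
# Cell means in the order A1B1, A1B2, A2B1, A2B2.
# x1 and x2 are arbitrary illustrative values, not taken from any dataset.
x1, x2 = 5.0, 3.0
means = [x1, x2, x2, x1]

# Contrast coefficients as given in the text.
contrasts = {
    "A":   [-1, -1,  1, 1],
    "B":   [-1,  1, -1, 1],
    "AxB": [ 1, -1, -1, 1],
}

def apply_contrast(coefficients, values):
    """Weighted sum of the cell means using the contrast coefficients."""
    return sum(c * v for c, v in zip(coefficients, values))

effects = {name: apply_contrast(c, means) for name, c in contrasts.items()}
print(effects)
```

The A and B contrasts come out at zero, while the AxB contrast equals 2 * (x1 - x2), which is 4 for the values assumed here.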
Response 10
===========
The decision about what effects to look for in a model depends on many aspects, and it would not be sensible to make simple rules for all occasions. Knowledge of the underlying science will often guide what is sensible. If you are modelling a well-understood mechanism in terms of factors whose association with the outcome is generally known, there is little need to look at interactions between factors whose main effects prove unimportant unless the interaction is expected. This is the general approach in confirmatory clinical trials, where it is unusual even to see analysis of interactions between factors that are statistically significant, unless one is the treatment factor. But if you have no expectation of the associations, it may well make sense to look at all interactions, as is generally done in multi-factor agricultural experiments in novel research.
You need to bear in mind that statistical significance is of secondary importance. The size of the effects, and their relevance to the underlying science, should always come first.
It clearly is possible that an interaction can be important (scientifically and statistically) when one or both main effects appear not to be. But the same can be said of three-factor interactions in respect of the constituent two-factor interactions, and of four-factor interactions, and so on.
The second question is easier to answer. If an interaction is statistically significant, then the question of whether the main effects are significant is irrelevant. You still need to look at the scientific importance of the effects, which may be summarized in terms of main effects if the interaction effect is scientifically small, or in terms of a two-way effect. But the interaction test has answered the secondary question about the importance of both the factors from a statistical point of view.
Response 11
===========
…it's not inconceivable that an interaction is significant even if
the main effects of its factors are not. Two caveats, however:
* the appropriate test for whether the interaction significantly
improves model fit is the test for the combined main effects of A and B
as well as the AxB interaction, not just the interaction term
* I'm happy to try this where there are not many factors/explanatories
to be tested; if I were screening a lot of explanatory variables I
might restrict myself to just looking at main effects initially, in the
interests of sanity. Although an X-shaped interaction which is
significant without showing up in the main effects is conceivable (is
it sensible in the case you're asking about?), more often I find
interactions where there is some indication of an effect (maybe not
significant, but with comparatively low p-values) in the main effects
of one or both of the explanatories. So I guess if you are screening
lots of explanatories for closer analysis, don't set the threshold for
acceptance too high.
Response 12
===========
Finding neither main effect to be 'significant' has no bearing on the interaction.
Cell   Mean
A1B1   8
A1B2   6
A2B1   6
A2B2   8
with a suitable sample size and residual variance shows why.
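To put numbers on "a suitable sample size and residual variance", here is a minimal Python sketch for a balanced 2x2 design using the cell means from the table. The 10 observations per cell and the residual mean square of 4 are assumptions chosen for illustration; they do not come from the original message, and the residual mean square is treated as given rather than estimated from data.

```python
# Cell means from the table above, keyed by (level of A, level of B).
means = {(1, 1): 8, (1, 2): 6, (2, 1): 6, (2, 2): 8}

def sums_of_squares(cell_means, n_per_cell):
    """Sums of squares (1 df each) for A, B and AxB in a balanced 2x2 design."""
    m11, m12 = cell_means[(1, 1)], cell_means[(1, 2)]
    m21, m22 = cell_means[(2, 1)], cell_means[(2, 2)]
    contrast = {
        "A": (m21 + m22) - (m11 + m12),
        "B": (m12 + m22) - (m11 + m21),
        "AxB": (m11 + m22) - (m12 + m21),
    }
    # For a +/-1 contrast over four cells of size n, SS = n * contrast^2 / 4.
    return {term: n_per_cell * value ** 2 / 4 for term, value in contrast.items()}

def f_statistics(cell_means, n_per_cell, residual_variance):
    """F ratio of each 1-df term against an assumed residual mean square."""
    ss = sums_of_squares(cell_means, n_per_cell)
    return {term: value / residual_variance for term, value in ss.items()}

# Assumed values: 10 observations per cell and a residual mean square of 4.
f_values = f_statistics(means, n_per_cell=10, residual_variance=4.0)
print(f_values)
```

Under these assumptions the F ratios for A and B are exactly zero, while the F ratio for AxB is 10 on 1 and 36 degrees of freedom, well above the usual 5% critical value of roughly 4.1.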
Excluding main effects ('significant' or not) when you have the interaction in the model ('significant' or not) violates marginality. That is seldom a good idea.
Response 13
===========
I sense you may have already had some replies, possibly along the same
lines as this - but I was taught that main effects are not interpretable
unless the interaction is non-significant - so, if there is reason to
look / test for an interaction, this should be done first. I have
encountered many examples over the years where the interaction was
significant, and one or both of the main effects were also significant -
just as I have also found many situations where the interaction was
significant but neither main effect was significant. So I would wish to
argue that you should reverse your policy.
"Pound to a penny" that you will get replies saying the exact opposite.
That is why statistics can at times be such an interesting philosophical
subject.