JiscMail
Email discussion lists for the UK Education and Research communities

ALLSTAT Archives
allstat@JISCMAIL.AC.UK

Subject:

regression Responses

From:

"Bruenning, Jason" <[log in to unmask]>

Reply-To:

Bruenning, Jason

Date:

Wed, 29 Sep 1999 09:04:17 -0500

Content-Type:

text/plain

Parts/Attachments:

text/plain (361 lines)

Sorry about the attachment - responses are included below...

Thanks to all who replied to my posting... most replies offered reasons as
to why the main effects must be included in the model if the interaction
term is to be kept in...

I have attached a text file containing all the replies for those who are
interested (there seemed to be many)... My original posting is also
below...

Thanks, again. Jason


Hello All,

I have a question regarding interaction terms in multiple regression using
happenstance data.

I have created interaction columns by multiplying two independent factors
together. When I run the regression, the interaction effect comes out
significant (p-value .002), but neither of the main factors does.

My question is whether I must include the main effect terms if I am going
to keep the interaction term in the model.

Example:

Factor A:  p-value .400
Factor B:  p-value .062
Factor AB: p-value .002

Do I need to keep factor A in the regression model?
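A quick way to see how this pattern can arise is to simulate it. The sketch below uses entirely made-up data (the coefficients, noise level, and sample size are hypothetical, not taken from Jason's data set): the response depends only on the product term, yet the full model is fitted.

```python
import numpy as np

# Hypothetical simulation: y depends only on the product A*B, yet we fit
# the full model y = b0 + b1*A + b2*B + b3*A*B by ordinary least squares.
rng = np.random.default_rng(0)
n = 200
A = rng.normal(size=n)
B = rng.normal(size=n)
y = 0.5 * A * B + rng.normal(scale=0.5, size=n)  # no main effects by construction

X = np.column_stack([np.ones(n), A, B, A * B])   # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["b0", "b1 (A)", "b2 (B)", "b3 (AB)"], beta.round(3))))
```

Here b3 lands close to its true value of 0.5 while b1 and b2 hover near zero, mirroring a significant interaction alongside non-significant main effects.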






Hi Jason ...

This is a question I've long been interested in, and I have never found a
satisfactory answer. Everyone says that if you keep the interaction term
in the model you must always also keep the main effects, but I've never
had a really satisfactory explanation as to why.

Here's an example where there would seem to be good reason *not* to keep
the main effects ...

A two by two experiment: Subjects are divided into two groups and
measured (with respect to some outcome of interest) at baseline. One
group is then given some treatment, the other not (control group), and all
subjects are measured again (time 2).

The model is:

Outcome = overall mean + group effect + time effect + group-time
interaction + error

Clearly (by the fact of the randomization) there can be no group effect
nor time effect. The only effect can be a group-time interaction. That
is, the only effect can be on those subjects in the treatment group and
then only at time 2. So why must the main group and time effects be kept
in the model?

Please send copies of any other replies you get.

Andy Dunning

--------------------------------------------------------------------------
Andrew J. Dunning
Department of Biostatistics
University of Washington
--------------------------------------------------------------------------
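Andy's two-by-two example can be simulated to see where the treatment effect lands. This is a sketch with invented numbers (group sizes, effect size, and noise levels are all hypothetical): with dummy coding, the treatment shows up only in the group-by-time interaction.

```python
import numpy as np

# Randomized pre/post design: treatment affects only the treated group at
# time 2, so with dummy coding the effect appears in the interaction term.
rng = np.random.default_rng(3)
n = 400
group = np.repeat([0, 1], n // 2)                # 0 = control, 1 = treated
base = rng.normal(loc=50.0, scale=5.0, size=n)   # subject-level baseline
y0 = base + rng.normal(scale=1.0, size=n)                 # time 1 (baseline)
y1 = base + 3.0 * group + rng.normal(scale=1.0, size=n)   # time 2

# Long form: outcome ~ group + time + group:time
y = np.concatenate([y0, y1])
g = np.concatenate([group, group])
t = np.concatenate([np.zeros(n), np.ones(n)])
X = np.column_stack([np.ones(2 * n), g, t, g * t])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("group:", beta[1].round(2), "time:", beta[2].round(2),
      "interaction:", beta[3].round(2))
```

The group and time coefficients come out near zero while the interaction recovers the 3.0 treatment effect; keeping the main-effect terms in the fit is what gives the interaction coefficient this clean difference-in-differences interpretation.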


The general answer is yes, you have to have main factors in a model which
includes their interactions. The significant interaction term tells you
that a model with interactions is in some sense better than one with just
main effects. What you need to do is to understand the structure of the
interaction, so as to find out how differences in one factor affect
differences in the other. This could have any sort of pattern. You need to
examine the two-way tables of means, and their standard errors, to find out
what is going on in your particular data set.

Dr Brian G Miller
Head of Statistics,
Institute of Occupational Medicine
8 Roxburgh Place, Edinburgh EH8 9SU, UK
Tel: +44 (0)131 667 5131
Fax: +44 (0)131 667 0136
e-mail [log in to unmask]

Dear Jason,
in general, when you include an interaction of n-th order in a multiple
regression model you MUST include all the interaction terms of lower order
and all the main effects involved in the interaction.
In your case you are dealing with a "simple interaction" (first order)
involving just two factors, so both should be included.

The reason is very simple:

Your model without an interaction term is the following:

Y = b0 + b1*X1 + b2*X2

With such a model you estimate the independent "weight" of each factor Xi
in determining the value of Y; the underlying assumption is the absence of
interaction (the weight of X1 is the same at each level of X2).

You can test this assumption by including in the model an interaction term
X3 = X1*X2, such that:

Y = b0 + b1*X1 + b2*X2 + b3*X1*X2

Such a model can also be written in a different manner.

For example, suppose we analyse the effect of X1 at each level (k=2 for
simplicity) of X2:

If X2=0:  Y = b0 + b1*X1

If X2=1:  Y = b0 + b1*X1 + b2 + b3*X1
            = (b0 + b2) + (b1 + b3)*X1

Take a look at b1 :
it is, now, the regression coefficient for X1 just in a subgroup (X2=0);
in the other subgroup the regression coefficient for X1 is b1+b3.
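This subgroup algebra is easy to verify numerically. A minimal sketch with made-up coefficients (b1 = 0.2, b3 = 0.8, so the X2=1 slope should be 1.0):

```python
import numpy as np

# Verify: with a binary X2, the slope of Y on X1 is b1 in the X2=0
# subgroup and b1+b3 in the X2=1 subgroup. All numbers below are invented.
rng = np.random.default_rng(1)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.integers(0, 2, size=n)                 # binary factor
b0, b1, b2, b3 = 1.0, 0.2, -0.5, 0.8
y = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + rng.normal(scale=0.1, size=n)

def slope(x, yv):
    """Least-squares slope of yv on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1]

s0 = slope(x1[x2 == 0], y[x2 == 0])   # should be close to b1 = 0.2
s1 = slope(x1[x2 == 1], y[x2 == 1])   # should be close to b1 + b3 = 1.0
print(round(s0, 2), round(s1, 2))
```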

In real terms:
- your regression coefficients suggest the presence of interaction between
X1 and X2;
- probably X1 (your factor A) is clearly not significant in the absence of
factor B (that is, b1 is not significantly different from 0), but it
becomes significant when factor B is present (b3 strongly significant and
b1+b3 strongly different from b1).


In conclusion:
including an interaction term in a regression model implicitly tests the
assumption that the effects of the main factors are additive; if the
interaction reaches significance, the assumption does not hold and the
overall effect is not simply the sum of the separate main effects.


-----Original Message-----
From: Bruenning, Jason [SMTP:[log in to unmask]]
Sent: Tuesday, 28 September 1999, 23:17
To: [log in to unmask]
Subject: Multiple Regression Interaction Terms


I think the essence of the problem is in the interpretation of the
model. If the interaction term is significant, then you will have to
focus your interpretation on the interaction between the two factors.
The main effects, be they significant or not, should become of secondary
interest. Perhaps explain how the relationship among the levels of
factor B changes at different levels of A.

Hope it helps.

Edmond.


Bruenning, Jason wrote:

> Hello All,
>
> I have a question regarding interaction terms in multiple regression
> using happenstance data.

The fact that you mention happenstance data tells me you already know
the hazards of it. Keep both eyes open :)

> I have created interaction columns by multiplying two independent
> factors together. When I run the regression, the interaction effect
> comes out significant (p-value .002) but both of the main factors were
> not.
>
> My question is if I must include the main effect terms if I am going
> to keep the interaction term in the model?

Yes. Strange things happen if you don't. The fit of the total model is
improved, so go with it.

> Example:
>
> Factor A:  p-value .400
> Factor B:  p-value .062
> Factor AB: p-value .002
>
> Do I need to keep factor A in the regression model?

First, let's check that you are doing what you think you are.

a) Did you rescale factors A and B so that the product is not
biased? This can be compensated for in a good software package, but if
you do the multiplication yourself, maybe not. Rescale both A and B to A'
and B', such that the average of each is 0. Then multiply to form the
product, and add it to the model. Does it still help?

b) Are A and B orthogonal to one another? If not, do some graphing
of factor locations, such as a plot of A vs. B. Does the product AB
tend along an axis/direction? Be very cautious in your conclusions if
so. I've done 3-D plots of the points of A, B, and the product.
Fascinating!

c) Can you select data that are orthogonal in all the factors you
care about? Do the analysis with these data, and see if it will predict
the remaining data.

d) Try a 3-D plot of A, B and the response. If you can't see the AB
effect, hmmm.

e) If your data and conclusions withstand these checks, then I
predict that you will find in item (d) that the surface in the factor A
direction is sharply twisted: in front (B low) it will steeply
increase with A; in the back (B high) it will steeply decrease. Net,
the factor A effect is small, with a large p. But the interaction can
still be large, with a small p.
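The rescaling in check (a) can be sketched as follows. The locations and scales below are arbitrary, chosen only so the factors sit far from zero: when A does not straddle zero, the raw product A*B is strongly correlated with A itself, and centring first largely removes that artefact.

```python
import numpy as np

# When A sits far from zero, the raw product A*B is strongly correlated
# with A itself; centring A and B first largely removes that correlation.
rng = np.random.default_rng(2)
A = rng.normal(loc=10.0, scale=1.0, size=500)
B = rng.normal(loc=5.0, scale=1.0, size=500)

raw = np.corrcoef(A, A * B)[0, 1]               # correlation with raw product
Ac, Bc = A - A.mean(), B - B.mean()             # centre both factors at zero
centred = np.corrcoef(Ac, Ac * Bc)[0, 1]        # correlation after centring
print(f"corr(A, A*B) raw: {raw:.2f}  centred: {centred:.2f}")
```

This collinearity between the main effects and the raw product inflates standard errors, which is one mechanical reason main effects can look non-significant next to a significant interaction.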

Jay
--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (414) 634-9100
FAX: (414) 681-1133
email: [log in to unmask]
web: http://www.a2q.com

The A2Q Method (tm). What do you want to improve today?
 


Jason,

you have to include the main effect terms in the model if you want to
model the interaction effect. If not explicitly included, the main effects
will still be implicitly present in your model because they are included
in the interaction term.

I hope this answers your question,
Jerry

"Bruenning, Jason" wrote:

> Hello All,
>
> I have a question regarding interaction terms in multiple regression
> using happenstance data.
>
> I have created interaction columns by multiplying two independent
> factors together. When I run the regression, the interaction effect
> comes out significant (p-value .002) but both of the main factors were
> not.
>
> My question is if I must include the main effect terms if I am going
> to keep the interaction term in the model?
>
> Example:
>
> Factor A:  p-value .400
> Factor B:  p-value .062
> Factor AB: p-value .002
>
> Do I need to keep factor A in the regression model?



I think there are reasons both for keeping the single variable in and
for leaving it out. One reason for leaving it out is gaining degrees of
freedom for the error term in testing coefficients. However, consider the
response model y = b0 + b1*x1 + b2*x2 + b3*x1*x2, where you want to
eliminate variable x1: if, when x2 is set to 0, the response is not in
fact independent of x1, then x1 must stay in. (That is, when x2=0 you
assume the model to be y = b0, independent of x1, if the b1*x1 term is
removed.)
Other reasons for keeping x1 in include the possibility that x1,
x2 and x1*x2 are highly correlated, and thus omitting one creates an
"omitted predictor" problem (see a regression text for more information).
In general, deleting a predictor based on its p-value capitalizes on
chance, especially if you think the predictor is important. There could be
other reasons its p-value is high.
For some models, like response surface models, there is a physical
reason for keeping all terms of a particular order (and orders lower)
in the model. Models that do not include lower order terms are
nonstandard, but can exist. It depends on the circumstances of the
problem.

Laura Thompson
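The trade-off Laura describes can be made concrete with an extra-sum-of-squares F test, comparing the full hierarchical model to one that drops the x1 main effect while keeping the product term. The data and coefficients below are invented for illustration:

```python
import numpy as np

# Nested-model comparison: does dropping the x1 main effect (while keeping
# x1*x2) significantly worsen the fit? Coefficients here are hypothetical.
rng = np.random.default_rng(4)
n = 150
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 0.4 * x1 + 0.3 * x2 + 0.7 * x1 * x2 + rng.normal(size=n)

def rss(X, yv):
    """Residual sum of squares of the least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, yv, rcond=None)
    r = yv - X @ beta
    return float(r @ r)

full = np.column_stack([np.ones(n), x1, x2, x1 * x2])
reduced = np.column_stack([np.ones(n), x2, x1 * x2])      # x1 dropped

rss_full, rss_red = rss(full, y), rss(reduced, y)
F = (rss_red - rss_full) / (rss_full / (n - full.shape[1]))  # 1 df in numerator
print(f"F statistic for dropping x1: {F:.1f}")
```

A large F here says the reduced model fits markedly worse, which is evidence for keeping the main effect even when its individual p-value looks unimpressive.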

Hi Jason

From your figures, there seems to be some evidence for factor B, albeit
weak evidence. This makes me wonder what your modelling strategy
was: did you test the factors individually, or put them in the model all
at once, with factor A going last?
Apart from this, the answer would depend on whether you were doing
a straight linear multiple regression or log-linear modelling. In the
latter case I understand that it would be better to leave the main factors
in; in the former, to use only the significant factors.

Regards
Miland Joshi (Mr.)


I hope you'll post the replies to the list. My feeling is that in general
hierarchical models are preferred, i.e. those that include the
non-significant main effects, unless there is very good theoretical reason
that the main effects should be forced to zero. However, I expect there
are differing views (as on many things). Will the choice make a practical
difference in your situation?


Paul Marchant
Leeds Metropolitan Univ.

There is no contradiction in this. The problem is that you test every
definable hypothesis in town. Do not exclude any terms from the model.
Lack of significance in a test of a null hypothesis does not mean that
the tested parameter is equal to zero. The conclusions drawn from an
analysis with hypothesis testing are as per the textbooks only when a
single (one) hypothesis is tested.

Nick Longford
DMU Leicester

On Tue, 28 Sep 1999, Bruenning, Jason wrote:

Dear Jason!

Multiple regression models are usually considered to be hierarchical; that
means if an interaction effect is included, the main effects must be
included. A model containing only the interactions would not be valid in
this case.

I hope this helps
Peter





JiscMail is a Jisc service.