ALLSTAT Archives

allstat@JISCMAIL.AC.UK


Subject: replies to logistic model request
From: "Simon Williams" <[log in to unmask]>
Reply-To: Simon Williams
Date: Fri, 20 Aug 1999 09:12:06 PDT


Hello all,

A few weeks ago I posted a request re: logistic regression to the list and 
I've attached the responses below (apologies if I've left anyone out).

The responses include using survival analysis as opposed to logistic 
regression, suggestions to fine tune what the objective of the work is and 
using a multi-level model approach when undertaking logistic regression.

Having just visited the BMJ site (http://www.bmj.com) and read the paper by
Spiegelhalter et al., I think a predictive model using a Bayesian approach may
also prove useful, i.e. you already know whether someone has had a heart
attack; is that useful prior information?

Thanks to all those who replied.

Simon

Simon Williams
Research Fellow
Department of Anaesthesia
Level 7
Bristol Royal Infirmary
BS2 8HW
Tel. 0117 9283169
Fax. 0117 9282098

<<<<<Original request>>>>>>>>>>>

Hello all,

A colleague is investigating the possibility of
using a logistic regression model to predict whether a
patient will have a heart attack in the future. Some of the predictive
variables are age, sex, current cholesterol level, etc.

However, he would also like to include the number of previous heart
attacks as a predictive variable. Although I think it is right to
use such information, would there not be a problem with the fact that
the outcome and predictive variables are 'related'?

If you have any thoughts or suggestions for papers/books dealing with
this subject, I'd be very grateful if you could send them directly to me.
I will post a summary of the replies to the list at a later date.

Thanks,

Simon

<<<<<<<<<<<replies>>>>>>>>>>>>>>>>>>>>

From: Linda Hunt [log in to unmask]

It may be OK to use logistic regression if you are talking about the
risk over a specified period, say for example over the first year after
some surgical procedure...otherwise the risk will be related to the
length of time the patient is being followed.

Have you considered using survival techniques, eg Cox P-H? You could
maybe include a stratification by previous heart attacks (some/any,
number??), since the hazard rates may be different.
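
To illustrate the point about follow-up length, here is a minimal sketch. It assumes a constant hazard (an exponential time-to-event model), which is a simplification chosen only for illustration; the function name and the hazard value are hypothetical, not taken from any of the replies.

```python
import math


def risk_within(hazard_per_year: float, years: float) -> float:
    """Probability of at least one event within `years`.

    Assumes a constant hazard, so survival is exp(-hazard * time).
    """
    return 1.0 - math.exp(-hazard_per_year * years)


# Hypothetical hazard of 0.05 events per person-year.
for years in (1, 5, 10):
    print(years, round(risk_within(0.05, years), 3))
```

Under this assumption the same patient has a different "risk" depending on how long he or she is followed, which is why a binary outcome only makes sense for a fixed period.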


From: "D Wright" [log in to unmask]

As I understand it, you can include previous heart attack as a
covariate.  However, there may be problems of interpretation.  If the
effects of other covariates, for example cholesterol level, are
closely associated with previous heart attacks then the analysis may
show no cholesterol level effect.  This sort of issue arises in
survival data analysis with time dependent covariates.  A good
discussion is given in the book by Kalbfleisch and Prentice, The
Statistical Analysis of Failure Time Data, pages 124-126, under the
section on internal covariates.  If your aim is prediction then this
should not be of any importance and I would think previous
history of heart attack is a very useful predictor of future heart
attacks.


From: T R Harris [log in to unmask]

The answer may depend on whether the goal of the research is primarily
"predictive" or "analytical."

If the goal is predictive (i.e., simply to predict whether or not a
patient will have a future heart attack) then I see no problem at all in
the inclusion of previous heart attacks in the model.  In fact, it's
probably very desirable to include this variable if it does in fact add
substantially to the ability to predict correctly.

If the goal is analytical (to learn something about the causal mechanisms
relating age, cholesterol, etc., to heart attacks) then I would worry
about direct and indirect causal mechanisms.  For example, age may affect
future heart attacks directly and also indirectly through its effects on
other predictors in the regression model.  Do you want to know about the
direct effect or the indirect effect?  And "direct" and "indirect"  are
themselves relative to the selection of predictors in the regression model.
Thus you need to think carefully about the causal ordering among the
variables in the model (and maybe about some unmeasured variables as
well).  If you were doing linear regressions, you could look at path
analysis as the conceptual framework, or perhaps structural equation
models if you want to think explicitly about latent (unmeasured) variables
(but SEMs are probably overkill in your situation).  I don't know how
logistic regression changes the thinking, but I think the essential
concepts (direct and indirect causation, for example) are unchanged
although the mathematical details are no doubt different.  Sorry that I
don't have references at hand, but the topics I would look for are path
analysis, causal analysis, causal models, intervening variable, antecedent
variable, mediator, spurious correlation.  Earl Babbie, The Practice of
Social Research, may have an introductory discussion of the key issues
(using social science examples).


From: "Duncan Smith" [log in to unmask]

I'm not sure if I'm missing something, but aren't they bound to be 'related'
if the model is correct?  I don't see any obvious problem.  Maybe your concern
is over the possible near multicollinearity of the predictors.  I wouldn't worry
about that very much unless I was trying to get precise parameter estimates
(e.g. in econometric models).  You want a predictive model, so you can probably
live with it.

From: "Scott, Martin {TD-B~Mannheim}" [log in to unmask]

Hello Simon,

I don't think you have any problem with including the term as a variable in
the logistic regression modelling.  After all, if you didn't think any of the
variables were related to the dependent variable, then you wouldn't do any
modelling in the first place.  The number of previous heart attacks is
indeed a candidate to be a prognostic factor for future heart attacks.  I
would be tempted to include the variable as a simple binary (had previous
heart attack / no previous heart attack) response.
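
As a minimal sketch of the binary coding, consider a logistic model whose only covariate is the previous-heart-attack indicator. With a single binary covariate the model is saturated, so the maximum-likelihood estimates can be read off the 2x2 table without iteration. The function names and the counts below are hypothetical; a real model would also include age, sex, cholesterol and so on, and would need an iterative fit.

```python
import math


def fit_binary_logistic(events_prev, n_prev, events_no_prev, n_no_prev):
    """Fit logit(p) = b0 + b1 * x for a single binary covariate x.

    x = 1 means a previous heart attack, x = 0 means none.
    b0 is the log odds in the x = 0 group; b1 is the log odds ratio.
    """
    odds_no_prev = events_no_prev / (n_no_prev - events_no_prev)
    odds_prev = events_prev / (n_prev - events_prev)
    b0 = math.log(odds_no_prev)
    b1 = math.log(odds_prev / odds_no_prev)
    return b0, b1


def predict(b0, b1, x):
    """Predicted probability of a future heart attack for covariate value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical counts: 30 of 100 patients with a previous heart attack
# and 10 of 100 without one go on to have a heart attack.
b0, b1 = fit_binary_logistic(30, 100, 10, 100)
print("odds ratio:", math.exp(b1))
print("P(event | previous):", predict(b0, b1, 1))
print("P(event | none):", predict(b0, b1, 0))
```

The fitted probabilities reproduce the observed proportions in each group, and exp(b1) is the usual odds ratio for a previous heart attack.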

Hope that helps.

Martin


From: [log in to unmask]

Sounds suspiciously like you are not using statistical correlation as it
might be.  If previous heart attacks are input, and a future one is output,
you _want_ correlation.  The difficulty would be if current cholesterol level
was always high for previous HAs; that would be a covariate relation
you _don't_ want.

Besides, what makes you think that previous HAs predict future ones?
Must be some data there already.

Good luck,
Jay Warner, Principal Scientist





From: Chris Sutton [log in to unmask]

Simon,
          You might find Weinberg, C.R. 'Towards a clearer definition
of confounding', American Journal of Epidemiology, 1993, 1-8 useful
and the recent book Woodward, M,
'Epidemiology: study design and analysis', Chapman and Hall, 1999 has
a good section on the definition of confounders.  Hope this is useful

From: [log in to unmask]

Dear Simon,

Using previous heart attacks might look good at first glance but you would
have to define the seriousness of the previous attack.

I have come upon many individuals who, upon having an EKG taken, show that
there has been damage to the heart previously.  When questioned, they often
reply that they have never had a previous attack.  It seems that many heart
attacks are never identified as a heart attack at the time they occur.  If
you use 'previous damage' to the heart you can count it as a heart attack, but
you can't be sure if it represents one previous attack or a number of
previous attacks.


From: Paul Seed [log in to unmask]

A bit meaningless unless all patients are followed for a fixed time
following the measurement of predictive variables.
More common is to carry out a survival analysis using the time to event.
In either case, what is he to do about multiple heart attacks?
Surely not.  If only one event is recorded per subject, it is related to
previous MIs only in that past events predict future events.
Multiple event survival analysis is also possible, but this is a more
complicated issue.

From: Patrick McElduff [log in to unmask]

Simon,

I have done similar types of analyses. One of the purposes of logistic
regression is to describe the relationship that exists between the
independent variables and the dependent variable. The problem with your
analysis is not that the previous number of heart attacks is 'related' to
the risk of having another heart attack. Your problem is more likely to
result from the fact that your independent variables may be 'related', in
particular the number of previous heart attacks might be highly correlated
with age.
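
One quick check for this is a minimal sketch like the following, which computes the Pearson correlation between age and number of previous heart attacks. With only two predictors the variance inflation factor is 1 / (1 - r^2). The data values and function names are hypothetical, made up purely to show the calculation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """Variance inflation factor when the model has exactly two predictors."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical patients: age in years and number of previous heart attacks.
ages = [45, 52, 58, 63, 67, 71, 76, 80]
previous_attacks = [0, 0, 1, 0, 1, 2, 2, 3]

r = pearson(ages, previous_attacks)
print("correlation:", round(r, 3))
print("VIF:", round(vif_two_predictors(r), 3))
```

A VIF close to 1 suggests little inflation of the standard errors; large values indicate that the two coefficients will be hard to estimate separately, which matters more for interpretation than for prediction.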

A good book on this subject is "Applied logistic regression" by Hosmer and
Lemeshow.

regards Patrick

From: [log in to unmask]

Simon, I would have thought you expect the other variables like age to be related
to the outcome variable, otherwise you will not be able to fit a model that
includes them. I think what you're saying is that the number of strokes is in
some way the same measurement or a function thereof.  I would say you are
all right to proceed, although you might find the number of previous strokes
makes a large contribution to the explained variance.  You could also
consider fitting separate models for no previous, 1 previous, more than 2
previous strokes ... this might be more informative.

Just my two cents worth. Dave Collett (Univ of Reading) has a good book on
logistic regression, I think it is called Modelling Binary Data.

From: [log in to unmask]


In response to your allstat question I don't see why there should be a
problem with the explanatory and response variables being related. If they
weren't 'related' somehow, why would you use them as explanatory factors
????

Also I suspect that you will have very few subjects having 2, 3, 4 previous
heart attacks, so might want to categorise the variable into previous heart
attacks yes/no, or failing that none, one, more than one.
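
A minimal sketch of that categorisation, assuming the count of previous heart attacks is available as a non-negative integer per patient. The function name and the example counts are hypothetical.

```python
from collections import Counter


def categorise_previous(n_previous: int) -> str:
    """Collapse a count of previous heart attacks into three categories."""
    if n_previous < 0:
        raise ValueError("number of previous heart attacks cannot be negative")
    if n_previous == 0:
        return "none"
    if n_previous == 1:
        return "one"
    return "more than one"


# Hypothetical counts for ten patients.
counts = [0, 0, 1, 0, 2, 4, 1, 0, 3, 0]
table = Counter(categorise_previous(n) for n in counts)
print(table)
```

Tabulating the categories first shows whether the higher categories contain enough patients to estimate anything, or whether the simpler yes/no coding is the practical choice.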


From: Southworth Harry H [log in to unmask]

There are various models in existence that do this sort of thing.
Have a look at Hingorani, AD et al, BMJ, Vol 318 (1999).
They use "logistic regression", but I think it's a proportional odds
model that they use.

I've modelled the EAS risk tables (Wood, D et al, European Heart
Journal, 19, 1434-1503 (1998)) using a proportional odds model,
and it works rather well.

There are several models which are referred to as "The
Framingham Equation" which also model risk. See Wilson, WF
et al, Circulation, 97, 1837-1847 (1998), and Anderson, KM et
al, Circulation, 83, 356-362 (1991).

I suggest that in the logistic regression model you try taking
logs of all explanatory variables and see if this improves the fit.
You might also find that log(age)*log(age) is significant, as
well as the interaction between log(SBP) and log(age).
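
A minimal sketch of building those terms for one patient, assuming age in years, systolic blood pressure (SBP) in mmHg and cholesterol in mmol/l, all strictly positive. The function name, dictionary keys and example values are hypothetical; whether any of these terms improves the fit has to be checked on the actual data.

```python
import math


def log_features(age: float, sbp: float, cholesterol: float) -> dict:
    """Log-transformed covariates plus the two extra terms suggested above."""
    log_age = math.log(age)
    log_sbp = math.log(sbp)
    log_chol = math.log(cholesterol)
    return {
        "log_age": log_age,
        "log_sbp": log_sbp,
        "log_chol": log_chol,
        "log_age_sq": log_age * log_age,        # log(age)*log(age)
        "log_sbp_x_log_age": log_sbp * log_age,  # interaction term
    }


# Hypothetical patient.
features = log_features(age=60, sbp=140, cholesterol=6.2)
for name, value in features.items():
    print(name, round(value, 4))
```

Each dictionary would become one row of the design matrix passed to whatever logistic regression routine is being used.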

I have no experience of using number of previous heart attacks
as a predictor. People who have already had heart attacks, or
who have established heart disease or a family history of
heart disease are automatically assumed by medics to be at high
risk of suffering a future coronary event. I have not seen any
statistical evidence to back this up.

Another source of info you might like to look at will be the National
Cholesterol Education Programme (NCEP - I think that's what it
stands for). You should find them on the internet quite easily. I
don't have a proper reference for you.


From: Rita Campos [log in to unmask]

Dear Simon,

Your research seems to me to be ideally suited for some multi-level
modelling.  Try checking out the www page of the Institute of Education
(www.ioe.ac.uk).

The advantage of multilevel modelling for logistic regression is that you
can control for variables that may be multicollinear with your dependent
variable (i.e. prior heart attack and subsequent probability of another).
Hence, at level 1 you can control for the patient's prior state of health.

Although the work that I refer to at the IoE is not medical, it
nevertheless encounters the same related methodological obstacles (e.g. a
child's prior attainment at age 13 and their GCSE results.)

I did my master's dissertation on multilevel modelling and standard
logistic regression models for education, to compare the accuracy of the
estimates. If you are interested I can send you a copy.

