ALLSTAT Archives


allstat@JISCMAIL.AC.UK


Subject:

Summary: 120 subjects on 120 occasions

From:

"Dr. Hans-Christian Waldmann" <[log in to unmask]>

Reply-To:

Dr. Hans-Christian Waldmann

Date:

Tue, 17 Oct 2000 12:56:22 +0200




Dear List(s),

Last week I posted a request for help on how to analyse repeated
measurement data with rather unusual dimensions (N=120 subjects, each
giving a time series with T=120 occasions). To my pleasure, there were
as many as 12 replies, which are given below, preceded by the original
posting.

For those of you who don't want to read through all of these, here is
a short summary.


Roughly, responses/recommendations can be classified into 5 categories:


 a) Make use of all information available and put it into a form
    suitable for state-space models or vector-ARIMA analyses.

    Frankly: this seems to go beyond my capabilities, and I am
    inclined to take the penalty of reducing the data as proposed
    by other responders.


 b) Perform time series analyses for each subject and boil the
    data down to certain parameters. Read these into a secondary
    data set and merge it with the original one in order to
    preserve design variables like treat/control, sex and age.
    Finally, use these data for "standard" analyses to test
    hypotheses of homogeneity of subjects within groups or of
    differences across groups (implying some MANOVA-style model).

    Going to extremes, one could obtain a single parameter like a slope
    for each subject and perform univariate analyses with regard to
    higher stratum levels.

    Another response in this direction suggested fitting a spline
    model for each subject and using the spline components in subsequent
    ANOVA-style models. I understand that this could be done using
    proc transreg in SAS, but I am not sure whether this procedure does
    in fact account for the time dependency in the individual data that
    give rise to the spline.


 c) Reduce the repeated measurement frequency in the first place and
    then perform (M)ANOVA-style analyses with a time factor that (then)
    has a suitable number of levels. Test for time effects using standard
    contrasts such as polynomial decomposition or Helmert contrasts (when
    interest lies in the point in time at which responses cease to change
    any further).


 d) A particularly interesting response suggested identifying "change
    profiles" within time series and submitting these to further analyses
    such as permutation tests. Still, I am unclear about how to aggregate
    data in order to make best use of all subjects' data.


 e) General remarks and caveats: paying regard to sample size issues,
    looking for cyclicity in individual data that generalizes to the
    stratum, adjusting for cross-correlations in case of multi-variable
    outcome measures, and the complexity of assumptions required when
    analysing complex factorial designs involving a repeated measures
    factor.
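
As a rough illustration of category b), the following sketch reduces each
subject's series to an ordinary least-squares slope and then averages the
slopes within each design group. The data are simulated and the names
(ols_slope, mean_slope_by_group) are made up for illustration. The plain
OLS slope ignores serial correlation within a series, so this is a
starting point under simplifying assumptions, not a recommended analysis.

```python
import random
from statistics import mean

def ols_slope(y):
    """Least-squares slope of y against time points 0, 1, ..., len(y) - 1."""
    t = range(len(y))
    t_bar, y_bar = mean(t), mean(y)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    return sxy / sxx

def mean_slope_by_group(series_by_subject, group_of):
    """Stage 2: average the per-subject slopes within each design group."""
    slopes = {}
    for subject, y in series_by_subject.items():
        slopes.setdefault(group_of[subject], []).append(ols_slope(y))
    return {g: mean(v) for g, v in slopes.items()}

# Simulated example (hypothetical data): 6 subjects, 120 occasions each.
rng = random.Random(0)
true_slope = {"treat": -0.05, "control": 0.0}
group_of = {s: ("treat" if s < 3 else "control") for s in range(6)}
series_by_subject = {
    s: [10 + true_slope[group_of[s]] * t + rng.gauss(0, 1) for t in range(120)]
    for s in range(6)
}
group_means = mean_slope_by_group(series_by_subject, group_of)
```

In a real analysis the per-subject slopes would be merged with the design
variables and passed to an ANOVA or regression rather than just averaged.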



Again, thanks to all who took the time to help. I am committed to
participating in this kind of mutual assistance.

Hans C Waldmann



---------------------------------------------------------------------
Dr. Hans C Waldmann              
Methodology & Applied Statistics in Psychology & the Health Sciences

ZFRF / University of Bremen / Grazer Str 6 / 28359 Bremen / Germany 
[log in to unmask] / http://samson.fire.uni-bremen.de

friend of: AIX PERL ADABAS SAS TEX 
---------------------------------------------------------------------









Following:
             0) Original posting (request)
          1-12) replies




0)
------------------------------------------------------------------------


----- Original Message -----
From: "Dr. Hans-Christian Waldmann" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Thursday, October 12, 2000 2:18 PM
Subject: a model for time series (T=120) for N=120 persons ?


>
>
> Hello everybody,
>
> in one of the clinical projects we consult on data analysis, I am
> facing a problem I have not yet come across and that leaves me with no
> idea on how to proceed. The problem pertains to the dimension of
> the outcome data set. In a repeated measures design, let N be the
> number of people treated and T be the number of measurement occasions.
>
> I understand that N=1 (or _some_ more) and T=120 would make up a time
> series, and that I am supposed to fit ARIMA models or transfer functions.
> I could detect effects by structural breaks around the point of time of
> intervention, that is: performing intervention analyses as proposed in
> McDowall, McCleary, Meidinger and Hay, 1980, Interrupted time series
> analysis, or other books on how to analyse data from single subject
> designs.
> All right.
>
> I understand that N=120 (or any number more) and repeated measures like
> 2<=T<="the-smaller-the-better" would make up a dataset suitable for
> an ANOVA approach or mixed models using special covariance structures
> like SAS's proc mixed. I know how to do that.
> All right.
>
> I understand that for each of these variants there are some alternatives
> in statistical modeling (like non-parametric analyses etc.).
>
> Now, what am I supposed to do with data from a design giving a T=120
> time series for _each_ of 120 subjects ? There has been a controlled
> study where patients in three independent groups were asked to keep
> a diary on some outcome variables for ca. 4 months. There are some
> design variables like treat/control or sex and age that are expected
> to contribute systematically to variation between outcome measures.
> But this outcome measure apparently is a time series. I don't think
> I should perform an ANOVA-style analysis with a 120-level time factor.
> Pooling data and performing ARIMA/transfer-functions on a single time
> series of subjects' means for each point in time doesn't make sense
> either, assuming that subjects differ in both measurement level and
> covariance structure of their individual time series. I admit that
> I have no idea how to evaluate, say, an effect of treatment on this
> kind of outcome measure.
>
> Does anybody else have an idea? I promise to post a summary of
> responses to the list.
>
>
> Thanks in advance
>
> Hans-Christian Waldmann
>
>
>



1.
-----------------------------------------------------------------------

>From [log in to unmask] Thu Oct 12 15:34:49 2000


Hi

It may be easier in the long run to pose the time series in state space
form.  Then missing values are easy to deal with for a start and it can be
easier to model what goes on.

See papers by Durbin and Koopman, and software from Koopman (Ssfpack)

http://www.econ.vu.nl/koopman/ssfpack/

This does all the hard work to leave you to concentrate on modelling.  I
think there may be a book coming out soon from Oxford University Press

http://www.oup.co.uk/

Hope this helps.  I intend to have a go with this software when I can.

Robert West




2.
-----------------------------------------------------------------------

>From [log in to unmask] Thu Oct 12 16:25:09 2000
Status: RO

Dear Hans

The obvious way to proceed would be to analyse each of the 120 time 
series in some appropriate way, and then use the derived parameters 
in further analysis of the experiment overall.

In a simple example, where the derived parameter is, say, a slope 
from Linear Regression, the slope estimates can then be used as the 
response in an ANOVA or Multiple Regression analysis.  

Your time series can be de-constructed in a suitable way, perhaps 
using a breakpoint detection method, some ARIMA model parameter, or 
even a more complex method such as describing the time series curve 
using principal components.

Once you have the derived parameters, these can then be modelled.

You should end up with model/s that determine or predict the 
effect of each of the experimental factors such as 
treatment/control, age, sex, group type (in combination) on all the 
derived parameters.  This should, in turn, lead to some standard 
values of the derived parameters that occur  under particular 
combinations of the factor settings.

Of course the simpler the derived parameters the better, and one must 
take care of correlations between these when establishing levels of
uncertainty around estimates of the 'standard' values. 

Where the effect of factors on the parameters conflict in some way, 
if this is possible, joint optimisation methods can be utilised.

The modelling techniques will probably involve response surface 
analysis.

So there are three stages:

Time series analysis - to find parameters
Parameter modelling - to find factor effects and predictive equations
Joint optimisation? - to resolve prediction-effect conflicts

Regards
Dave Stewardson



3.
-----------------------------------------------------------------------


>From [log in to unmask] Thu Oct 12 17:10:35 2000

Hi,

This sort of thing occurs quite often in clinical trials when you have
diary data. The subject is on treatment for 12 weeks, say, and records
their lung function (say) every day just after they get up.

N=200 to 500 say and T=12*7=84

The crucial thing is to get the client (medic) to state what it is that
interests him. Then an easy approach is to do a two-stage analysis.
Within each individual, analyse the data to produce a single summary
value. This could be the mean, slope, maximum, minimum, area under the
curve (average), time when the value drops to 50% of baseline, etc. Which
summary statistic you choose is determined by what question the client
wants to ask of the data.

Then analyse these summary values across subjects at the higher stratum
level.

For regular data many of these analyses are special cases of mixed
models. (e.g. taking regression slope and analysing at higher level is
exactly the same as random-coefficient regression in mixed models when
the X values are the same for every subject.)
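
A few of the single-value summaries listed above can be sketched as
follows. This is only an illustration: the function name
summary_measures is made up, "baseline" is assumed to be the first
diary entry (and positive), and the days are assumed to be equally
spaced with no missing values.

```python
def summary_measures(y):
    """Single-value summaries of one subject's diary series y (one value per day)."""
    n = len(y)
    baseline = y[0]  # assumption: the first entry serves as baseline
    # Trapezoidal area under the curve with unit spacing between days.
    auc = sum((y[i] + y[i + 1]) / 2 for i in range(n - 1))
    # First day on which the value is at or below 50% of baseline (None if never).
    half_day = next((i for i, v in enumerate(y) if v <= 0.5 * baseline), None)
    return {
        "mean": sum(y) / n,
        "max": max(y),
        "min": min(y),
        "auc_average": auc / (n - 1),  # AUC divided by the length of the window
        "day_at_half_baseline": half_day,
    }

example = summary_measures([8.0, 7.0, 6.0, 4.0, 3.0, 3.0])
```

Whichever summary is chosen, the resulting one value per subject is what
gets analysed across subjects at the higher stratum level.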

But first of all - plot the data!

James.



4.
-----------------------------------------------------------------------

From: "Gaj Vidmar" <[log in to unmask]>

What I can propose is rather simple, so it may well be completely wrong
(especially as no true expert has posted anything on the topic so far), but
perhaps it will be of some use:

why not pool data for an individual over time-periods - say, months, or to
preserve more information, weeks? (Perhaps not by averaging, but - depending
on data characteristics - using median, geometric mean, or some fancy
M-estimator?)

- This will give you the possibility to conduct an ANOVA-type analysis -
mixed model with some "nonrepeated" factors (three fixed, if I get it right,
i.e., treat/control, sex and age, plus possibly others) and week (or
whatever time-period) as "repeated".
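
The pooling step could look roughly like this, here with the median over
7-day periods. The function name pool_by_period and the decision to drop
an incomplete final period are assumptions made for the sketch; the
input is placeholder data.

```python
from statistics import median

def pool_by_period(y, period=7):
    """Collapse a daily series into per-period medians; a short final period is dropped."""
    n_full = len(y) // period
    return [median(y[i * period:(i + 1) * period]) for i in range(n_full)]

daily = list(range(120))          # placeholder data: 120 daily values
weekly = pool_by_period(daily)    # 17 full weeks; the one remaining day is dropped
monthly = pool_by_period(daily, period=30)
```

With T=120 this leaves 17 weekly (or 4 monthly) levels for the repeated
factor, which is a far more manageable size for a mixed model.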

As emphasised in the introduction, this may be less than two cents.

Best regards,

Gaj Vidmar
Univ. of Ljubljana, Dept. of Psychology



5.
-----------------------------------------------------------------------

From: "Gaj Vidmar" <[log in to unmask]>

Dr. Waldmann,

there seems to be no word from professional statisticians yet, so here's an
addendum.

Namely, I have overlooked two important aspects of the study, which,
however, do not invalidate the basic idea of pooling individual data over
appropriate time-periods.

The first aspect is the three groups of patients. I'm not sure whether
they were formed on the basis of the (quote) design variables (in which case
there is one factor instead of the three nonrepeated ones), or they define
another factor (a random one, I guess, as opposed to the three fixed ones),
but the pooling approach is independent of this fact.

The same goes for the second aspect, i.e., that there were several measures
taken, not just one. Theoretically, MANOVA might thus be feasible instead of
several ANOVAs. But with such a complex model (say, one random plus three
fixed factors plus one repeated-measures factor) properly checking all the
various assumptions and interpreting all the results is rather ... Not to
mention that the analysis must be properly set up in the first place
(contrasts issues ...), as well as sample size issues ... At least, fully
understanding such an analysis is probably beyond the horizon of the
majority of the "consumers" in social/health sciences, to which you will
presumably have to present the findings. So if the outcome variables are not
too many and/or they are not too correlated, I believe they can be analysed
"one by one".

Awaiting judgement from the sci.stat.* community and wishing you all the
best with the research,

Gaj Vidmar





6.
-----------------------------------------------------------------------

From: MJ Ray <[log in to unmask]>

My own suggestion (mangled by a bad emailer) was to use vector time
series methods, but this could lead to a fairly large computation
problem without extra information.  I wasn't able to recommend a very
good specialist reference off the top of my head, though.

MJR



7.
-----------------------------------------------------------------------

From: Elliot Cramer <[log in to unmask]>


You haven't really given enough information, but here is a suggestion.  You
have three separate groups.  If they are not the treatment groups with
random assignment, anything else you do will be VERY dubious.  You could
use sex as a blocking factor and age as a covariate.  What is the purpose
of the 120 observations? You could construct a SMALL number of relevant
variables from these observations and do a MANOVA, for example linear,
quadratic and cubic trends if you are simply interested in what happens
over time.  You might also do a between groups analysis on the final time
or average time.  It's hard to say without knowing the details.

What you REALLY should do is consult a statistician about the specifics.
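
One way to obtain the linear, quadratic and cubic trend variables
mentioned above is to project each subject's series onto orthogonal
polynomial contrasts. The sketch below builds the contrasts by
Gram-Schmidt on powers of centred time; the function names are made up
for illustration, and equally spaced occasions without missing values
are assumed.

```python
def orthogonal_polynomials(n, degree=3):
    """Orthonormal polynomial contrasts over time points 0..n-1 (Gram-Schmidt on powers of t)."""
    basis = []
    const = [1.0] * n
    centre = (n - 1) / 2
    for d in range(1, degree + 1):
        v = [(t - centre) ** d for t in range(n)]
        # Remove the parts already explained by the constant and lower-order contrasts.
        for b in [const] + basis:
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        basis.append([vi / norm for vi in v])
    return basis

def trend_scores(y, degree=3):
    """Project one subject's series onto the linear, quadratic, cubic, ... contrasts."""
    return [sum(c * yi for c, yi in zip(contrast, y))
            for contrast in orthogonal_polynomials(len(y), degree)]

# A purely linear series: only the linear score should be clearly non-zero.
linear, quadratic, cubic = trend_scores([2.0 + 0.5 * t for t in range(120)])
```

The three scores per subject would then serve as the SMALL set of
dependent variables in the MANOVA.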



8.
-----------------------------------------------------------------------

From: [log in to unmask] (Magill, Brett)

I don't really know enough about time series to provide much advice.
However, I have seen methods by which a slope was calculated across time for
each subject with the first measurement as the intercept (within subjects).
Subsequently, the individual slope was regressed on other factors, thus
answering the question of what factors (X) influence the rate of
change/direction across time in Y.



9.
-----------------------------------------------------------------------

From: [log in to unmask] (Simon, Steve, PhD)

Even though the researchers collected data on 120 consecutive days, I doubt
that they are particularly interested in any one day in isolation. Look at
some composite measures, such as the slope of the trend line, or the change
score at the end of each month. Or perhaps an average for each month, or the
standard deviation for each month.  Your researchers should be able to 
elaborate on why they collected the data, and that elaboration should help 
you decide which composite measure you should use.

Once you reduce it to a small number of composite measures, then you can
apply the ANOVA types of procedures.

An alternative that might be worth exploring is fitting a spline model to
each subject's data and then pooling the splines across groups. This is
messy and complex, but fun.

I hope this helps. Good luck!



10.
-----------------------------------------------------------------------

From: Rich Ulrich <[log in to unmask]>


Steve tells how to make the best of the data, making the likely
assumptions about the 120 days -

You don't say where the 120 days fall, so it might be that there are
paycheck cycles of 7, 14, 28 days, or a month; or menstrual cycles, or
some other.   If the subjects have some overlapping '120 days' on the
calendar, it might be reasonable to look at calendar-date for cycles,
or for extreme events.  That's assuming there is a bit of day-to-day
lability that might cover up some information.

But if you aren't looking at (say) muggings on the day after Social
Security checks appear, then I doubt that cycles are likely.  Still,
the detail does allow you to examine onset variations, or offset --
that is, there might be a definite curve over the first week or so
that does not exist later, if the ratings are something that entails
learning or adaptation.  This could be something interesting if it
varies among the three groups, or it could be something to be
eradicated because it is an artifact.

On the other hand, if the 120-days was known to be a limit, there
might be some  'anticipation of the end' -- for instance, patients in
hospitals may show remarkable recovery  during the last week of the
insurance coverage.  

So, you can probably lump data by weeks or months, but don't forget to
take a look at start- and end-effects, if the measures have those
hazards.

-- 
Rich Ulrich, [log in to unmask]
http://www.pitt.edu/~wpilib/index.html



11.
-----------------------------------------------------------------------

From: MJR  	http://stats.mth.uea.ac.uk/

Seriously, you need to be looking at multivariate or vector time
series methods in this case.  Unfortunately, without adding more
assumptions (hopefully reasonable ones created from expert opinion) to
the model, you are looking at a fairly large computation problem (many
cross-correlations) at least, I think.  I shan't say more for fear of
making a fool of myself at this time of the morning, but suggest the
weighty "Time Series Analysis" by Hamilton as a possible lead.


12.
-----------------------------------------------------------------------

From: David Carr


You are right that pooling the 120 people into one time series is 
not the correct solution (average behaviour in such a context is 
close to meaningless; a former colleague of mine, Dr. Wolfgang 
Keeser, showed this several years ago with time series of 
cigarettes smoked by people being treated to give up smoking).

I think one of the better analysis strategies would be to fit, for 
each person, an intervention (interrupted) time series model of the 
kind made popular by Box and Tiao. With the resulting parameters one 
could classify patients into those with no change, those with only a 
transient change, and those with a permanent change (either positive 
or negative). With this classification one could then look into the 
possible influence of other covariates.
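
The classification idea can be illustrated with a much cruder stand-in
than a Box and Tiao intervention model: compare the mean level shortly
after the intervention and at the end of the series with the
pre-intervention mean. Everything here is an assumption made for the
sketch (the function name classify_change, the known intervention time
t0, the 14-day windows, the threshold of 2), and serial correlation is
ignored, which a proper ARIMA-based intervention analysis would not do.

```python
from statistics import mean, stdev

def classify_change(y, t0, window=14, threshold=2.0):
    """Label a series as 'none', 'transient' or 'permanent' change after time t0.

    Mean levels just after t0 and at the end of the series are compared with
    the pre-intervention mean, in units of the standard error of a window mean
    (computed from the pre-intervention standard deviation).
    """
    pre = y[:t0]
    se = stdev(pre) / window ** 0.5
    early = (mean(y[t0:t0 + window]) - mean(pre)) / se
    late = (mean(y[-window:]) - mean(pre)) / se
    if abs(late) >= threshold:
        return "permanent"
    if abs(early) >= threshold:
        return "transient"
    return "none"

# Hypothetical diary series of length 120 with the intervention at day 40.
pre = [9.0, 11.0] * 20
label_permanent = classify_change(pre + [5.0, 7.0] * 40, t0=40)
label_transient = classify_change(pre + [5.0, 7.0] * 7 + [9.0, 11.0] * 33, t0=40)
label_none = classify_change([9.0, 11.0] * 60, t0=40)
```

The resulting label per patient is the kind of derived variable that
could then be related to the other covariates.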

As far as the time series modelling goes, I would be glad to help 
you out, if you happen to have any funding.

If on the other hand, you need to do it yourself, I could send you a 
bibliography (or part thereof) of mine on time series topics (some 
1500 references in all).

Look forward to your reply.



12.
-----------------------------------------------------------------------

From: [log in to unmask]


Dear Hans (I think it is appropriate to reply in German),
a few months ago I dealt with a similar problem when we tried to quantify
the effects of various stimuli on a physiological response variable (RR
intervals, pulse amplitudes, etc.).
We knew next to nothing a priori, neither how strong the effects of the
individual stimuli would be nor how long after the stimulus the effect
would become measurable.
Measurements were therefore taken at very short intervals for each patient,
so we were dealing with a set of fairly long time series.
After a lengthy literature search I came across an interesting paper in
Biometrics by a Hungarian colleague, who has developed a nonparametric
method for analysing RM problems:
J. Reiczigel: Analysis of Experimental Data with Repeated Measurements.
Biometrics 1999; Vol. 55, No. 4: pp. 1059-1063.

In short: the method searches the individual profiles for segments of k
consecutive points that differ clearly from the rest (so-called top
periods). A permutation test is then used to check whether the occurrence
of these periods is systematic (an effect) or random.

I contacted the author, who kindly made his self-written S-PLUS programs
available to me. We also analysed the data set together (!). It is worth
getting in touch with the author, as he is very interested in RM problems,
especially in how his method behaves on real data.

His email address is: [log in to unmask]

If you have questions, please mail back.

I hope this helps you

Kind regards from Graz
Bernd

  


