Hello everyone,
I would just like to ask a quick question about the intraclass correlation (ICC) in a test-retest situation, i.e. when we have a group of people who each provide
responses at two different time points. I have found the literature on this topic to be thin on the ground.
Hypothetically, say we have the following data for 8 people taken at 2 time points:
T1 T2
1.00 2.00
2.00 1.00
1.00 3.00
3.00 2.00
2.00 3.00
1.00 2.00
2.00 3.00
3.00 1.00
Am I correct in thinking that, to test the null hypothesis that there is no difference in mean between the time points, we could perform any of the
following and get the same answer?
1) Paired t-test (t=-0.509, p-value=0.626)
2) Repeated measures (in SPSS, if we choose Analyze>General Linear Model>Repeated Measures and input one
'within subjects' factor (time) with 2 levels) giving:
Tests of Within-Subjects Effects
Measure: MEASURE_1

Source         |                    | Type III Sum of Squares | df    | Mean Square | F    | Sig.
factor1        | Sphericity Assumed | .250                    | 1     | .250        | .259 | .626
               | Greenhouse-Geisser | .250                    | 1.000 | .250        | .259 | .626
               | Huynh-Feldt        | .250                    | 1.000 | .250        | .259 | .626
               | Lower-bound        | .250                    | 1.000 | .250        | .259 | .626
Error(factor1) | Sphericity Assumed | 6.750                   | 7     | .964        |      |
               | Greenhouse-Geisser | 6.750                   | 7.000 | .964        |      |
               | Huynh-Feldt        | 6.750                   | 7.000 | .964        |      |
               | Lower-bound        | 6.750                   | 7.000 | .964        |      |

i.e. F = 0.259 = t^2 = (-0.509)^2, p = 0.626.
3) Intraclass correlation (in SPSS, if we choose Analyze>Scale>Reliability Analysis and, under "Statistics", choose "F test" in the "ANOVA Table" box) giving:
ANOVA

               |               | Sum of Squares | df | Mean Square | F    | Sig
Between People |               | 3.000          | 7  | .429        |      |
Within People  | Between Items | .250           | 1  | .250        | .259 | .626
               | Residual      | 6.750          | 7  | .964        |      |
               | Total         | 7.000          | 8  | .875        |      |
Total          |               | 10.000         | 15 | .667        |      |

Grand Mean = 2.0000

i.e. F = 0.259 = t^2 = (-0.509)^2, p = 0.626.
Thus all of the above results tell me that there is no evidence of a difference in mean value between the two time points.
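
As a cross-check on the three analyses above, here is a minimal Python sketch (standard library only) that recomputes the paired t statistic and the people-by-time sums of squares from the eight pairs of values. The variable names and the helper t_two_sided_p are my own and not part of any SPSS output; the p-value is obtained by numerically integrating the t density, which is just one convenient way to get it.

```python
import math

# Data from the post: 8 people measured at two time points.
t1 = [1, 2, 1, 3, 2, 1, 2, 3]
t2 = [2, 1, 3, 2, 3, 2, 3, 1]
n, k = len(t1), 2  # n people, k time points

# 1) Paired t-test on the differences T1 - T2.
d = [a - b for a, b in zip(t1, t2)]
d_mean = sum(d) / n
d_var = sum((x - d_mean) ** 2 for x in d) / (n - 1)
t_stat = d_mean / math.sqrt(d_var / n)

# 2) and 3) Two-way (people x time) sums of squares, as in both SPSS tables.
grand = (sum(t1) + sum(t2)) / (n * k)
ss_time = n * sum((sum(col) / n - grand) ** 2 for col in (t1, t2))
ss_people = k * sum(((a + b) / k - grand) ** 2 for a, b in zip(t1, t2))
ss_total = sum((x - grand) ** 2 for x in t1 + t2)
ss_resid = ss_total - ss_people - ss_time

# F for time: mean square for time over residual mean square.
f_stat = (ss_time / (k - 1)) / (ss_resid / ((n - 1) * (k - 1)))


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the t density over [0, |t|].

    Hypothetical helper using Simpson's rule (steps must be even).
    """
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps
    s = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * pdf(i * h)
    return 1 - 2 * s * h / 3


p_value = t_two_sided_p(t_stat, n - 1)
print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}, p = {p_value:.3f}")
```

Run as written, this should print t = -0.509, t^2 = 0.259, F = 0.259, p = 0.626, matching the SPSS figures quoted in 1) to 3).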
For 3) above, is it still valid to quote the value of the ICC? I have only ever seen the ICC quoted when
assessing inter-rater reliability (i.e. consistency or absolute agreement between different judges)
and never in a test-retest situation where the same person is measured twice.
If we *can* quote the ICC in the test-retest situation, what are the appropriate options as regards model (one-way random, two-way random or two-way mixed), type (consistency or absolute agreement) and measures (single or average)? I would say that if we want to determine whether the measures are exactly the same at the two time points then we would choose 'absolute agreement', and if we plan to use the results from a single time point (rather than the average over the two points) then I would choose 'single measures'. But what about the model when we are dealing with the same subject measured twice (rather than two different 'judges'): one-way random, two-way random or two-way mixed?
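
To make the choice concrete, here is a rough Python sketch of what I understand the single-measures options to compute from the mean squares in the ANOVA table. It assumes the usual formulas given by Shrout and Fleiss (1979) and McGraw and Wong (1996); those formulas and all variable names are my assumption, not something shown in the SPSS output above. As far as I know, the two-way random and two-way mixed models share the same single-measures point estimates and differ in interpretation, so only one set of two-way formulas appears.

```python
# Data from the post: 8 people measured at two time points.
t1 = [1, 2, 1, 3, 2, 1, 2, 3]
t2 = [2, 1, 3, 2, 3, 2, 3, 1]
n, k = len(t1), 2  # n people, k time points

grand = (sum(t1) + sum(t2)) / (n * k)
ss_people = k * sum(((a + b) / k - grand) ** 2 for a, b in zip(t1, t2))
ss_time = n * sum((sum(col) / n - grand) ** 2 for col in (t1, t2))
ss_total = sum((x - grand) ** 2 for x in t1 + t2)

# Mean squares, labelled as in the SPSS ANOVA table.
ms_people = ss_people / (n - 1)                        # Between People
ms_time = ss_time / (k - 1)                            # Between Items
ms_resid = (ss_total - ss_people - ss_time) / ((n - 1) * (k - 1))  # Residual
ms_within = (ss_total - ss_people) / (n * (k - 1))     # Within People, Total

# One-way random, single measures (assumed formula).
icc_one_way = (ms_people - ms_within) / (ms_people + (k - 1) * ms_within)

# Two-way, consistency, single measures (assumed formula).
icc_consistency = (ms_people - ms_resid) / (ms_people + (k - 1) * ms_resid)

# Two-way, absolute agreement, single measures (assumed formula).
icc_agreement = (ms_people - ms_resid) / (
    ms_people + (k - 1) * ms_resid + k * (ms_time - ms_resid) / n
)

print(f"one-way: {icc_one_way:.3f}")
print(f"two-way consistency: {icc_consistency:.3f}")
print(f"two-way absolute agreement: {icc_agreement:.3f}")
```

With the hypothetical data above all three estimates come out negative, because the between-people mean square (.429) is smaller than the residual mean square (.964), so these particular numbers are only useful for checking the arithmetic against SPSS.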
Many thanks in advance on this matter. I appreciate it.
All the very best,
Kim