Hello everyone,

I would like to ask a quick question about the intraclass correlation (ICC) in a test-retest situation, i.e. when a group of people each provide responses at two different time points. I have found that the literature on this topic is thin on the ground.

Hypothetically, say we have the following data for 8 people taken at 2 time points:

T1 T2

1.00 2.00
2.00 1.00
1.00 3.00
3.00 2.00
2.00 3.00
1.00 2.00
2.00 3.00
3.00 1.00

Am I correct in thinking that, to test the null hypothesis of no difference in means between the two time points, we could perform any of the following and get the same answer?

1) Paired t-test (t=-0.509, p-value=0.626)

2) Repeated measures (in SPSS, if we choose Analyze>General Linear Model>Repeated Measures and input one 'within subjects' factor (time) with 2 levels) giving:

Tests of Within-Subjects Effects
Measure: MEASURE_1
Source                               Type III Sum of Squares   df      Mean Square   F      Sig.
factor1          Sphericity Assumed  .250                      1       .250          .259   .626
                 Greenhouse-Geisser  .250                      1.000   .250          .259   .626
                 Huynh-Feldt         .250                      1.000   .250          .259   .626
                 Lower-bound         .250                      1.000   .250          .259   .626
Error(factor1)   Sphericity Assumed  6.750                     7       .964
                 Greenhouse-Geisser  6.750                     7.000   .964
                 Huynh-Feldt         6.750                     7.000   .964
                 Lower-bound         6.750                     7.000   .964

i.e. F = 0.259 = t^2 = (-0.509)^2, p = 0.626.

3) Intraclass correlation (in SPSS, if we choose Analyze>Scale>Reliability Analysis and in "Statistics" choose F test in the ANOVA table box) giving:

ANOVA
                                  Sum of Squares   df   Mean Square   F      Sig
Between People                    3.000            7    .429
Within People    Between Items    .250             1    .250          .259   .626
                 Residual         6.750            7    .964
                 Total            7.000            8    .875
Total                             10.000           15   .667
Grand Mean = 2.0000

i.e. F = 0.259 = t^2 = (-0.509)^2, p = 0.626.

Thus all of the above results tell me that there is no evidence to suggest that there is a difference (as regards mean value) between the two time points.
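
For anyone who wants to check the equivalence outside SPSS, here is a minimal sketch in plain Python (standard library only) that recomputes the paired t statistic and the two-way sums of squares from the eight pairs above. The variable names are my own, and the p-value is left out because it needs a t or F distribution function that the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data from the post: 8 people, 2 time points
t1 = [1, 2, 1, 3, 2, 1, 2, 3]
t2 = [2, 1, 3, 2, 3, 2, 3, 1]
n, k = len(t1), 2  # n people, k time points

# Paired t statistic: mean difference divided by its standard error
diffs = [a - b for a, b in zip(t1, t2)]
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

# Two-way decomposition (people x time points) of the total sum of squares
grand = mean(t1 + t2)
ss_total = sum((x - grand) ** 2 for x in t1 + t2)
ss_people = k * sum((mean(pair) - grand) ** 2 for pair in zip(t1, t2))
ss_time = n * sum((mean(col) - grand) ** 2 for col in (t1, t2))
ss_resid = ss_total - ss_people - ss_time

# F test for the time effect: MS(time) / MS(residual)
ms_time = ss_time / (k - 1)
ms_resid = ss_resid / ((n - 1) * (k - 1))
f_stat = ms_time / ms_resid

print(round(t_stat, 3), round(f_stat, 3), round(t_stat ** 2, 3))
# prints: -0.509 0.259 0.259
```

The sums of squares come out as 3.000 (between people), 0.250 (between time points) and 6.750 (residual), matching both SPSS tables, and F equals t squared as expected with only two time points.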

For 3) above, is it still valid to quote the value of the ICC? I have only ever seen the ICC quoted when assessing inter-rater reliability (i.e. consistency or absolute agreement between different judges) and never in a test-retest situation where the same person is measured twice.

If we *can* quote the ICC in the test-retest situation, what are the appropriate options as regards model (one-way random, two-way random or two-way mixed), type (consistency or absolute agreement) and measures (single or average)? If we want to determine whether the measures are exactly the same at the two time points, I would choose 'absolute agreement', and if we plan to use the results from a single time point (rather than the average over the two), I would choose 'single measures'. But which model applies when the same subject is measured twice (rather than rated by two different 'judges'): one-way random, two-way random or two-way mixed?
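
To make the options concrete, here is a rough sketch of how the single-measure coefficients differ, using what I understand to be the usual textbook formulas based on the mean squares in the ANOVA table from 3). The helper name `icc_single` and its arguments are my own invention, not anything from SPSS. As far as I know, the two-way random and two-way mixed models give the same point estimates and differ only in interpretation, so the sketch has a single two-way case for each type.

```python
def icc_single(ms_people, ms_time, ms_resid, n, k):
    """Single-measure ICCs from two-way ANOVA mean squares (hypothetical helper).

    n = number of people, k = number of time points.
    """
    # One-way random: time and residual are pooled into one within-people term
    ms_within = (ms_time * (k - 1) + ms_resid * (n - 1) * (k - 1)) / (n * (k - 1))
    one_way = (ms_people - ms_within) / (ms_people + (k - 1) * ms_within)

    # Two-way, consistency: a systematic shift between time points is ignored
    consistency = (ms_people - ms_resid) / (ms_people + (k - 1) * ms_resid)

    # Two-way, absolute agreement: a shift between time points counts as disagreement
    agreement = (ms_people - ms_resid) / (
        ms_people + (k - 1) * ms_resid + (k / n) * (ms_time - ms_resid)
    )
    return {
        "one_way": one_way,
        "consistency": consistency,
        "absolute_agreement": agreement,
    }


# Mean squares taken from the ANOVA table above (SS / df)
icc = icc_single(ms_people=3.0 / 7, ms_time=0.25 / 1, ms_resid=6.75 / 7, n=8, k=2)
for name, value in icc.items():
    print(f"{name}: {value:.3f}")
```

For the made-up data above, all three values come out negative (about -0.342, -0.385 and -0.441), which only reflects how I invented the numbers. The point is that the choice of model and type changes the coefficient even though the F test for time is identical.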

Many thanks in advance on this matter. I appreciate it.

All the very best,

Kim