Greetings
I would greatly appreciate any suggestions on this problem:
I have a list of 1400 compositions, each rated twice (RateA, RateB) by two
different raters (RaterA and RaterB) on a 0-15 scale. The raters are drawn
at random from a pool of 120, so each one may serve either as RaterA or as
RaterB (not that it matters, since they do not see each other's marks and
they do not write anything on the papers).
I want to check inter-rater reliability, i.e. whether the raters apply the
rating grid they are given homogeneously.
I was thinking of Pearson's r (which, by the way, yields r = .515, n = 1400,
p < .001 two-tailed).
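In case it helps to see exactly what I computed, here is a minimal sketch of the Pearson calculation on the paired ratings. The numbers below are invented illustrative values, not my actual data (the real file has 1400 rows):

```python
# Minimal sketch: Pearson's r between the two ratings of each composition.
# The rating lists below are made-up examples, not the real data set.
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

rate_a = [12, 6, 4, 11, 2, 11, 5, 15]   # hypothetical RateA column
rate_b = [10, 7, 5, 8, 3, 15, 2, 14]    # hypothetical RateB column
print(pearson_r(rate_a, rate_b))
```

(With the real data this is the r = .515 mentioned above.)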
Still, I have read that Pearson correlation coefficients are calculated for
one pair of judges at a time, whereas I have many different pairs,
regardless of whether individual raters appear more than once as either
RaterA or RaterB.
Does that mean I must adopt a different index (e.g. categorize the scores
and employ Cohen's kappa, or use Cronbach's alpha)?
I do not think partial correlations would do the job either, as I have no
other variables apart from the raters.
For reference, the data are arranged as follows:
Composition  RaterAID  RateA  RaterBID  RateB
          1      1200     12      1215     10
          2      1200     06      1215     07
        ...
        134      1256     04      1233     05
        135      1256     11      1233     08
        ...
        250      1215     02      1288     03
        251      1215     11      1288     15
        ...
        950      1298     05      1200     02
        951      1298     15      1200     14
etc.
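If it clarifies the structure: the wide layout above can be reshaped into long format (one row per composition-rater pair), which is what most reliability routines expect. A small sketch, using only the example rows shown above:

```python
# Sketch: reshape the wide layout (one row per composition, two raters)
# into long format (one row per composition x rater). Rows are the
# example rows from the table above.
wide = [
    # (composition, raterA_id, rateA, raterB_id, rateB)
    (1,   1200, 12, 1215, 10),
    (2,   1200,  6, 1215,  7),
    (134, 1256,  4, 1233,  5),
    (135, 1256, 11, 1233,  8),
]

long_rows = []
for comp, ra_id, ra, rb_id, rb in wide:
    long_rows.append((comp, ra_id, ra))   # row for the first rater
    long_rows.append((comp, rb_id, rb))   # row for the second rater

for row in long_rows:
    print(row)
```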
I have been racking my brain for several days now, but the more I read, the
more confused I get.
If someone can help, he or she will have done me a great favour.
Thank you
Vassilis