My experience suggests that there is real danger in confusing
terminology, since the UK and US seem to use the same words for
different things.
I would agree that the correlation of question scores with test
scores is a good measure of a question's usefulness, and the
significance of the correlation is also very helpful in
interpreting it.
But it's misleading to suggest that this is better than
discrimination, since in the UK discrimination is sometimes used
as a synonym for correlation. For example, in Question Mark
Designer for Windows we calculate the Pearson correlation and
present it as the discrimination, and this is commonly done by
others as well.
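To illustrate the calculation in question, here is a minimal Python sketch (not Question Mark's actual code; the data is invented) of the Pearson correlation between one question's scores and each candidate's total test score:

```python
import math

def pearson(xs, ys):
    # Pearson product-moment correlation of two equal-length score lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented example: item scores (1 = correct) against total test scores.
item_scores = [1, 1, 0, 1, 0, 0, 1, 0]
test_totals = [9, 8, 4, 7, 5, 3, 8, 2]
r = pearson(item_scores, test_totals)  # high r: item tracks overall ability
```

A high positive r here means candidates who got the item right also tended to score well overall, which is exactly the property the UK usage of "discrimination" is trying to capture.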
But our US customers complained that they thought of discrimination
as being something else, and preferred us to call it correlation.
We now simply call this correlation in Question Mark Perception.
For a description of what US psychometricians call discrimination, see:
http://www.qmark.com/perception/help/v2manuals/perceptionreadme25server.html#Stats
Another word which seems to have different uses across the Atlantic
is "p value". The p value in the US is similar to our facility/difficulty,
but in general statistics the p value can also mean the significance of a
correlation.
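To make the US sense concrete, the "p value" as facility is just the proportion of candidates answering the item correctly. A minimal sketch with invented data:

```python
def facility(item_scores):
    # Proportion of candidates who answered the item correctly
    # (the US "p value", our facility/difficulty).
    return sum(item_scores) / len(item_scores)

# Invented example: 6 of 8 candidates got this item right.
p_value_us = facility([1, 1, 0, 1, 1, 0, 1, 1])  # 0.75
```

This is entirely unrelated to the statistical significance sense of "p value", which is what makes the shared name so confusing.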
Summary: yes, correlation and significance of correlation are excellent
measures, but it's misleading to say they are better than discrimination,
since for most people in the UK they are the same thing.
I am not a statistician, please correct me if I'm wrong with
any of the above.
John Kleeman MA MBCS C.Eng ([log in to unmask])
Managing Director, Question Mark Computing Ltd
tel +44 (0) 20.7263.7575
direct +44 (0) 20.7561.5303
fax +44 (0) 20.7263.7555
web http://www.questionmark.com
-----Original Message-----
From: This list, hosted by the CASTLE project, is for those interested
in conduct [mailto:[log in to unmask]] On Behalf Of
Jon Maber
Sent: 22 March 2001 17:36
To: [log in to unmask]
Subject: Re: facility and discrimination
As far as I can see, discrimination and facility indices were settled
on as the standard analysis tools at a time when computers were not
commonly used to make the necessary calculations. Although they are
simple to calculate, they give no indication of the significance of
the numbers they produce, and I've known groups of academics argue
about questions with extreme values even though the number of students
who responded was close to the recommended minimum number of responses.
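For readers who haven't met it, here is a minimal sketch of the classical discrimination index (the difference in an item's facility between the top- and bottom-scoring groups; the conventional 27% cut and the data are assumptions for illustration, not from any particular product):

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    # Rank candidates by total score, then compare the item's facility
    # in the top and bottom fractions (27% is the conventional cut).
    ranked = sorted(zip(total_scores, item_scores), key=lambda t: t[0])
    k = max(1, round(len(ranked) * fraction))
    lower = [item for _, item in ranked[:k]]
    upper = [item for _, item in ranked[-k:]]
    return sum(upper) / k - sum(lower) / k

# Invented example: 10 candidates; item score 1 = correct.
q_scores    = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1]
exam_totals = [2, 3, 4, 4, 5, 6, 7, 8, 8, 9]
d = discrimination_index(q_scores, exam_totals)
```

Note that nothing in this calculation tells you how much to trust d when the class is small, which is precisely Jon's point.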
My personal preference is to use ANOVA or correlation (depending on the
scoring scheme for the question) to analyse class performance. A
correlation coefficient has the same usefulness as, for example, a
discrimination index, but it is accompanied by a significance level so
you know when to take it with a pinch of salt.
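One common way to attach a significance level to a correlation coefficient is to convert r to a t statistic with n - 2 degrees of freedom. A rough sketch (this is not necessarily Jon's method; the normal approximation to the p value below is an assumption, reasonable only for larger classes, where small classes would use Student's t tables):

```python
import math

def correlation_t(r, n):
    # t statistic for testing r != 0, with n - 2 degrees of freedom.
    return r * math.sqrt((n - 2) / (1 - r * r))

def approx_two_sided_p(t):
    # Normal approximation to the two-sided p value; rough for small n.
    return math.erfc(abs(t) / math.sqrt(2))

t = correlation_t(0.6, 30)   # e.g. r = 0.6 from a class of 30
p = approx_two_sided_p(t)    # small p: unlikely to be chance alone
```

The same r from a class of 10 gives a much larger p, which is the "pinch of salt" the raw discrimination index cannot supply.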
I can give some details of this if people think it will be of interest
and I would welcome criticism.
Jon Maber
David Davies wrote:
>
> Hi Carole
>
> I'll tell you what we do. For now, forget XML completely. It's not
> relevant to this discussion.
>
> We've got a web-based MCQ system that pulls MCQs out of a database for
> delivery via the web and pokes the students' responses back in. It's no
> different from any other MCQ system that uses server-side processing,
> including all the commercial systems.
>
> Because the results are gathered in a results database, the system
> automatically calculates both the facility and the discrimination. Each
> student receives his/her marks and the question owner(s) receive the
> stats describing the usefulness of their questions. We've spent a bit
> of time developing a tidy web interface to this so that the tutor or
> lead teacher can log in and monitor the performance of their questions
> at their convenience.
>
> Individual questions are coded according to level of difficulty such that
> there are golden questions that define core learning, standard questions
> that have been used before and so we have an idea as to their facility and
> discrimination, and new questions that have been deemed suitable for
> inclusion, but have not been tested 'in the field'.
>
> This utility is for us one of the chief benefits of using computer-based
> assessment.
>
> Cheers,
>
> David