Dear all,
I'd be interested to know how effective machine subject classification
can be, and how often it makes mistakes. Does anybody
actually operate such a system? Not entirely fair, Pete, to use this
email as an example, since everyone on the list knows the context in
which it was written and what was said in earlier posts - the whole
thread would be a fairer test, like a self-contained paper that mentions
(or should do, anyway) all of its terms of reference.
I have to point out that not all repository managers have the expertise
to add subject classification headings by hand, and that this might
perhaps be true for a large minority or even a majority for all we know.
I certainly can't, because I'm not a trained cataloguing librarian,
although no doubt I could learn to do so. I can't help but follow Stevan
on this one, always provided that automatic tagging in fact works in the
manner he describes. Staff time is clearly at a premium.
Whether the user is interested in subject headings or not, plenty of
useful repository services like harvesters can and do use them. I was
just asked whether we can filter papers in DSpace by subject at present
with our in-house tool for generating bibliographies. If such a filter
went across the boundaries of our present research collections, as can
clearly happen in cross-disciplinary research, subject headings could be
really useful. As a further example, I was told of a case where
mathematicians and physicists had been deliberately put in the "wrong"
discipline for strategic RAE purposes.
Thanks,
Talat
-----
Dr Talat Chaudhri, Ymgynghorydd Cadwrfa / Repository Advisor
Gwasanaethau Gwybodaeth / Information Services
Prifysgol Aberystwyth / Aberystwyth University
CADAIR: http://cadair.aber.ac.uk
Cadwrfa ymchwil ar-lein Prifysgol Aberystwyth
Aberystwyth University's online research repository
Ymholiadau / Enquiries: [log in to unmask]
-----Original Message-----
From: Repositories discussion list
[mailto:[log in to unmask]] On Behalf Of Peter Cliff
Sent: 25 June 2008 13:53
To: [log in to unmask]
Subject: Re: subject classification
Hello!
Stevan Harnad wrote:
> Opposing hand-tagging. No view on usefulness of automated tagging (except
> that I think boolean full-text search is in general far more powerful
> than taxonomy search -- though of course any available taxonomy can
> be covered by the boolean search). -- SH
What if the full-text lacks certain keywords that are relevant for
discovery/searching? I'm not convinced the content of a document is
enough to accurately (and sustainably) describe what it is about. This
email, for instance.
That said, I'm not disagreeing that uptake of repository deposition has
been slow - but I'm not sure we can blame that on just the need for
metadata!
Pete Cliff
Research Officer, UKOLN