JISCMail - JISC-REPOSITORIES Archives

Don't bother with any subject classification whatsoever at the individual
IR level. Leave that to the harvesters and automatic extraction from
the full texts using AI. It is a waste of time to classify at the IR
level; next to no one will use it; it deters authors from self-archiving
if they need to do classifying too; it wastes librarians' valuable time if 
they need to do the classifying; and, in the harvested, OAI-compliant,
full-text boolean-searchable OA era, it is simply obsolete (for journal
articles). Let's just focus on reaching 100% OA and the rest will take
care of itself...

Stevan Harnad

On Mon, 13 Mar 2006, Sally Rumsey wrote:

> Roddy and all,
> 
> I would like to use some sort of subject classification for our IR
> (http://eprints.lse.ac.uk/). The question is what. In our Library
> catalogue we use Library of Congress subject headings. There is no way
> someone self-depositing could tackle that without specialist training as
> it's too big. We have the default LC subject classifications in
> ePrints.org at the moment - but they're not detailed enough - we'd need
> to edit them quite drastically as we only have to consider social
> sciences; engineering etc. is out of scope. We haven't done anything
> with this list yet - too many other things to tackle.
> 
> Another option would be for us to use IBSS subject headings. Those too
> are very extensive and are unlikely to be used by other IRs. We'd like
> something that is popular with others.
> 
> One further option which may fit the bill is the use of HILCC headings
> as developed at Columbia Uni. See
> http://www.columbia.edu/cu/libraries/inside/projects/metadata/classify/
> 
> We haven't investigated properly yet, and I don't know if we'd need
> formal permission to use them. This would fit with our use of LCSH,
> would not be too onerous for self-depositors and is also used by Serials
> Solutions which we use.
> 
> Author-assigned keywords and full text searching are ok to a point, but
> good subject headings will improve browsing for users. I'd be interested
> if anyone has any further thoughts on this. If some consensus between
> repositories were to be agreed then federated searching will be ok. We
> haven't got enough content in our IR for it to be a problem - yet.
> 
> Sally
> 
> Sally Rumsey
> eServices Librarian
> Library
> London School of Economics & Political Science
> 10 Portugal Street
> London
> WC2A 2HD
> 
> 020 7955 7943
> [log in to unmask]
> 
> -----Original Message-----
> From: Repositories discussion list
> [mailto:[log in to unmask]] On Behalf Of MacLeod, Roderick A
> Sent: 13 March 2006 15:38
> To: [log in to unmask]
> Subject: Re: Use of Navigational Tools in a Repository
> 
> In a relatively small database like http://eprints.soton.ac.uk/ 
> the likelihood of a user finding relevant resources using the Browse by
> Subject http://eprints.soton.ac.uk/view/subjects/ or Browse by School or
> Research Group http://eprints.soton.ac.uk/view/structure/ is fairly low
> unless they have prior knowledge of the existence of something of
> relevance.
> 
> The usefulness of subject classification in repositories, from the
> information retrieval perspective, grows once numerous repositories are
> harvested together and access is facilitated via an aggregated subject
> interface.  This greatly increases the likelihood that a potential user
> will find material on any particular subject.
> 
> This has been recognised elsewhere: "Ultimately, most seekers and users
> of scholarly information are pursuing a topic or train of thought.
> Although the publisher, author, and the institution with which the
> author was associated may be of some interest to seekers and users of
> scholarly information, usually those interests pale in comparison to the
> topic (and scholarly task) at hand.  Ultimately, a good, user-centric
> scholarly information system must meet the needs of students and
> scholars. These end-users need a system that enables broadcast searching
> across a wide variety of e-print servers, digital libraries, and
> institutional digital repositories to identify and retrieve potentially
> pertinent scholarly content". Peters, T.A. (2002). Digital repositories:
> Individual, discipline-based, institutional, consortial, or national?
> The Journal of Academic Librarianship, 28(6), pp. 414-417. 
> 
> And 
> 
> "We feel more strongly than ever that there are significant advantages
> to a disciplinary approach to electronic services supporting advanced
> scholarship and higher education".  They continue "Unfortunately, we
> have seen little of the structure of the disciplinary community in
> electronic services." Stephen, T. and Harrison, T. (2002). Building
> Systems Responsive to Intellectual Tradition and Scholarly Culture. The
> Journal of Electronic Publishing, 8(1).  
> 
> Both reported in http://www.icbl.hw.ac.uk/perx/analysis.htm 
> 
> Roddy MacLeod
> 
> > -----Original Message-----
> > From: Repositories discussion list 
> > [mailto:[log in to unmask]] On Behalf Of Leslie Carr
> > Sent: 9 March 2006 00:38
> > To: [log in to unmask]
> > Subject: Use of Navigational Tools in a Repository
> > 
> > A recent discussion between some colleagues on the utility (or
> > otherwise) of subject classification in repositories prompted 
> > me to undertake a brief investigation whose results I present 
> > here. (I'll also send this to AMSCI, so apologies for any 
> > duplicate copies that you see.) The discussion has broadly 
> > been between computer scientists and librarians over whether 
> > subject classification schemes offer advantages over 
> > Google-style text retrieval; the study below looks at the 
> > evidence as demonstrated in the usage of one particular 
> > repository. As such it doesn't address the intrinsic value of 
> > classification, but it does offer some insight into the 
> > effectiveness of navigational tools (including subject 
> > classification) in the context of a repository.
> > 
> > ----------------
> > The University of Southampton Institutional Repository has 
> > been in operation for a number of years and an official 
> > (rather than experimental or pilot) part of its 
> > infrastructure for just over a year. As part of its 
> > capabilities, it includes lists of most recently deposited 
> > material, various kinds of searches, a subject tree based on 
> > the upper levels of the Library of Congress Classification 
> > scheme and an organisational tree listing the various 
> > Faculties, Schools and Research Groups in the University and 
> > a list of articles broken down by year of publication. These 
> > all provide what we hope are useful facilities for helping 
> > researchers find papers (ie by time, subject, affiliation or content).
> > 
> > Over a period of some 29.5 hours from 0400 GMT on March 7th 2006,
> > 1978 "abstract" pages (ie eprints records) were downloaded 
> > from the repository (ignoring all crawlers, bots and spiders).
> > 
> > Of the 1978 downloaded pages, the following URL sources 
> > (referrers, in web log speak) were responsible:
> >    439  - (direct URL, perhaps cut and paste into a browser
> >           or clicked on from an email client)
> >    225  EPRINTS SOTON pages
> >     25  OTHER SOTON WEB pages
> >   1264  EXTERNAL SEARCH ENGINES
> >     21  EXTERNAL WEB PAGES
> > 
> > ie the local repository facilities, including subject views 
> > and searches, led to only 225/1978 = 11% of all downloads.
> > 
>
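
The referrer breakdown quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the counts and the 1978 total are copied from the quoted message, while the dictionary keys, the `share` helper and the `unattributed` name are illustrative choices, not anything from the original study. Note that the five listed counts sum to 1974, four short of the stated total; the message does not explain the difference, so the sketch simply reports it.

```python
# Referrer counts as reported in the quoted message.
referrers = {
    "direct URL": 439,
    "EPrints Soton pages": 225,
    "other Soton web pages": 25,
    "external search engines": 1264,
    "external web pages": 21,
}
total_downloads = 1978  # total stated in the quoted message


def share(count, total=total_downloads):
    """Return count as a percentage of total, rounded to a whole percent."""
    return round(100 * count / total)


# Percentage of all abstract-page downloads per referrer category.
shares = {source: share(n) for source, n in referrers.items()}

# Downloads not covered by any listed category (hypothetical name).
unattributed = total_downloads - sum(referrers.values())

for source, n in referrers.items():
    print(f"{source:25s} {n:5d}  {shares[source]:3d}%")
print(f"not attributed to a listed category: {unattributed}")
```

On these figures the local repository pages account for about 11% of downloads, as stated in the message, and external search engines for about 64%.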