Ian,
Yes, the particular application I'm working on is about Published works. I realise that there could be many other possible applications for harvested data that would be interested in other works. But I was just talking about my particular use.
I use Dewey because that is what the application provides - note that I am enhancing an existing application, not starting from scratch. LCC (Library of Congress Classification) would probably be a viable alternative, but, as with Dewey, I'd expect the classification scheme to be indicated in the harvested metadata. Using other classification schemes would require look-up tables (or maybe HILT).
In fact the repository I've been harvesting so far doesn't provide any formal subject classification, so I've been adding Dewey terms using a lookup table based on the journal (using ISSN as the identifier). I should have said before - the Dewey terms apply at the journal level, not to individual articles, although article-level terms would be possible if available.
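A minimal sketch of this kind of ISSN-keyed lookup, for illustration only. The table contents, the record layout and the names DEWEY_BY_ISSN, normalise_issn and add_dewey are all assumptions, not the actual application's code, and the ISSNs and Dewey numbers are made-up placeholders.

```python
from typing import Dict, List

# Hypothetical lookup table: journal ISSN -> Dewey class numbers.
# These ISSNs and Dewey numbers are placeholders, not real assignments.
DEWEY_BY_ISSN: Dict[str, List[str]] = {
    "1234-5679": ["004"],
    "2345-6789": ["020", "025.04"],
}


def normalise_issn(issn: str) -> str:
    """Return the ISSN as NNNN-NNNC, ignoring spaces, hyphens and case."""
    compact = issn.replace("-", "").replace(" ", "").upper()
    if len(compact) != 8:
        raise ValueError(f"not an 8-character ISSN: {issn!r}")
    return compact[:4] + "-" + compact[4:]


def add_dewey(record: dict) -> dict:
    """Return a copy of a harvested record with journal-level Dewey terms.

    Records with no ISSN, or an ISSN missing from the table, get an
    empty list so that later steps can tell "unclassified" apart.
    """
    enriched = dict(record)
    issn = record.get("issn")
    terms = DEWEY_BY_ISSN.get(normalise_issn(issn), []) if issn else []
    enriched["dewey"] = list(terms)
    return enriched
```

Because the table is keyed on the journal, every article from the same journal receives the same Dewey terms, which matches the journal-level classification described above.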
Actually a decision was made quite some time ago that the backbone subject classification scheme for the JISC Information Environment is Dewey. (Don't shoot me down, I'm only reporting this!) Because of that, another application I work on (different from and unrelated to the one harvesting from repositories) uses Dewey as its backbone subject classification scheme.
I haven't looked at METS - I thought it was a packaging format. Or did you mean MODS? - I think that would be a possible format. However I'd assumed you would be using SWAP - I thought uses like mine are what it is for.
But in fact as a consumer of harvested metadata I'm not really bothered what XML schema it uses provided I can pick out from it the bits of information I need.
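One rough way to picture "picking out the bits" regardless of schema is to match elements by local name and ignore the namespace. The sketch below assumes Python's standard xml.etree.ElementTree; the function names and the sample record are hypothetical, and matching on local name alone is a simplification, since richer schemas reuse the same element name in different contexts.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def pick_fields(xml_text: str, wanted: Iterable[str]) -> Dict[str, List[str]]:
    """Collect the text of every element whose local name is in `wanted`."""
    found: Dict[str, List[str]] = {name: [] for name in wanted}
    for element in ET.fromstring(xml_text).iter():
        name = local_name(element.tag)
        text = (element.text or "").strip()
        if name in found and text:
            found[name].append(text)
    return found


# Made-up sample record using simple Dublin Core element names.
SAMPLE = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>An example article</dc:title>"
    "<dc:creator>Surname, A.</dc:creator>"
    "<dc:creator>Other, B.</dc:creator>"
    "</record>"
)

fields = pick_fields(SAMPLE, ["title", "creator", "identifier"])
```

Fields that the record does not supply come back as empty lists, so a consumer can see which of its required fields are missing.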
Best wishes,
Ann
-------------------------------------------------
Ann Apps MBCS CITP. Research & Development, Mimas,
The University of Manchester, Oxford Road, Manchester, M13 9PL, UK
Tel: +44 (0) 161 275 6039 Fax: +44 (0) 161 275 6040
Email: [log in to unmask] WWW: http://epub.mimas.ac.uk/ann.html
--------------------------------------------------
> -----Original Message-----
> From: Repositories discussion list [mailto:JISC-
> [log in to unmask]] On Behalf Of Ian Stuart
> Sent: Tuesday, March 11, 2008 11:25 AM
> To: [log in to unmask]
> Subject: Re: [JISC-REPOSITORIES] Central versus institutional self-archiving
>
> Ann,
>
> thanks for the reply, very useful!
>
> Ann Apps wrote:
> > Ian,
> >
> > I've been doing some work recently harvesting journal article data
> > from a particular repository (not an institutional one). If I were to
> > widen the scope of this work and harvest one or more institutional
> > repositories these are the fields I'd need:
> >
> > Article title
> > Author names (separate surname and initials at a minimum, or easy parsing)
> > Journal title
> > ISSN
> > Publication year
> > Volume and Issue (as per the journal)
> > Pagination (first and last pages)
> > Publisher
> > Dewey classification - may be multiple (and a means to differentiate
> > this in the metadata from other subject terms)
> > Persistent URI
> Does this mean you would only be interested in Published works (aka
> "post-prints")?
> How essential is Dewey? EPrints.org software ships with LCC, DSpace
> ships with SRSC (Swedish Research Subject Categories) and NSI (Norwegian
> Science Index). We have used JACS in the Depot, and I know that a number
> of IRs don't use anything (or maybe "Department").
>
> >
> > And optionally good to have:
> > eISSN
> > Date of publication
> > Country of publication
> > Abstract
> > Copyright statement (though I may construct this by some agreement
> > with the repository)
> > Keywords - may be multiple (eg author keywords)
> > Global identifiers, eg DOI, PubMed ID
> >
> > I may also be interested in conference papers, but haven't looked into
> > that in depth. But the requirements would be similar, with Proceedings
> > Title and ISBN(s).
> >
> > Realistically Dewey may be a bit optimistic...but it is needed for my
> > application. I'd probably need to devise ways to augment data with this
> > after harvest.
> >
> > I think that simple Dublin Core would not be adequate to provide all
> > the above, unless there were some conventions about what goes into which
> > fields (it's the journal details that are problematic) - but that's not
> > very interoperable.
> Have you looked at METS rather than DC?
> (I have, and it may have an advantage.... but I'm still working on it)
>
> --
>
> Ian Stuart.
> Bibliographics and Multimedia Service Delivery team,
> EDINA,
> The University of Edinburgh.
>
> http://edina.ac.uk/