Ann,

thanks for the reply, very useful!

Ann Apps wrote:
> Ian,
> 
> I've been doing some work recently harvesting journal article data
> from a particular repository (not an institutional one). If I were to
> widen the scope of this work and harvest one or more institutional
> repositories these are the fields I'd need:
> 
> Article title
> Author names (separate surname and initials at a minimum, or easy parsing)
> Journal title
> ISSN 
> Publication year
> Volume and Issue (as per the journal)
> Pagination (first and last pages)
> Publisher
> Dewey classification - may be multiple (and a means to differentiate this in the metadata from other subject terms)
> Persistent URI

Does this mean you would only be interested in published works (aka
"post-prints")?

How essential is Dewey? EPrints.org software ships with LCC, and DSpace
ships with SRSC (Swedish Research Subject Categories) and NSI (Norwegian
Science Index). We have used JACS in the Depot, and I know that a number
of IRs don't use anything (or maybe just "Department").

> 
> And optionally good to have:
> eISSN
> Date of publication
> Country of publication
> Abstract
> Copyright statement (though I may construct this by some agreement with the repository)
> Keywords - may be multiple (eg author keywords)
> Global identifiers, eg DOI, PubMed ID
> 
> I may also be interested in conference papers, but haven't looked into
> that in depth. But the requirements would be similar, with Proceedings
> Title and ISBN(s).
> 
> Realistically Dewey may be a bit optimistic...but it is needed for my
> application. I'd probably need to devise ways to augment data with this
> after harvest.
> 
> I think that simple Dublin Core would not be adequate to provide all
> the above, unless there were some conventions about what goes into which
> fields (it's the journal details that are problematic) - but that's not
> very interoperable.
Have you looked at METS rather than DC?
(I have, and it may have an advantage... but I'm still working on it.)
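
To make the field list above concrete, here is a rough sketch of how a harvester might check a record against it. The field names and the missing_required helper are invented for illustration; they are not taken from EPrints, DSpace, Dublin Core, METS, or any other schema, and the sample values are placeholders.

```python
# Hypothetical field names for the "need" and "good to have" lists
# quoted above. None of these names come from a real schema.
REQUIRED = (
    "article_title", "authors", "journal_title", "issn",
    "publication_year", "volume", "issue", "first_page", "last_page",
    "publisher", "dewey", "persistent_uri",
)
OPTIONAL = (
    "eissn", "publication_date", "country", "abstract",
    "copyright", "keywords", "identifiers",
)


def missing_required(record):
    """Return the required field names that are absent or empty."""
    return [name for name in REQUIRED if not record.get(name)]


# Placeholder record: everything filled in except a Dewey class.
sample = {name: "placeholder" for name in REQUIRED}
sample["dewey"] = []  # multi-valued field, empty here
print(missing_required(sample))
```

A record that comes back with only "dewey" missing would be a candidate for augmenting after harvest, as Ann suggests, rather than for rejection.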

-- 

Ian Stuart.
Bibliographics and Multimedia Service Delivery team,
EDINA,
The University of Edinburgh.

http://edina.ac.uk/