There's no clear answer to this, because it very much depends on what
metadata problems you're trying to fix. On one occasion when we had to
go back and change all of the records because of DC standards issues,
we were quite lucky that nearly everything could be done as a batch
job, because the problems were more or less uniform and comparatively
trivial. If we'd had to do it manually it would have taken several
staff days before we could be sure even a simple problem was fixed -
and this was a very simple problem indeed.

For example, one headache that Aberystwyth may face in future is if
they decide, as we discussed when I was there, that the block citations
need to be broken up into separate fields for journal title, page
numbers and so on, perhaps for compatibility with a purchased CRIS
solution. With maybe 550 records at last count, and possibly thousands
more to be uploaded from the REF and former IGER (a merged institution)
databases, that could involve vast amounts of manual work, for which
the staffing resources probably don't exist anywhere.

As a general rule, I think it's fair to say that one needs to avoid any
metadata problems that can't be fixed later by re-mapping fields in an
automated batch job. I'm sure other people will have examples of the
potential headaches that could emerge.

With respect to SWAP, if we are going to adopt temporary time-saving
solutions in order to capture the bulk of the content, much of it
retrospective, then I'd have thought we need a good idea of how we
would later generate all those extra complex relationships from the
simpler metadata. For the present I'm not sure exactly how we might
accomplish that enrichment of metadata after the event: the best moment
to get it right is definitely the point of ingest.
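To give a flavour of the kind of batch job I mean, here is a minimal
sketch of splitting a block citation into separate fields. The citation
format and field names are purely hypothetical - any real set of records
would need far more robust parsing, and the failures are where the staff
time goes:

```python
import re

# Hypothetical pattern for a block citation of the kind discussed:
# "Journal Title, volume(issue), pp. start-end". Real records would
# vary far more than this.
CITATION_RE = re.compile(
    r"^(?P<journal>.+?),\s*"
    r"(?P<volume>\d+)"
    r"(?:\((?P<issue>\d+)\))?,\s*"
    r"(?:pp\.\s*)?(?P<pages>\d+(?:-\d+)?)\s*$"
)

def split_citation(block):
    """Return separate fields for a block citation, or None on failure.

    Records that fail the pattern would be queued for manual checking,
    which is exactly the staff time a batch job is meant to save.
    """
    match = CITATION_RE.match(block.strip())
    return match.groupdict() if match else None

print(split_citation("Journal of Agricultural Science, 12(3), pp. 45-67"))
# → {'journal': 'Journal of Agricultural Science', 'volume': '12',
#    'issue': '3', 'pages': '45-67'}
```

Even on this toy version, the point stands: the uniform cases go through
automatically and only the residue needs a cataloguer.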
That covers both getting it "wrong" (with the benefit of hindsight,
though perhaps not seen as wrong at the time of creation) and having it
incomplete. Adopting a new standard like SWAP might be a reason for
metadata being incomplete just as much as deliberately ingesting simple
metadata as a holding operation. Whatever the reason it's incomplete,
you have to repeat all of the time-consuming checking that you did in
the first place, since there's no way you'll remember what does and
doesn't need adding to each item, even within one simple type of
resource like papers. Every paper, for instance, is subtly different.
Pete, your last remark about augmenting metadata after the event is
encouraging, and it would be useful to talk about what's possible to achieve.
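On that augmentation point, the shape of it might be something like the
sketch below: fill in richer fields on a simple deposited record from a
later enrichment source, without overwriting anything a cataloguer has
already supplied, and report what still needs human checking. The field
names are illustrative only, not any real repository schema or the SWORD
interface itself:

```python
def augment(record, enrichment):
    """Merge enrichment values into a record where the record lacks them.

    Returns the merged record, the fields that were filled in, and the
    fields that are still empty and so still need a cataloguer.
    """
    merged = dict(record)
    filled = []
    for field, value in enrichment.items():
        if not merged.get(field):  # only fill blanks; never overwrite
            merged[field] = value
            filled.append(field)
    still_missing = [f for f, v in merged.items() if not v]
    return merged, filled, still_missing

# A simple record deposited as a holding operation (hypothetical fields).
simple = {"title": "Some paper", "creator": "A. Author", "subject": ""}
richer = {"subject": "Agriculture",
          "bibliographicCitation": "J. Ag. Sci., 12(3), pp. 45-67"}
merged, filled, missing = augment(simple, richer)
```

The real difficulty, of course, is where the enrichment data comes from
in the first place, which is why the point of ingest remains the best
moment to get it right.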
Talat
Peter Cliff wrote:
> Hey Mahendra,
>
> Mahendra Mahey wrote:
>> I am not sure this is about 'dumbing down' pre-existing beautifully
>> crafted metadata. I think (correct me if I am wrong, Pete, Phil) it
>> is about:
>>
>> * having a strategy to cope with a large amount of content to
>> deposit into a repository, with limited resources and pressure to
>> show a repository brimming with 'stuff'
>> * making content available quickly - exposing it to the web so that
>> it can be discovered quickly (hopefully?)
>> * increasing the amount of content in the repository quickly
>> * making a judgement about using a quick-fix strategy where there
>> simply isn't the time to catalogue the content to the high
>> standards you originally started out with (I am sure Jenny has done
>> the maths in terms of how long it would take to catalogue the content)
>>
>> Is that right?
>
> Yep. On all the points. Talat's experience as repository manager
> suggests that adding metadata after the deposit takes a long time -
> Talat, is it longer than it'd be on creation?
>
> I'm not talking about getting the metadata wrong (which I think would
> be a hassle to fix - imagine suddenly realising you had to change your
> subject classification scheme) but getting the metadata incomplete -
> so you have the same problem as creating metadata on submission, but
> delayed so that you can prioritise deposit. (Why do today...? ;-))
>
> As for automated augmentation of metadata - well, that would be doable
> and perhaps should be part of the tool - and from what I know of
> SWORD, it'll allow for metadata updates.
>
> Pete Cliff
> RSP/UKOLN
--
Dr Talat Chaudhri
------------------------------------------------------------
Research Officer
UKOLN, University of Bath, Bath BA2 7AY, Great Britain
Telephone: +44 (0)1225 385105 Fax: +44 (0)1225 385105
E-mail: [log in to unmask] Web: http://www.ukoln.ac.uk/
------------------------------------------------------------