I agree with everything Louise and Libby have already said, and I think
the UKDA costing tool is likely to be useful beyond social science data
(though I have never personally had to apply it to any serious extent).
The time typically needed to prepare for archiving will depend a great
deal on the maturity of domain-specific metadata standards and of
method-specific data-cleaning tools. I can see that what Norman is
saying is relevant in industrialised domains where archiving focuses on
data taken from large-scale instruments and assembled for analysis
through automated workflows, but not wider than that (and I understand
that even in big-science fields archiving is not very mature where
derived data products are concerned).
As we all know, the majority of domains have neither a domain metadata
standard nor a domain repository, so I think the relevant questions for
researchers are probably the obvious ones, like 'how much effort will it
take to make your data understandable to someone else in your field,
beyond what they can understand by reading the paper you write about
it?' and 'what kinds of things would you use to describe your data to
someone else in your field at earlier stages in the research, e.g.
emails, readme files, posters, protocols, notebooks... and can these be
added to the collection you plan to deposit?'
Where there is a domain repository/archive, the pre-archiving (ingest)
effort will depend on how much demand there is from data reusers for it
to support domain metadata standards (e.g. through search tools), and
how far the repository in turn can expect its depositors on the supply
side to do the tagging during or before ingest (as opposed to mining
it). The more comprehensive and structured the domain standard, the
more time this will take, until the standard gets enough community
buy-in to spawn more standardised workflows that may be engineered to
allow metadata to be added incrementally throughout a project (as in
the 'sheer curation' idea).
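(To make the 'sheer curation' idea concrete, here's a rough sketch in
Python of what incremental metadata capture could look like. The
sidecar-file convention and the record_metadata helper are my own
inventions for illustration, not taken from any real standard or tool.)

# Rough sketch of 'sheer curation': metadata accumulates in a sidecar
# file next to the data as the project goes on, so most of the
# description already exists by deposit time. The sidecar naming
# convention and field names here are invented for illustration.

import json
from datetime import datetime, timezone
from pathlib import Path

def record_metadata(data_file: str, **fields) -> None:
    """Append a timestamped metadata entry to <data_file>.meta.json."""
    sidecar = Path(data_file + ".meta.json")
    entries = json.loads(sidecar.read_text()) if sidecar.exists() else []
    entries.append({"recorded": datetime.now(timezone.utc).isoformat(),
                    **fields})
    sidecar.write_text(json.dumps(entries, indent=2))

# Called at natural points in the workflow, e.g. straight after a
# cleaning step, rather than in one big push before deposit:
record_metadata("survey_wave2.csv",
                step="cleaning",
                note="duplicates removed, IDs pseudonymised")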
I expect the minority of domains that have a domain repository is
growing (though I'm not sure whether directories like re3data.org are
growing because they're capturing real growth in numbers or just
capturing more of what's already there). They will probably grow faster
than the availability of domain-specific metadata editors (or
linked-data skills). Across the board, that might mean more depositors
spending more time just filling in forms on repositories. Or maybe it's
more likely that generic cross-domain repository platforms will learn
to deal intelligently with any file identified as a domain metadata
source for the collection being deposited?
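(By way of illustration, the 'intelligence' I have in mind needn't be
very deep; even a crude filename scan like the Python sketch below
would be a start. The patterns and the mapping to standards in it are
guesses of mine, not anything a real platform actually does.)

# Naive sketch: a cross-domain repository scanning a deposit for files
# that look like domain metadata sources, so they can be indexed
# rather than re-keyed into a web form. The filename patterns and the
# mapping to standards are illustrative guesses only.

from pathlib import Path

KNOWN_PATTERNS = {
    "*.ddi.xml": "DDI (social science)",   # guessed convention
    "i_*.txt":   "ISA-Tab (life sciences)",
    "*.cif":     "CIF (crystallography)",
    "README*":   "free-text documentation",
}

def find_metadata_sources(deposit_dir):
    """Return (file, suspected standard) pairs found in a deposit."""
    hits = []
    for pattern, standard in KNOWN_PATTERNS.items():
        for match in Path(deposit_dir).rglob(pattern):
            hits.append((match, standard))
    return hits

for path, standard in find_metadata_sources("deposit_2014_10"):
    print(f"{path}: looks like {standard} - index, don't re-key")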
best wishes,
Angus
On 08/10/2014 16:38, Cole, Gareth wrote:
> Thanks all for your responses to my enquiry and particularly to Libby and Louise for the link to the costing tool on the UKDS and UKDA websites (I'm not sure how I haven't noticed this before!).
>
> Regards
> Gareth
>
> -----Original Message-----
> From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Sebastian Rahtz
> Sent: 08 October 2014 16:07
> To: [log in to unmask]
> Subject: Re: Time needed to prepare files for archiving
>
> the flippant amongst us now have to answer the question, of course, of "so how much will it cost to re-engineer my research workflow to be archive-ready from the start".
> --
> Sebastian Rahtz
> Director (Research) of Academic IT
> University of Oxford IT Services
> 13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>
> I am nothing.
> I shall never be anything.
> I cannot want to be anything.
> Apart from that, I have in me all the dreams of the world.
>
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.