As institutional RDM support services become better established, are any patterns emerging in which domains are taking them up?
I've changed the subject line as this has gone off the track of Gareth's question and Tim Banks' point about the mismatch between active data and preservation file formats. The connection to my question is that while we can offer people task models and cost categories, I would guess researchers in some domains are more receptive than others to the idea that they could apply these models and checklists without radical changes to their practice.
If you are involved in running an institutional RDM service, are there any domain differences that stand out in your experience? (I'm thinking in terms of advice being followed up, through to material actually being offered for deposit somewhere.) If so, what do you think are the characteristics of those domains that bring researchers to engage with an institutional service?
I'm prompted to ask by Norman's points below, which I recognise from case studies that I and others did a while back (e.g. for the RIN and DCC SCARP projects). At that time I came across work by the information scientist Jenny Fry. She looked at differences between domains in their "degree of production and use of scholarly networked digital resources" and found that the best explanatory factors for success (or perhaps maturity is a better word) were high 'mutual dependence' between researchers and low 'task uncertainty', with High Energy Physics being one example (https://dspace.lboro.ac.uk/2134/11350).
I think something similar shapes a domain's capability to get its archiving and curation off the ground and more formalised. That was also my experience from the case studies. Being faced with the problem of understanding a colleague from another domain's data and methods, and of finding somewhere to store material, is a catalyst towards taking archiving seriously. And I can see Norman's 'eating your own dogfood' factor at work in how open notebook science has developed in the physical sciences, e.g. Labtrove at Southampton.
If I had to hazard a guess at which research groups would be best to engage proactively, to build take-up of an institutional data repository service, it would be groups that work in medium-sized teams, have high data collection costs, and want to work more with kinds of data or researchers that they are not used to dealing with. I'm guessing that researchers who work mostly individually, or with cheap and readily available source data, are least likely to deposit with the institution.
I appreciate that individual project and personal factors might
be just as important here as any characteristics of the domain.
Either way, I'm interested to know what you think. How diverse or otherwise is take-up of your RDM service?
thanks,
Angus
On 09/10/2014 21:15, Norman Gray wrote:
[log in to unmask]"
type="cite">
I think, however, there are a couple of high-level reasons _why_ this happens in these domains, which may be portable to other domains, with the same effect.
First, because the data is produced, and after it's produced successively refined, by rather complicated processes, and because the people producing the data are often not the same as the people using it, the natural way for that data to be communicated is through an internal repository, rather than passed on from point to point or person to person. That requires an up-front investment of time, and a continuing investment of discipline, but it's a pretty efficient way to share material internally to the project, which obviously provides a very convenient starting point for later archiving.
Second, another way of thinking about that is 'dogfooding', as in 'eating one's own dogfood' (computer scientists seem to talk about this a lot). If a project is intended to provide resources for the wider community -- data, services, catalogues, whatever -- then if the project takes a deliberate decision to do its _own_ work only using the final public interfaces rather than using any project-only routes, then there's a _very_ strong pressure to make those interfaces as usable and as useful as possible. The result will probably turn into a more naturally archivable product.
One point we were making in the document I quoted was that an approach like this means that the 'archiving' costs can be subsumed into an 'infrastructure' budget line. That might make them less prominent and so less 'cuttable'.
I have slight tunnel vision on this, of course, and as Tim Banks explained, sometimes working formats are unavoidably different from archival formats. But if 'archiving' can be reconceived as an adjunct to another process in a project, one way or another -- as opposed to an annoying, expensive, and forgettable external obligation -- then I suspect that will often be both more effective and cheaper.
Best wishes,
Norman