I've also been thinking about how to move upstream in the workflow of
campus data creators and publishers. That is what the University of
Rochester's paper, "Understanding Faculty to Improve Content Recruitment
for Institutional Repositories," suggests. Enabling collaborative work
among researchers from disparate units or universities seems an obvious
niche.
One obstacle for us in offering versioned scratch space is the storage
probably needed to keep multiple versions of objects. (Isn't that the
essence of a repository: space for versioning objects before they are
declared final?) Versioned scratch space is, as I understand it, what
University of Hull and University of Prince Edward Island are offering
with their Virtual Research Environments: uncurated, versioned,
collaboration-enabled space, under the assumption that moving finalized
objects into the repository will be extremely simple.
Applying delta technology to save only the changes might mitigate the
space problem. I would worry, though, that recreating a specific
intermediate state of an object, should the system fail, would be that
much more complicated.
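To make the trade-off concrete, here is a minimal sketch of delta storage, assuming objects can be treated as lists of text lines. The names (make_delta, apply_delta, VersionedObject) are my own illustrations, not taken from any repository platform; the only library used is Python's standard difflib.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Record only what is needed to turn `old` into `new` (lists of lines)."""
    delta = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged lines are referenced by position, not stored again.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only new or changed lines are stored in full.
            delta.append(("add", new[j1:j2]))
        # "delete" needs no entry: those lines are simply never copied.
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class VersionedObject:
    """Hypothetical store: one full base copy plus a chain of deltas."""

    def __init__(self, lines):
        self.base = list(lines)
        self.deltas = []
        self._latest = list(lines)

    def commit(self, lines):
        """Store a new version as a delta and return its version number."""
        lines = list(lines)
        self.deltas.append(make_delta(self._latest, lines))
        self._latest = lines
        return len(self.deltas)

    def checkout(self, version):
        """Recreate a version by replaying deltas 1..version onto the base."""
        state = self.base
        for delta in self.deltas[:version]:
            state = apply_delta(state, delta)
        return list(state)
```

Note that checkout has to replay every delta up to the requested version, so losing or corrupting one delta makes all later versions unrecoverable. That is the complication I am worried about; storing a periodic full copy would shorten the chain at the cost of some space.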
I think you raise a separate question about derivative data and managing
relationships. In the context of archiving geospatial data, we've been
thinking about applying a FRBR model to collections and constituent
objects. Ideally, we'd like to embed a persistent identifier in the
metadata of each object that resolves to a relationship record. It
seems especially useful in light of the temporal aspect of our content.
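As a rough sketch of what such a relationship record might look like, assuming a simple in-memory lookup table stands in for the resolver: the field names, the "example:" identifiers and the ancestors helper are all hypothetical, and only the four FRBR levels (work, expression, manifestation, item) come from the model itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RelationshipRecord:
    identifier: str                         # persistent identifier embedded in the object's metadata
    frbr_level: str                         # "work", "expression", "manifestation" or "item"
    temporal_coverage: Optional[str] = None # period the data describes, if any
    parent: Optional[str] = None            # identifier of the entity one FRBR level up
    derived_from: List[str] = field(default_factory=list)  # identifiers of source objects


def resolve(registry: Dict[str, RelationshipRecord], identifier: str) -> RelationshipRecord:
    """Look up the relationship record that an identifier resolves to."""
    return registry[identifier]


def ancestors(registry: Dict[str, RelationshipRecord], identifier: str) -> List[str]:
    """Walk parent links upward, from the object toward its work."""
    chain = []
    current = resolve(registry, identifier).parent
    while current is not None:
        chain.append(current)
        current = resolve(registry, current).parent
    return chain


# Hypothetical example: one geospatial layer captured at two points in time.
registry = {
    r.identifier: r
    for r in [
        RelationshipRecord("example:parcels", "work"),
        RelationshipRecord("example:parcels-2008", "expression", "2008", "example:parcels"),
        RelationshipRecord("example:parcels-2009", "expression", "2009", "example:parcels",
                           derived_from=["example:parcels-2008"]),
        RelationshipRecord("example:parcels-2008-shp", "manifestation", "2008",
                           "example:parcels-2008"),
    ]
}
```

Calling ancestors(registry, "example:parcels-2008-shp") walks from the file up through the 2008 snapshot to the layer as a whole, while the temporal_coverage and derived_from fields carry the time-series and derivative relationships.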
Thanks for raising the topic,
Chris Rusbridge wrote:
> My earlier note about how the R word was mostly being used for something
> else, and in particular for source code version control repositories,
> has been swirling around in the back of my brain for a few days, bumping
> into other stuff. In particular, I began to wonder whether there are
> elements of the typical source code repository that we could usefully
> use for our repositories. Now this thought is neither new nor original;
> I remember commenting in a blog post in August last year on Peter
> Murray-Rust's epiphany from April 2007
> (http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=259) that SourceForge was
> a repository. But I don't think I've seen the ideas contrasted yet, say
> in the context of what IRs could usefully take from source code
> A lot of what goes into source code repositories is about managing
> change: keeping track of versions, and ensuring that separate people are
> not changing the same element at the same time. There are also
> presumably sophisticated facilities for constructing what one might call
> derivative products (compiled versions, libraries, etc; it's a long time
> since I used one of these things in production!).
> IRs and related repositories have traditionally not been about change;
> they tend to be about maintaining a static version (I won't say
> "preserving", as it appears some object to that idea). However, the idea
> of moving the repository upstream into the researcher's workflow, as in
> the idea of a Research Repository System (eg
> This does imply managing change much more. Besides, we're beginning to
> be troubled by multiple version problems, and we certainly have
> derivative products (from simple Word -> PDF transformations, to more
> unclear pre-print -> post-print relationships).
> So my question is: has this comparison of IR platforms to source code
> repository systems been done, or is anyone doing it?
> Chris Rusbridge
> Director, Digital Curation Centre
> Phone 0131 6513823
> University of Edinburgh
> Appleton Tower, Crichton St, Edinburgh EH8 9LE
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> On 10 Mar 2009, at 17:44, Chris Rusbridge wrote:
>> I have [a twitter search] for "Repository OR Repositories". I just did
>> a quick count; with around 160 tweets found in the past 2 days
>> containing one of those words, only 14 had anything to do with the
>> sort of repositories this list is interested in!
>> Most of the rest appear to be to do with SVN and git etc version
>> control repositories. Quite a lot appear to be the simple dictionary
>> meaning of places to store something.
>> I hadn't quite realised how much we are overloading someone else's
>> vocabulary with the R-word!
Digital Repository Librarian
Digital Library Initiatives