Hi,
I would say greetings from sunny Canberra, but it's pouring with rain
at the moment ;-).
Anyway, to business:
There are two real problems with institutional repositories:
(a) scope & scope creep
(b) non-linear objects
(a) scope and scope creep
IRs are really designed to hold collections of static objects, be they
documents, digitised recordings, satellite imagery or whatever, plus
information relating to their context - metadata.
As soon as you end up with serious amounts of data - say tens of
terabytes - the costs of maintaining this data, in terms of backup
requirements, hardware and software costs, recurrent costs, not to
mention staff time, become a significant part of operating the
repository and need to be funded from somewhere.
To make these costs predictable, one needs to define a scope, saying
what will go in and what will not, as this allows one to make an
estimate, however imperfect, of the quantity of data and its likely
growth and hence its recurrent costs.
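As a rough illustration of that kind of estimate, here is a minimal
sketch in Python. Everything in it is an assumption for the sake of
argument: the function name, the compound annual growth model, and a
flat cost per terabyte per year standing in for backup, hardware,
software and staff costs. Real figures would have to come from your own
budget.

```python
def project_storage_costs(initial_tb, annual_growth_rate, cost_per_tb_year, years):
    """Project holdings and recurrent cost for each year of operation.

    Hypothetical model: holdings grow by a fixed fraction each year and
    every terabyte held costs the same flat amount per year.
    Returns a list of (year, terabytes_held, cost_for_that_year) tuples.
    """
    projection = []
    terabytes = initial_tb
    for year in range(1, years + 1):
        # Cost for this year is based on what is held during the year.
        projection.append((year, terabytes, terabytes * cost_per_tb_year))
        # Holdings grow before the next year starts.
        terabytes *= 1 + annual_growth_rate
    return projection


if __name__ == "__main__":
    # Illustrative numbers only: 10 TB in scope, growing 50% a year,
    # at a notional 1000 currency units per TB per year.
    for year, tb, cost in project_storage_costs(10, 0.5, 1000, 3):
        print(f"year {year}: {tb:.1f} TB, cost {cost:.0f}")
```

Widening the scope shows up as a larger initial_tb or a larger
annual_growth_rate, which is how scope creep feeds through into the
recurrent costs.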
Scope creep is a problem, as people start doing new things over the
lifetime of the repository, changing the types and quantity of the data
archived and hence the ongoing costs. Unmanaged creep is a problem;
managed creep should be less of one, or at least something that could
be built into a funding model.
Managing scope and scope creep gets rid of the problem of what to do
with the data you don't archive, as at least you've hopefully taken an
active decision not to handle it (I am of course being naive in this
view but it's a good starting point for the argument). It also allows
you to manage ongoing costs - which is the real problem of IRs - they
cost money, just like libraries do.
(b) non-linear objects
Really I mean objects that don't readily fit the simple static linear
object model implicit in most repositories. One example we're dealing
with at the moment is someone who has asked us to archive an
anthropology research website which is about to go offline due to the
site's maintainer retiring. As it's been developed as a website, it's a
proper hypermedia document - archiving the pages separately destroys
much of the value, and saving it as a compressed archive destroys much
of its immediate value to researchers.
As so far it's a one-off case, we've decided to go for the pragmatic
solution of simply replicating the website for the moment, but I can't
help feeling we should be doing more in terms of indexing the content -
perhaps by adding components to the main repository as external
resources which are catalogued in the repository but not archived.
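One way to picture the "catalogued but not archived" idea is a
catalogue record that carries a flag saying whether the repository
actually holds the object. This is only a sketch: the CatalogueRecord
class, its fields and the example.org address are all hypothetical and
not taken from any real repository software.

```python
from dataclasses import dataclass


@dataclass
class CatalogueRecord:
    """A hypothetical catalogue entry for one object or site component."""
    title: str
    location: str   # where the object lives (repository path or external URL)
    archived: bool  # True only if the repository itself holds the object


def external_resources(records):
    """Return the records that are catalogued but held elsewhere."""
    return [record for record in records if not record.archived]


if __name__ == "__main__":
    # Made-up entries: one object held in the repository, one page of a
    # replicated website that is only indexed.
    records = [
        CatalogueRecord("Field recording", "repo:/audio/0001", archived=True),
        CatalogueRecord("Site index page", "http://www.example.org/index.html",
                        archived=False),
    ]
    for record in external_resources(records):
        print(record.title, "->", record.location)
```

The point of the flag is that searches cover both kinds of record, while
backup and preservation costs only apply to the ones marked as archived.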
Currently I don't have a firm opinion on this one, and I'd like to see
some discussion of this area.
-Doug
--
Doug Moncur
Information Technology Manager
Australian Institute of Aboriginal and Torres Strait Islander Studies
GPO Box 553 Canberra ACT 2601 Australia
ph: +61 2 6246 1102 | fx: +61 2 6261 4285 | wb: www.aiatsis.gov.au
>>> Bryan Lawrence 17/01/2006 20:32:11 >>>
Hi Folks
> Firstly, the notion that one 'institutional repository' should hold
> all of a university's e-objects is an absurd one, and generally
> recognized by my audiences as soon as I say it. The present state of
> software does not support such a scheme, nor are the characteristics
> of the objects anywhere near uniform. A great deal of time and money
> is wasted by people who haven't yet realized this simple fact. A
> university needs several 'e-repositories' or 'e-libraries', whatever
> you call them.
This resonated with me. I blogged about this some time ago:
http://home.badc.rl.ac.uk/lawrence/blog/2005/03/31/function_creep_and_institutional_repositories
But it's not just about software, as that blog tries to say; the
reality is that once IRs go outside documents into research data, the
concept of preservation becomes a lot about having a designated user
community and the (funded) ability to keep track of their wants and
requirements. No institutional IR is ever going to be able to migrate
all its e-objects, and shouldn't pretend that it can.
Bryan
--
Bryan Lawrence
Director of Environmental Archival and Associated Research
Head of the NCAS/British Atmospheric Data Centre
CCLRC, Rutherford Appleton Laboratory
Phone +44 1235 445012; Fax ... 5848; Web: home.badc.rl.ac.uk/lawrence