We investigated this national service idea a few years ago and decided it
wasn't really practical. There are a number of well-established
discipline-based services and national units to develop good practice, most
notably the DCC. From there we concluded that the most practical way forward was
institutional repositories using standard tools and common good practice.
Since then products like DSpace, Arkivum and Preservica have all matured and
can offer an effective hybrid cloud model for active use, sharing and
preservation. There are many other products around now too that can be used
if the DCC tools are used to establish policy and planning.
Janet (Jisc) has been working to get national frameworks for many of these
products and will respond to demand, so if you want a product why not use
Janet to help with the procurement and then one deal becomes a deal for the
whole sector.
Hope that helps
John
John K. Milner
Meadow House
Baunton
Cirencester
GL7 7BB
Tel 00 44 1285 643731
Mob +44 7836 341550
Mail to: [log in to unmask]
-----Original Message-----
From: Research Data Management discussion list
[mailto:[log in to unmask]] On Behalf Of Anna Clements
Sent: 15 October 2014 21:17
To: [log in to unmask]
Subject: Re: Research data quota takeup
Bill
Couldn't agree more on your plea for national infrastructure RaaS for the
long tail stuff which doesn't fit into existing subject specific
repositories... although think we need more of the latter too. StaaS ...
absolutely ... presumably what Arkivum and others are offering ... assuming
we can get the integration with our existing systems (DSpace, Pure, etc.) to
work OK,
Anna
______________________________________________________
Anna Clements | Head of Research Data and Information Services
University of St Andrews Library | North Street | St Andrews | KY16 9TR|
T:01334 462761 | @AnnaKClements
________________________________________
From: Research Data Management discussion list
[[log in to unmask]] on behalf of Worthington, William
[[log in to unmask]]
Sent: 15 October 2014 13:32
To: [log in to unmask]
Subject: Re: Research data quota takeup
All,
at University of Hertfordshire (UH) we have been kicking around the RDM
problem since JISCMRD 2011-2013, so I have been watching this discussion
with interest as newer heads have come to the table.
UH is following the same strategy and approach as put by Aslam at
Birmingham. It seems entirely pragmatic when you cannot put your arms
around the problem.
We have acquired ~ 100TB of tier 2 storage which will be backed up to tape
for device level recovery only (that is: we won't offer file level recovery
to individual users). This doesn't sound like a lot but given the size of
our research endeavours it is a good start from which to build a demand
driven case for investment. As Tim alluded to, we also have a couple of
research groups who could fill this overnight but these are relatively well
self served already, and not the target market. I see the big wins in terms
of mitigated risk as being with Kevin's 90-95%.
We also did a DCC DAF audit,
http://research-data-toolkit.herts.ac.uk/2012/08/data-asset-survey-results/
and although it was a fairly low turnout it was consistent with Tom's
account from Nottingham and several other JISCMRD projects, so we were
inclined to believe it. Thus, our default offer will be 50GB. However we
have established an RDM triage with the PI for each new funded award and if
that reveals a greater demand we will accommodate up to 5TB on the basis of
need. (I know - we may find the horse has bolted).
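As a rough sanity check on those numbers, here is a minimal sketch of how far the pool stretches. It assumes decimal units (1TB = 1000GB) and ignores overheads such as file system and backup reserve; the function name is illustrative only.

```python
# Back-of-envelope sketch; decimal units (1 TB = 1000 GB) are an assumption.
GB_PER_TB = 1000


def allocations(pool_tb: int, quota_gb: int) -> int:
    """Number of whole quotas of quota_gb that fit in a pool of pool_tb."""
    return (pool_tb * GB_PER_TB) // quota_gb


pool_tb = 100                  # ~100TB of tier 2 storage
default_quota_gb = 50          # default offer per funded award
max_quota_gb = 5 * GB_PER_TB   # upper limit granted on the basis of need

print(allocations(pool_tb, default_quota_gb))  # default-sized allocations
print(allocations(pool_tb, max_quota_gb))      # maximum-sized allocations
```

On these assumptions the pool covers 2000 default allocations but only 20 at the 5TB limit, which is why the triage step matters.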
For archival storage we have acquired a smidgeon (10TB) of Arkivum A-stor for
10 years and are bolting it onto our institutional repository (DSpace) in
order to support long term preservation of datasets. (Again, if we get
crushed in the rush - I see this as a good thing). For reasons too arcane
for this discussion this has taken longer than I had hoped, but we are
nearly there. But this brings us to an important point - very roughly
speaking we will spend 30k on datasets@UHRA, including twice as much on
development as we spent with Arkivum. And this is before we get into really
significant sized data. So to take up Anna's point - can the sector afford this? Even if
it can, our experience scales to several million pounds to develop a
plethora of different solutions. Seems a little inefficient to me.
Also on the point of the sustainability of us all doing our own thing -
there are two factors here: economy of scale vs. sustainability of the data
host. I have heard it expressed that funding bodies regard HEIs as far more
stable and likely to be more long lived than any national or domain specific
service. Counter this with the benefits of community of a domain specific
service and the economies of scale offered by a national storage service.
(To this RDM geek, it would be great to imagine a storage/archive service
equivalent to the JANET network which we could take for granted, like water
or air. Sadly, even-toed ungulates don't fly).
The JANET framework agreements are trying to bring some of the economies of
scale and HEI friendly T & Cs directly to individual HEIs and I think these
are a good thing. But they are only part way to storage (StaaS) or
repository as a service (RaaS) from which smaller institutions in particular
could really take benefit. I made this point at a JANET workshop on storage
in 2013 and again recently in a meeting about JISC's upcoming 'Research at
Risk' work, which as I understand it, will be service rather than project
focused. Just as some of us are taking a punt (a pragmatic approach, in
making a tentative offer, to satisfy a nebulous demand, that policy suggests
should exist) - so wouldn't it be fantastic to see a (StaaS) or (RaaS) offer
at a national level? It might just be wildly successful enough to
demonstrate demand, cost benefit, and a sustainable model.
Yours, with not enough bytes, Bill
------------------------------------------------
Dr. W J Worthington
University of Hertfordshire
T: +44 (0)1707 284000 ext. 77883
E: mailto:[log in to unmask]
On 15/10/2014 09:30, "Aslam Ghumra (IT Services, Facilities Management)"
<[log in to unmask]> wrote:
>Hi Antony,
>
>Currently we have 300TB of replicated and (in part) backed up
>storage as we have two data centres on campus. However this is just our
>toe in the water and we will need a lot more storage. We need to be
>seen to provide the storage, to create the demand, therefore
>oversubscription is the key. We would like to offer all our active
>researchers the minimum of 5TB of free work in progress storage (RDS).
>That's a lot of storage, approx. 14PB (if my sums are correct),
>however this will be phased in, but not to this amount. There will
>have to be a PR exercise in bringing in those projects deemed very
>important, which will then be used to leverage further funding from the
>University and to try and bring in monies from grant proposals (however
>that's another issue).
>For Tier1 we won't be using 'cloud' storage, however we may do for Tier2.
>We have 210TB of Tier2 which is co-located at the University of
>Nottingham, part of the MidPlus consortium.
>
>On costs, not sure but we are making the case for a sustained opex
>every year to grow the solution. We are also putting the research data
>storage on a dedicated research data network, where we can attach
>equipment that can dump large quantities of data, to the extent that
>large data transfers can be taken off the University 'user' network.
>
>Aslam Ghumra
>Research Data Management
>T: 0121 414 5877
>Skype : JanitorX
>
>***********************************************************************
>
>------------------------------
>
>Date: Tue, 14 Oct 2014 10:07:27 +0000
>From: "Antony Corfield [awc]" <[log in to unmask]>
>Subject: Re: Research data quota take up
>
>Hi Aslam, that's quite impressive, so if you have say 100 concurrent
>research projects you're able to provide 0.5 Petabytes of (RDS) storage
>for free. Does Tier 1 storage include mirroring and nightly backups or
>is this 'Cloud' storage and what do you estimate this cost is to the
>institution?
>
>Regards,
>Antony
>
>
>
>***********************************************************************