On 13/10/14 16:18, Jeremy Harrington wrote:
> Just to clarify the comment re the Institute of Cancer Research, the point I
> think Richard is referring to is that we encourage (but do not mandate) next gen
> sequencing users to throw away any large data resulting from early and
> intermediate steps in analysis and processing relatively quickly and when
> appropriate. Otherwise there's a tendency to forget about it or let it sit on
> expensive scratch disks forever.
>
> So the comment is specific to NGS data and pipelines and not general research
> data. I’d hate people to think we encourage researchers to discard research
> data generally!
>
No need to be concerned - I think you are setting a good example to others in
encouraging people to manage disposal as well as retention of data. One
paradoxical consequence of better data management is that we will inevitably
end up discarding even more data than before. The cost of generating data is
falling far faster than the cost of storing it, so more disposal is unavoidable.
We want to aim for a situation where we are in greater control of the process:
instead of losing data arbitrarily, we dispose of it in a way that's driven by
policy and can be documented. We won't always get it right, but it's better
than any of the alternatives.
Time for another shameless plug for the DCC guidance in this area:
http://www.dcc.ac.uk/resources/how-guides/appraise-select-data
It's 4 years old now but still relevant. We've got something new in the
pipeline to provide a simple checklist based on this guidance.
--
Kevin Ashley. Director, Digital Curation Centre http://www.dcc.ac.uk/
@kevingashley http://slideshare.net/kevinashley
T: +44 131 651 3823 P: DCC, Appleton Tower, Crichton St, Edinburgh EH8 9LE
M: +44 7817 402 498 DCC Helpdesk: +44 131 651 1239