I realise that I had not looked up and included the URL for the UKDA 'webinar' on data identifiers and their strategy for assigning DOIs to dynamic datasets: http://www.jisc.ac.uk/events/2012/04/webinardataidentifiers.aspx
Best,
Simon.
________________________________________
From: Simon HODSON
Sent: 03 August 2012 13:50
To: Research Data Management discussion list
Subject: RE: advice on publishing and citing data
Dear Alistair; all;
This has been an extremely interesting thread.
Going back to Alistair's original message, there seem to be a number of related questions.
1) What to do with the data associated with the recently published paper?
2) Where to host / publish such datasets?
3) How to cite (and by implication identify) a changing / periodically updated dataset.
I thought it might be helpful to go over these here:
1) What to do with the data associated with the recently published paper:
For this paper and its associated data, at least, I would suggest that you deposit both with Dryad. This solves the problem of citation and identification. As others have commented, Dryad explicitly associates the data with the paper, suggests a form of citation for both, and provides a DOI for the data.
This would seem sensible for the current article and its data, but there are, as you point out, more ongoing issues. I am pretty sure, as Mummi is, that standalone submissions are possible. Contact Todd Vision <[log in to unmask]> and Ryan Scherle <[log in to unmask]>
2) Where to host / publish such datasets?
Your project is ongoing and the evolving dataset is a key output. Ideally this should be deposited in a community/subject-based data archive, or in an institutional data repository.
There is keen interest in quick data publication and in data papers, as you suggest. However, for the sort of exercise you are involved in, I do wonder whether publishing the updated dataset to a repository of record is a more workable solution in the long term. Prima facie, it looks as if the KNB, suggested by Matt Jones, might fit the bill.
For the record, there are a number of current and emerging endeavours to list such data archives:
DataCite: http://datacite.org/repolist
DataBib: http://databib.org/
Re3Data: http://www.re3data.org/
In my view, any significant centre of research or research institution should think either about contributing to the development and running of a community data archive or running its own data repository. Where community data archives do not exist, research institutions should run their own. The data created by research projects are a valuable asset and should be curated, published and preserved, deserving as much attention as the published article. Has the Centre for Genomics and Global Health considered running its own data repository?
For the latter option of running a data archive for the centre, at least as an interim measure, software is available. The software produced by the DataFlow project may be of interest: http://www.dataflow.ox.ac.uk/
Relatedly, the work of the SMDMRD project, using some DataFlow software with a DSpace repository, could be useful: http://rdm.c4dm.eecs.qmul.ac.uk/category/project/smdmrd
CKAN is widely used for government data http://ckan.org/instances/ and the JISC Orbital project is currently looking at its suitability for their institution http://orbital.blogs.lincoln.ac.uk/2012/07/26/orbital-team-meeting-notes-26-07-12/
All the projects in the JISC Managing Research Data Programme are looking at these issues from the perspectives of universities and research centres: http://www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx
Finally...
3) How to cite (and by implication identify) a changing / periodically updated dataset.
I would also, like Monica, point to the DCC guide on How to Cite Datasets and Link to Publications:
http://www.dcc.ac.uk/resources/how-guides/cite-datasets
There is an in-depth analysis of various citation styles in Alex Ball's presentation http://www.bl.uk/aboutus/stratpolprog/digi/datasets/workshoparchive/ball-metadata-citation.pdf given at a recent workshop on data citation and metadata at the British Library http://www.bl.uk/aboutus/stratpolprog/digi/datasets/workshoparchive/archive.html
For the use of DOIs with dynamic and updated datasets, I suggest listening to this 'webinar' from the UKDA http://www.jisc.ac.uk/events/2012/04/webinardataidentifiers.aspx and also Louise Corti's presentation from the same BL DataCite workshop series: http://www.bl.uk/aboutus/stratpolprog/digi/datasets/workshoparchive/LousieCortin_IdentifiersForTheUKDA_May2012.pdf
You might also want to have a look at Ryan Scherle's presentation on 'Creating Citable Data Identifiers' from the recent Open Repositories 2012 Conference: https://www.conftool.net/or2012/index.php?page=browseSessions&form_session=18&CTSID_OR2012=S6JZefzPsEZJz8e8nRmUPju-WK2
The British Library team will be running a small technical workshop on assigning DataCite DOIs on 10 September: contact 'Datasets' <[log in to unmask]> for details.
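As a rough illustration of the citation question in 3), here is a minimal Python sketch of one way to build a citation string for a specific version of an updated dataset. It loosely follows the Creator (Year): Title. Version. Publisher. Identifier pattern, with an optional access date for data that keeps changing. The class, the function and every value (including the DOI, which uses a test-style prefix) are hypothetical placeholders for illustration, not anything prescribed by Dryad, DataCite or the UKDA.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatasetSnapshot:
    """One citable version of a dataset (all fields are illustrative)."""
    creators: str
    year: int
    title: str
    version: str
    publisher: str
    doi: str  # hypothetical placeholder DOI, not a registered identifier


def format_citation(snapshot: DatasetSnapshot,
                    accessed: Optional[date] = None) -> str:
    """Build a citation string that names the version explicitly.

    The access date is optional; it is mainly useful when the cited
    identifier points at data that may change after citation.
    """
    citation = (
        f"{snapshot.creators} ({snapshot.year}): {snapshot.title}. "
        f"Version {snapshot.version}. {snapshot.publisher}. "
        f"https://doi.org/{snapshot.doi}"
    )
    if accessed is not None:
        citation += f" (accessed {accessed.isoformat()})"
    return citation


# Hypothetical example values only.
snapshot = DatasetSnapshot(
    creators="Example, A.",
    year=2012,
    title="Example variation dataset",
    version="2.0",
    publisher="Example Repository",
    doi="10.5072/example-dataset",
)
print(format_citation(snapshot, accessed=date(2012, 8, 3)))
```

Whether each release gets its own DOI, or one DOI is combined with a version and access date as here, is exactly the policy question the presentations above address; the sketch takes no position on it.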
Best wishes,
Simon.
*****
Programme Website: http://bit.ly/jiscmrd2011-13
Community Discussion List: [log in to unmask]
Blog: http://researchdata.jiscinvolve.org/
Programme Tag: #jiscmrd
*****
Twitter, Skype: simonhodson99
Calendar: http://bit.ly/simonhodson99-calendar
*****
Dr Simon HODSON
Programme Manager – Managing Research Data
JISC Executive
Brettenham House (South Entrance)
5 Lancaster Place
London WC2E 7EN
E: [log in to unmask]
M1: +44 (0) 7545 524 009
T: +44 (0) 203 006 6071
________________________________________
From: Research Data Management discussion list [[log in to unmask]] On Behalf Of Monica Duke [[log in to unmask]]
Sent: 03 August 2012 13:15
To: [log in to unmask]
Subject: Re: advice on publishing and citing data
Hi Alistair,
First of all, thank you for your question, as your description of the problem is a really helpful contribution to the discussion of data citation requirements. You've had some helpful answers offering possible solutions. I don't know how extensively you had researched the area of data citation already (so apologies if any of the below is obvious). I was going to answer from a slightly different perspective.
I wanted to suggest that you seem to have identified two slightly different, albeit connected, requirements: the need to archive your data persistently, and the need to reference and cite it persistently. Arguably, as you somewhat imply, aiming for the latter requirement does not make a whole lot of sense without trying to satisfy the former. Some systems (like the ones suggested) may offer you a solution to both; however, you could separate out the two requirements and consider separate solutions. I wasn't sure if the 'archiving' of the data was already met through the MalariaGen system - was it mainly persistent citation and lightweight paper solutions that you were looking for? There is some further general background on infrastructure and identification/citation mechanisms in the DCC guide on data citation in case you have not seen it already (disclosure: I am a co-author)[1]. The publishing of a data paper that you seek (i.e. the narrative to go with the data) might be yet a third, separate but connected, requirement.
As mentioned in the DCC guide, in the JISC-funded SageCite project [2] the focus for citation was more on workflows, and we worked with the Taverna team, using myExperiment to store workflows and experiment with DOIs. Sage Bionetworks, who were partners in that work, offer the Synapse platform [3], which was still in early development when the project ended. I'm not sure yet whether it is at a stage where it can meet your citation needs, but you could have a look (citation and attribution are of interest to Sage Bionetworks, hence the partnership in SageCite).
The Nature Genetics paper [4] that describes microattribution as a way to attribute observations of genetic variation may also be of interest. Open Network Biology [5] focuses on network-based models and may be too specific for your needs, but it is also worth a mention for its more general interest in the types of solutions you are after.
Hope some of that is of interest - more generally, if not specifically to provide a solution to you.
Best wishes,
Monica
[1] Ball, A. & Duke, M. (2011). How to Cite Datasets and Link to Publications. DCC How-to Guides. Edinburgh: Digital Curation Centre.
http://www.dcc.ac.uk/resources/how-guides/cite-datasets
[2] http://blogs.ukoln.ac.uk/sagecite/
[3] https://synapse.sagebase.org/
[4] http://www.nature.com/ng/journal/v43/n4/full/ng.785.html
[5] http://www.opennetworkbiology.com/
On 3 Aug 2012, at 11:38, Brian Hole wrote:
Hi Alistair,
Just a quick note that we're in the process of expanding the data journal platform that hosts the Journal of Open Archaeology Data, which Norman mentions, to include among other things data journals for epidemiology/public health, climatology, clinical trials, ecology and psychology. I'll send you an offline message regarding the epidemiology data journal, and would welcome contact from anyone else interested in these journals as well!
Best,
Brian
--
Brian Hole
Ubiquity Press Ltd.
www.ubiquitypress.com
www.twitter.com/ubiquitypress
--
On 3 August 2012 10:45, Hole, Brian <[log in to unmask]> wrote:
________________________________________
From: Norman Gray [[log in to unmask]]
Sent: 02 August 2012 17:22
To: Research Data Management discussion list
Cc: Hole, Brian
Subject: Re: advice on publishing and citing data
Alistair, hello.
On 2012 Aug 2, at 12:49, Alistair Miles wrote:
> I was hoping for some sort of data journal that allowed us to publish
> super-lightweight data papers that are basically an abstract, some
> metadata, and the dataset itself, and then gave us a DOI or some other
> persistent and trackable means of citation. But I couldn't find
> anything appropriate.
That more-or-less describes "Journal of Open Archaeology Data" <http://openarchaeologydata.metajnl.com/> (except that it's not in the right area).
The premise of this journal is that it publishes exactly the sort of lightweight data paper you're describing.
Brian Hole (cc-ed, in case he's not on this list) might know of analogous journals in the right area.
Best wishes,
Norman
--
Norman Gray : http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK