Hi all,
Some of the existing German tools for DMPs were presented during a DINI/nestor workshop in May. These included the tool used at Bielefeld University (available only to students and researchers there, as far as I know) and an open source tool created at TU Berlin. There is currently a project (involving the Leibniz Institute for Astrophysics and the State and University Library Göttingen, among others) to create a generic tool for the German research landscape.
The slides from the workshop (in German) are available at http://www.forschungsdaten.org/index.php/DINI-nestor-WS2.
All best from Cologne
Astrid
--------
Dr. Astrid Recker, M.L.I.S.
GESIS - Leibniz Institute for the Social Sciences
International Data Infrastructures
Unter Sachsenhausen 6-8
D-50667 Köln
www.gesis.org, www.gesis.org/en/admtc
Tel: +49 (0) 221 47694 493
E-Mail: [log in to unmask]
@CESSDAtraining
------------------------------
Date: Thu, 13 Aug 2015 14:23:23 +0000
From: Anna Clements <[log in to unmask]>
Subject: Re: Data Management Planning tools
I love these and the approach: simple and direct, but still comprehensive. I assume they are also more intelligible and relevant to researchers than the templates we currently have from the different funders.
I suggest we lobby the RCs to consider this approach, as the current situation is overly complex and is engendering a fair amount of skepticism and push-back from researchers.
Anna
______________________________________________________
Anna Clements | Assistant Director (Digital Research)
University of St Andrews Library | North Street | St Andrews | KY16 9TR|
T:01334 462761 | @AnnaKClements
________________________________
From: Research Data Management discussion list <[log in to unmask]> on behalf of Sarah Jones (HATII) <[log in to unmask]>
Sent: 13 August 2015 15:01
To: [log in to unmask]
Subject: Re: Data Management Planning tools
Thanks Chris
Yes, the questions are still there and are really useful, as is David's comparative analysis.
He also put together a webform/tool for completing the 20 questions, though, and that's what's down. From recollection, it was a simple interface (akin to a Google Docs survey) with a couple of export options.
All best
Sarah
________________________________
From: Research Data Management discussion list [[log in to unmask]] on behalf of Chris Rawlings [[log in to unmask]]
Sent: 13 August 2015 14:53
To: [log in to unmask]
Subject: Re: Data Management Planning tools
David Shotton’s questions are still on the DMP wordpress site here:
https://datamanagementplanning.wordpress.com/2012/03/07/twenty-questions-for-research-data-management/
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Sarah Jones (HATII)
Sent: 13 August 2015 14:41
To: [log in to unmask]
Subject: Re: Data Management Planning tools
Hi Linda,
DMPonline and DMPTool are the two main generic ones. The Canadians are setting up a service called DMP Builder too, which is based on the code from DMPonline, see: https://dmp.library.ualberta.ca
I've come across a couple of others, but these are typically for one uni or discipline and aren't always open to others, e.g.:
- IEDA Data Management Plan tool http://www.iedadata.org/compliance/plan
- Manchester University DMP tool - http://www.library.manchester.ac.uk/services-and-support/staff/research/services/research-data-management/data-management-planning-tool
- I saw a demo of a DMP tool by Bielefeld University at RDA but I don't think this is published openly. I can't find a link anyhow
- David Shotton also had a webform for his 20 questions for a DMP but this now gives a 404 error http://www.miidi.org/dmp
All best
Sarah
________________________________
From: Research Data Management discussion list [[log in to unmask]] on behalf of Kerr, Linda [[log in to unmask]]
Sent: 13 August 2015 10:30
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Data Management Planning tools
Hello
We are refreshing our institutional advice on creating data management plans, in particular for EPSRC applicants, and will recommend a DMP tool to researchers. We know of the excellent DMPonline, but I wonder if anyone has reviewed other tools, so that we can present options.
I found one for US projects, https://dmptool.org/, and some excellent support pages on the funders' websites with checklists.
Is there a survey of tools anywhere, I wonder? Happy to summarise back to the list.
Regards
Linda
Linda Kerr
Research Support Librarian, Heriot-Watt University, Edinburgh, EH14 4AS
0131 451 3572
[log in to unmask]<mailto:[log in to unmask]>
http://www.hw.ac.uk/is/research-support.htm
New HEFCE Open Access website
http://www.hefce.ac.uk/rsrch/oa/FAQ/
Rothamsted Research is a company limited by guarantee, registered in England at Harpenden, Hertfordshire, AL5 2JQ under the registration number 2393175 and a not for profit charity number 802038.
------------------------------
Date: Thu, 13 Aug 2015 14:28:19 +0000
From: "Ligios, Linda" <[log in to unmask]>
Subject: Correction in PERICLES FP7 newsletter - August 2015
*Apologies for cross posting*
We have been notified of a scam conference mentioned in the August issue of the PERICLES FP7 newsletter. We have now removed the event and listed the correct one. Please accept our apologies for any inconvenience. Here is the new link: http://eepurl.com/bvYV81 <http://t.co/koeRdpxVA1>
We hope you enjoy our new issue which focuses on the work done in the area of ontologies and ecosystem modelling.
PERICLES<http://pericles-project.eu/main> is a four-year Project (2013-2017) funded by the European Union which aims to address the challenge of ensuring that digital content remains accessible in an environment that is subject to continual change.
Kind regards,
Linda Ligios
EU Communications Coordinator, PERICLES
Department of Digital Humanities
King's College London
Email: [log in to unmask]<mailto:[log in to unmask]>
http://pericles-project.eu/
------------------------------
Date: Thu, 13 Aug 2015 14:53:59 +0000
From: Andrew MacLellan <[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Thanks to Rachael and Lucy, that’s helpful for me. It makes sense that the ability to cite data unambiguously should be prioritised.
One small follow-on query though: would it be problematic if a researcher routinely created separate datasets with separate DOIs in this way, and so appeared to have deposited twice as many distinct datasets as they actually have? I can imagine this causing headaches for universities trying to measure and reward data sharing. Is there an easy work-around for this?
Thanks,
Andrew
Andrew Maclellan
Research Data Support Officer | Research Data Management and Sharing
Research and Knowledge Exchange Services
University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE
Tel: 0141 548 4581
Email: [log in to unmask]<mailto:[log in to unmask]>
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of datasets
Sent: 12 August 2015 17:18
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Nicola,
I would also recommend the UK Data Service approach here. There is no problem with having two datasets that are separately cite-able with separate DOIs even if there is a large amount of overlap – the small area without overlap can create a large difference in analysis of the two sets of data.
But if this wasn’t technically possible in your system, and you were only able to assign one DOI for some reason, I think that the DOI, and so the metadata you provide, would ideally describe the full dataset that also includes the sensitive data. I say that because I would see the more freely available anonymised data as a subset of the full dataset, the full dataset being the available data plus the identifying information. It would then be for citing authors to highlight the subset of the data they actually used (whether they would or not in reality is the reason having two DOIs would be a better approach). It would be trickier for a citing author who used the wider set if it was the other way around.
I caveat the last paragraph by stating that those are my views, not official DataCite guidance!
Thanks, Rachael.
Rachael Kotarski
Data Services and Content Lead
The British Library, 96 Euston Road, London NW1 2DB
Tel: 020 7412 7167 | Email: [log in to unmask]<mailto:[log in to unmask]>
| Datasets@BL<http://www.bl.uk/datasets> | DataCite<http://www.datacite.org/> | Twitter<http://twitter.com/DataCiteUK> |
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Johnson, Lucy A
Sent: 12 August 2015 16:57
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Hi Nicola
Hope I can help with this one.
Here in the UK Data Service we do just that – have two DOIs if the dataset has been changed in some way. Our thinking is that dataset A, which contains the open access content, is different to dataset B, which contains additional, sensitive material. If a researcher wanted to trace back the data that had been cited in a paper somewhere, they would want to know which of these two datasets it came from. Hence the need for two DOIs.
Here is an example of this in action:
Quarterly Labour Force Survey, January – March 2015 (http://discover.ukdataservice.ac.uk/catalogue/?sn=7725), DOI = 10.5255/UKDA-SN-7725-1
Quarterly Labour Force Survey, January – March 2015: Special Licence Access (http://discover.ukdataservice.ac.uk/catalogue/?sn=7726), DOI = 10.5255/UKDA-SN-7726-1
The latter contains extra variables and hence is subject to more restrictive access conditions. There are other examples of this in our catalogue, moving along the spectrum of access, into secure/controlled as well.
Hope that helps,
Lucy
___________________________________
Lucy Johnson
Functional Director, Data Access
___________________________________
T +44(0) 1206 872008
E [log in to unmask]<mailto:[log in to unmask]>
W ukdataservice.ac.uk
___________________________________
UK Data Service
UK Data Archive
University of Essex
___________________________________
Legal Disclaimer: Any views expressed by the sender of this message
are not necessarily those of the UK Data Service or the UK Data Archive.
This email and any files with it are confidential and intended solely for
the use of the individual(s) or entity to whom they are addressed.
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson
Sent: 12 August 2015 16:21
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Thanks for the responses – Kate, I particularly liked the way you’ve set out your dataset information, it’s really clear and easy to use.
Does anyone out there have any thoughts or experience in creating more than one DOI for a dataset, just in case this might be a better way forward (although I currently think option B is the way to go!)?
Regards
Nicola
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Katherine McNeill
Sent: 11 August 2015 18:26
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Nicola,
I can share an example of model B in action for you. It might be the same with other repositories, but model B that you described is the one used by the ICPSR social science data archive (essentially the UK Data Service of the U.S.). For those studies that have restricted sets of data, there’s a note to that effect and instructions for requesting access. E.g., this study http://doi.org/10.3886/ICPSR34314.v3 has a note near the top entitled Access Notes.
Sincerely,
Kate McNeill
___________________________________
Katherine McNeill<http://libguides.mit.edu/profiles/mcneillh>
Program Head, Data Management Services
Massachusetts Institute of Technology
[log in to unmask]<mailto:[log in to unmask]> | 617-253-0787
Data Management Services<http://libraries.mit.edu/data-management>
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Andrew MacLellan
Sent: Tuesday, August 11, 2015 12:09 PM
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Hi Nicola,
Assuming the participants had given consent for personal data to be shared under non-disclosure agreements only, and that there is some kind of significant value to the personal data, I would go with option B. It depends a bit on the dataset, but I think typically, an anonymised dataset is sufficient for most purposes.
If this is a situation where there is clear value in being able to identify the participants or other people discussed in the interviews, and it’s likely that there will be requests to access the personal data, then I suppose it might make sense to go for option A. I’m not a DOI expert though, so perhaps someone else on the list would have something to say about creating separate DOIs for such similar datasets.
I don’t fully understand option C so won’t comment on that.
Hope that helps,
Andrew
Andrew Maclellan
Research Data Support Officer | Research Data Management and Sharing
Research and Knowledge Exchange Services
University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE
Tel: 0141 548 4581
Email: [log in to unmask]<mailto:[log in to unmask]>
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson
Sent: 11 August 2015 16:26
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Anonymised and non-anonymised datasets
Dear All
We have just been speaking to a researcher who wants to publish a dataset which has a number of different file-types including some interview transcripts. He has two versions of the dataset -
1 contains personal data within the interview transcripts – this version of the dataset could be shared subject to a contractual non-disclosure agreement
2 contains all the same data, but the interview transcripts have been anonymised – this version of the dataset could be shared under a Creative Commons licence
We are currently considering the following options:
a) Create two versions of the “dataset description” with two separate DOIs – one with open access, the other requiring contractual terms to be discussed to allow release
b) Make public only the version of the dataset with the anonymised data, with a note in the description that external researchers should contact the University separately to request access to the version containing personal data and deal with it manually
c) Come up with some kind of technical solution/change to our system to allow us to give two options to the requestor (and try to find some clever technical way of linking to the different files). However, this might be quite a lot of work for something that might not happen regularly
I wondered whether anyone else had come across this issue and had a good solution for how to manage it?
Many thanks
Nicola
Nicola Dawson
Business Change Manager
Research Data and Information Management
University IT Services
Cardiff University
39 Park Place
Cardiff
CF10 3BB
Tel: +44(0)29 2087 5891
Email: [log in to unmask]<mailto:[log in to unmask]>
Nicola Dawson
Rheolwr Newid Busnes
Rheoli Data a Gwybodaeth Ymchwil
Gwasanaethau TG y Brifysgol
Prifysgol Caerdydd
39 Plas y Parc
Caerdydd
CF10 3BB
Ffôn : +44(0)29 2087 5891
Ebost: [log in to unmask]<mailto:[log in to unmask]>
******************************************************************************************************************
Experience the British Library online at www.bl.uk<http://www.bl.uk/>
The British Library’s latest Annual Report and Accounts : www.bl.uk/aboutus/annrep/index.html<http://www.bl.uk/aboutus/annrep/index.html>
Help the British Library conserve the world's knowledge. Adopt a Book. www.bl.uk/adoptabook<http://www.bl.uk/adoptabook>
The Library's St Pancras site is WiFi - enabled
*****************************************************************************************************************
The information contained in this e-mail is confidential and may be legally privileged. It is intended for the addressee(s) only. If you are not the intended recipient, please delete this e-mail and notify the [log in to unmask]<mailto:[log in to unmask]> : The contents of this e-mail must not be disclosed or copied without the sender's consent.
The statements and opinions expressed in this message are those of the author and do not necessarily reflect those of the British Library. The British Library does not take any responsibility for the views of the author.
*****************************************************************************************************************
Think before you print
------------------------------
Date: Thu, 13 Aug 2015 15:08:04 +0000
From: "Johnson, Lucy A" <[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Hi Andrew
I suppose you could argue that perhaps the two datasets should be counted twice, if one has had more value added to it (rather than one having had variables removed)? But here at the UK Data Service, we do link these sorts of datasets together via their metadata so that any connections are transparent. The metadata for one will reference the other.
A related initiative that you might find interesting is IRUS. Have you seen this? There is more information here: http://www.irus.mimas.ac.uk/. It is a metrics service which counts downloaded content, including datasets, from participating UK institutional repositories. The caveat is that these two datasets would probably still be counted separately here!
Best
Lucy
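
As a rough illustration of how cross-referencing metadata could help with the counting question, here is a minimal Python sketch. It assumes each record carries a hypothetical "related" list of DOIs taken from its metadata; that structure, the function name and the third DOI are invented for illustration, and this is not how the UK Data Service or IRUS actually count anything. Records that reference each other, directly or through a chain, are grouped and counted once.

```python
from collections import defaultdict


def count_distinct_datasets(records):
    """Count groups of records that are linked through their metadata.

    Each record is a dict with a "doi" and an optional "related" list of
    DOIs (a hypothetical structure, assumed for this sketch only).
    """
    # Build an undirected graph: a link in either direction joins two DOIs.
    neighbours = defaultdict(set)
    for record in records:
        doi = record["doi"]
        neighbours[doi]  # make sure unlinked records still get a node
        for other in record.get("related", []):
            neighbours[doi].add(other)
            neighbours[other].add(doi)
    graph = dict(neighbours)

    # Walk the graph and count connected groups.
    seen = set()
    groups = 0
    for start in graph:
        if start in seen:
            continue
        groups += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph[node] - seen)
    return groups


records = [
    # The two Labour Force Survey DOIs quoted earlier in this thread.
    {"doi": "10.5255/UKDA-SN-7725-1", "related": ["10.5255/UKDA-SN-7726-1"]},
    {"doi": "10.5255/UKDA-SN-7726-1", "related": ["10.5255/UKDA-SN-7725-1"]},
    # A made-up, unrelated DOI, for illustration only.
    {"doi": "10.1234/example-3", "related": []},
]

print(len(records), count_distinct_datasets(records))  # prints: 3 2
```

Both DOIs stay separately citable; only the tally used for measuring deposits changes, and whether linked versions should count once or twice remains a policy decision rather than a technical one.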
------------------------------
Date: Thu, 13 Aug 2015 15:26:26 +0000
From: James Davenport <[log in to unmask]>
Subject: Re: Anonymised and non-anonymised datasets
Is this more of a problem than the researcher who publishes an "extended abstract" and then the journal paper, or the researcher who publishes the same paper in multiple languages?
James Davenport
National Teaching Fellow 2014
Hebron & Medlock Professor of Information Technology, University of Bath
OpenMath Content Dictionary Editor
Director of Studies EPSRC Doctoral Taught Course Centre for HPC
Chair, IMU Committee on Electronic Information and Communication
Vice-President and Academy Trustee, British Computer Society
________________________________
From: Research Data Management discussion list <[log in to unmask]> on behalf of Andrew MacLellan <[log in to unmask]>
Sent: Thursday, August 13, 2015 3:53 PM
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Thanks to Rachael and Lucy, that’s helpful for me. It makes sense that the ability to cite data unambiguously should be prioritised.
One small follow on query though: would it be problematic if this method of creating separate datasets with separate DOI’s was routinely carried out by a researcher, and then that researcher would appear to have deposited twice as many distinct datasets as they actually have? I can imagine this causing headaches for Universities trying to measure and reward data sharing. Is there an easy work-around for this?
Thanks,
Andrew
Andrew Maclellan
Research Data Support Officer | Research Data Management and Sharing
Research and Knowledge Exchange Services
University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE
Tel: 0141 548 4581
Email: [log in to unmask]<mailto:[log in to unmask]>
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of datasets
Sent: 12 August 2015 17:18
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Nicola,
I would also recommend the UK Data Service approach here. There is no problem with having two datasets that are separately cite-able with separate DOIs even if there is a large amount of overlap – the small area without overlap can create a large difference in analysis of the two sets of data.
But if this wasn’t technically possible in your system, and you were only able to assign one DOI for some reason, I think that the DOI and so the metadata you provide would ideally describe the full dataset that also includes the sensitive data. I say that because I would see the more freely available anonymised data as a sub-set of the full dataset - the full dataset being the available data plus the identifying information. It would then be for citing authors to highlight the subset of the data they actually used (whether they would or not in reality is the reason having two DOIs would be a better approach). It would be trickier for a citing author who used the wider set if it was the other way around.
I caveat the last para stating that those are my views, not official DataCite guidance!
Thanks, Rachael.
Rachael Kotarski
Data Services and Content Lead
The British Library, 96 Euston Road, London NW1 2DB
Tel: 020 7412 7167 | Email: [log in to unmask]<mailto:[log in to unmask]>
| Datasets@BL<http://www.bl.uk/datasets> | DataCite<http://www.datacite.org/> | Twitter<http://twitter.com/DataCiteUK> |
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Johnson, Lucy A
Sent: 12 August 2015 16:57
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Hi Nicola
Hope I can help with this one.
Here in the UK Data Service we do just that: we assign two DOIs if the dataset has been changed in some way. Our thinking is that dataset a, which contains the open access content, is different from dataset b, which contains additional, sensitive material. If a researcher wanted to trace back data that had been cited in a paper somewhere, they would want to know which of these two datasets it came from. Hence the need for two DOIs.
Here is an example of this in action:
Quarterly Labour Force Survey, January – March 2015 (http://discover.ukdataservice.ac.uk/catalogue/?sn=7725), DOI = 10.5255/UKDA-SN-7725-1
Quarterly Labour Force Survey, January – March 2015: Special Licence Access (http://discover.ukdataservice.ac.uk/catalogue/?sn=7726), DOI = 10.5255/UKDA-SN-7726-1
The latter contains extra variables and hence is subject to more restrictive access conditions. There are other examples of this in our catalogue, moving along the spectrum of access, into secure/controlled as well.
Hope that helps,
Lucy
___________________________________
Lucy Johnson
Functional Director, Data Access
___________________________________
T +44(0) 1206 872008
E [log in to unmask]
W ukdataservice.ac.uk
___________________________________
UK Data Service
UK Data Archive
University of Essex
___________________________________
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson
Sent: 12 August 2015 16:21
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Thanks for the responses – Kate, I particularly liked the way you’ve set out your dataset information, it’s really clear and easy to use.
Does anyone out there have any thoughts on, or experience of, creating more than one DOI for a dataset, just in case this might be a better way forward? (Although I currently think option B is the way to go!)
Regards
Nicola
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Katherine McNeill
Sent: 11 August 2015 18:26
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Nicola,
I can share an example of model B in action for you. It might be the same with other repositories, but model B that you described is the one used by the ICPSR social science data archive (essentially the UK Data Service of the U.S.). For those studies that have restricted sets of data, there’s a note to that effect and instructions for requesting access. E.g., this study http://doi.org/10.3886/ICPSR34314.v3 has a note near the top entitled Access Notes.
Sincerely,
Kate McNeill
___________________________________
Katherine McNeill<http://libguides.mit.edu/profiles/mcneillh>
Program Head, Data Management Services
Massachusetts Institute of Technology
[log in to unmask] | 617-253-0787
Data Management Services<http://libraries.mit.edu/data-management>
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Andrew MacLellan
Sent: Tuesday, August 11, 2015 12:09 PM
To: [log in to unmask]
Subject: Re: Anonymised and non-anonymised datasets
Hi Nicola,
Assuming the participants had given consent for personal data to be shared under non-disclosure agreements only, and that there is some kind of significant value to the personal data, I would go with option B. It depends a bit on the dataset, but I think typically, an anonymised dataset is sufficient for most purposes.
If this is a situation where there is clear value in being able to identify the participants or other people discussed in the interviews, and it’s likely that there will be requests to access the personal data, then I suppose it might make sense to go for option A. I’m not a DOI expert though, so perhaps someone else on the list would have something to say about creating separate DOIs for such similar datasets.
I don’t fully understand option C so won’t comment on that.
Hope that helps,
Andrew
Andrew Maclellan
Research Data Support Officer | Research Data Management and Sharing
Research and Knowledge Exchange Services
University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE
Tel: 0141 548 4581
Email: [log in to unmask]
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson
Sent: 11 August 2015 16:26
To: [log in to unmask]
Subject: Anonymised and non-anonymised datasets
Dear All
We have just been speaking to a researcher who wants to publish a dataset which has a number of different file types, including some interview transcripts. He has two versions of the dataset:
1 contains personal data within the interview transcripts; this version of the dataset could be shared subject to a contractual non-disclosure agreement
2 contains all the same data, but the interview transcripts have been anonymised; this version of the dataset could be shared under a Creative Commons licence
We are currently considering the following options:
a) Create two versions of the “dataset description” with two separate DOIs – one with open access, the other requiring contractual terms to be discussed to allow release
b) Make public only the version of the dataset with the anonymised data, with a note in the description that external researchers should contact the University separately to request access to the version containing personal data and deal with it manually
c) Come up with some kind of technical solution/change to our system to allow us to give two options to the requestor (and try to find some clever technical way of linking to the different files); however, this might be quite a lot of work for something that might not happen regularly
I wondered whether anyone else had come across this issue and had a good solution for how to manage it?
Many thanks
Nicola
Nicola Dawson
Business Change Manager
Research Data and Information Management
University IT Services
Cardiff University
39 Park Place
Cardiff
CF10 3BB
Tel: +44(0)29 2087 5891
Email: [log in to unmask]
------------------------------
End of RESEARCH-DATAMAN Digest - 12 Aug 2015 to 13 Aug 2015 (#2015-28)
**********************************************************************