JISCMail - RESEARCH-DATAMAN Archives


RESEARCH-DATAMAN@JISCMAIL.AC.UK

Subject:
Re: Data Management Planning tools
From:
"Recker, Astrid" <[log in to unmask]>
Reply-To:
Research Data Management discussion list <[log in to unmask]>
Date:
Fri, 14 Aug 2015 06:32:49 +0000
Hi all,



Some of the existing German tools for DMPs were presented at a DINI/nestor workshop in May. These included the tool used at Bielefeld University (available only to students and researchers there, as far as I know) and an open source tool created at TU Berlin. There is currently a project (involving the Leibniz Institute for Astrophysics and the State and University Library Göttingen, among others) to create a generic tool for the German research landscape.



The slides from the workshop (in German) are available at http://www.forschungsdaten.org/index.php/DINI-nestor-WS2.



All best from Cologne

Astrid





--------

Dr. Astrid Recker, M.L.I.S.

GESIS - Leibniz Institute for the Social Sciences

International Data Infrastructures

Unter Sachsenhausen 6-8 

D-50667 Köln

www.gesis.org, www.gesis.org/en/admtc

Tel: +49 (0) 221 47694 493

E-Mail: [log in to unmask] 

@CESSDAtraining







------------------------------



Date:    Thu, 13 Aug 2015 14:23:23 +0000

From:    Anna Clements <[log in to unmask]>

Subject: Re: Data Management Planning tools



I love these and the approach: simple and direct, but still comprehensive. I assume they are also more intelligible and relevant to researchers than the templates we currently have from the different funders.





I suggest we lobby the RCs to consider this approach, as the current situation is overly complex and is engendering a fair amount of skepticism and push-back from researchers.





Anna







______________________________________________________

Anna Clements | Assistant Director (Digital Research)



University of St Andrews Library | North Street | St Andrews | KY16 9TR|

T:01334 462761 | @AnnaKClements





________________________________

From: Research Data Management discussion list <[log in to unmask]> on behalf of Sarah Jones (HATII) <[log in to unmask]>

Sent: 13 August 2015 15:01

To: [log in to unmask]

Subject: Re: Data Management Planning tools



Thanks Chris



Yes, the questions are still there and are really useful. As is David's comparative analysis.



He also put together a webform / tool for completing the 20 questions, though. That's what's down. From recollection, it was a simple interface (akin to a Google Docs survey) with a couple of export options.



All best



Sarah

________________________________

From: Research Data Management discussion list [[log in to unmask]] on behalf of Chris Rawlings [[log in to unmask]]

Sent: 13 August 2015 14:53

To: [log in to unmask]

Subject: Re: Data Management Planning tools





David Shotton’s questions are still on the DMP wordpress site here:



https://datamanagementplanning.wordpress.com/2012/03/07/twenty-questions-for-research-data-management/







From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Sarah Jones (HATII)

Sent: 13 August 2015 14:41

To: [log in to unmask]

Subject: Re: Data Management Planning tools







Hi Linda,







DMPonline and DMPTool are the two main generic ones. The Canadians are setting up a service called DMP Builder too, which is based on the code from DMPonline, see: https://dmp.library.ualberta.ca







I've come across a couple of others, but these are typically for one uni or discipline and aren't always open to others e.g.







- IEDA Data Management Plan tool http://www.iedadata.org/compliance/plan



- Manchester University DMP tool - http://www.library.manchester.ac.uk/services-and-support/staff/research/services/research-data-management/data-management-planning-tool



- I saw a demo of a DMP tool by Bielefeld University at RDA but I don't think this is published openly. I can't find a link anyhow



- David Shotton also had a webform for his 20 questions for a DMP but this now gives a 404 error http://www.miidi.org/dmp







All best







Sarah



________________________________



From: Research Data Management discussion list [[log in to unmask]] on behalf of Kerr, Linda [[log in to unmask]]

Sent: 13 August 2015 10:30

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Data Management Planning tools



Hello







We are refreshing our institutional advice on creating data management plans, in particular for EPSRC applicants, and will recommend a DMP tool to researchers. We know of the excellent DMPonline, but I wonder if anyone has reviewed other tools, so that we can present options.







I found one for US projects, https://dmptool.org/, and some excellent support pages with checklists on the funders' websites.







Is there a survey of tools anywhere, I wonder?  Happy to summarise back to the list.







Regards







Linda











Linda Kerr



Research Support Librarian, Heriot-Watt University, Edinburgh, EH14 4AS



0131 451 3572



[log in to unmask]<mailto:[log in to unmask]>







http://www.hw.ac.uk/is/research-support.htm







New HEFCE Open Access website



http://www.hefce.ac.uk/rsrch/oa/FAQ/







Rothamsted Research is a company limited by guarantee, registered in England at Harpenden, Hertfordshire, AL5 2JQ under the registration number 2393175 and a not for profit charity number 802038.



------------------------------



Date:    Thu, 13 Aug 2015 14:28:19 +0000

From:    "Ligios, Linda" <[log in to unmask]>

Subject: Correction in PERICLES FP7 newsletter - August 2015



*Apologies for cross posting*



We have been notified of a scam conference mentioned in the August issue of the PERICLES FP7 newsletter. We have now removed the event and listed the correct one. Please accept our apologies for any inconvenience. Here is the new link: http://eepurl.com/bvYV81 <http://t.co/koeRdpxVA1>

We hope you enjoy our new issue which focuses on the work done in the area of ontologies and ecosystem modelling.



PERICLES<http://pericles-project.eu/main> is a four-year Project (2013-2017) funded by the European Union which aims to address the challenge of ensuring that digital content remains accessible in an environment that is subject to continual change.



Kind regards,



Linda Ligios

EU Communications Coordinator, PERICLES



Department of Digital Humanities

King's College London

Email: [log in to unmask]<mailto:[log in to unmask]>

http://pericles-project.eu/



------------------------------



Date:    Thu, 13 Aug 2015 14:53:59 +0000

From:    Andrew MacLellan <[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Thanks to Rachael and Lucy, that’s helpful for me. It makes sense that the ability to cite data unambiguously should be prioritised.



One small follow-on query though: would it be problematic if a researcher routinely created separate datasets with separate DOIs in this way, and therefore appeared to have deposited twice as many distinct datasets as they actually have? I can imagine this causing headaches for universities trying to measure and reward data sharing. Is there an easy work-around for this?



Thanks,

Andrew



Andrew Maclellan

Research Data Support Officer | Research Data Management and Sharing

Research and Knowledge Exchange Services

University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE

Tel: 0141 548 4581

Email: [log in to unmask]<mailto:[log in to unmask]>





From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of datasets

Sent: 12 August 2015 17:18

To: [log in to unmask]

Subject: Re: Anonymised and non-anonymised datasets



Nicola,



I would also recommend the UK Data Service approach here. There is no problem with having two datasets that are separately cite-able with separate DOIs even if there is a large amount of overlap – the small area without overlap can create a large difference in analysis of the two sets of data.



But if this wasn’t technically possible in your system, and you were only able to assign one DOI for some reason, I think that the DOI and so the metadata you provide would ideally describe the full dataset that also includes the sensitive data. I say that because I would see the more freely available anonymised data as a sub-set of the full dataset - the full dataset being the available data plus the identifying information. It would then be for citing authors to highlight the subset of the data they actually used (whether they would or not in reality is the reason having two DOIs would be a better approach). It would be trickier for a citing author who used the wider set if it was the other way around.



I caveat the last paragraph by stating that those are my views, not official DataCite guidance!



Thanks, Rachael.





Rachael Kotarski

Data Services and Content Lead

The British Library, 96 Euston Road, London NW1 2DB



Tel: 020 7412 7167 | Email: [log in to unmask]<mailto:[log in to unmask]>



|  Datasets@BL<http://www.bl.uk/datasets>  |  DataCite<http://www.datacite.org/>  |  Twitter<http://twitter.com/DataCiteUK>  |







From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Johnson, Lucy A

Sent: 12 August 2015 16:57

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Hi Nicola



Hope I can help with this one.



Here in the UK Data Service we do just that: we have two DOIs if the dataset has been changed in some way. Our thinking is that dataset A, which contains the open access content, is different from dataset B, which contains additional, sensitive material. If a researcher wanted to trace back the data that had been cited in a paper somewhere, they would want to know which of these two datasets it came from. Hence the need for two DOIs.



Here is an example of this in action:



Quarterly Labour Force Survey, January – March 2015 (http://discover.ukdataservice.ac.uk/catalogue/?sn=7725),  DOI = 10.5255/UKDA-SN-7725-1

Quarterly Labour Force Survey, January – March 2015: Special Licence Access (http://discover.ukdataservice.ac.uk/catalogue/?sn=7726), DOI = 10.5255/UKDA-SN-7726-1



The latter contains extra variables and hence is subject to more restrictive access conditions.  There are other examples of this in our catalogue, moving along the spectrum of access, into secure/controlled as well.



Hope that helps,



Lucy



___________________________________

Lucy Johnson

Functional Director, Data Access

___________________________________

T +44(0) 1206 872008

E [log in to unmask]<mailto:[log in to unmask]>

W ukdataservice.ac.uk

___________________________________

UK Data Service

UK Data Archive

University of Essex

___________________________________

Legal Disclaimer: Any views expressed by the sender of this message

are not necessarily those of the UK Data Service or the UK Data Archive.

This email and any files with it are confidential and intended solely for

the use of the individual(s) or entity to whom they are addressed.







From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson

Sent: 12 August 2015 16:21

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Thanks for the responses – Kate, I particularly liked the way you’ve set out your dataset information, it’s really clear and easy to use.



Does anyone out there have any thoughts on, or experience of, creating more than one DOI for a dataset, just in case this might be a better way forward (although I currently think option B is the way to go!)?



Regards

Nicola



From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Katherine McNeill

Sent: 11 August 2015 18:26

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Nicola,



I can share an example of model B in action for you.  It might be the same with other repositories, but model B that you described is the one used by the ICPSR social science data archive (essentially the UK Data Service of the U.S.).  For those studies that have restricted sets of data, there’s a note to that effect and instructions for requesting access.  E.g., this study http://doi.org/10.3886/ICPSR34314.v3 has a note near the top entitled Access Notes.



Sincerely,

Kate McNeill

___________________________________

Katherine McNeill<http://libguides.mit.edu/profiles/mcneillh>

Program Head, Data Management Services

Massachusetts Institute of Technology

[log in to unmask]<mailto:[log in to unmask]> | 617-253-0787

Data Management Services<http://libraries.mit.edu/data-management>



From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Andrew MacLellan

Sent: Tuesday, August 11, 2015 12:09 PM

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Hi Nicola,



Assuming the participants had given consent for personal data to be shared under non-disclosure agreements only, and that there is some kind of significant value to the personal data, I would go with option B. It depends a bit on the dataset, but I think typically, an anonymised dataset is sufficient for most purposes.



If this is a situation where there is clear value in being able to identify the participants or other people discussed in the interviews, and it’s likely that there will be requests to access the personal data, then I suppose it might make sense to go for option A. I’m not a DOI expert, though, so perhaps someone else on the list would have something to say about creating separate DOIs for such similar datasets.



I don’t fully understand option C so won’t comment on that.



Hope that helps,

Andrew



Andrew Maclellan

Research Data Support Officer | Research Data Management and Sharing

Research and Knowledge Exchange Services

University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE

Tel: 0141 548 4581

Email: [log in to unmask]<mailto:[log in to unmask]>



From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson

Sent: 11 August 2015 16:26

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Anonymised and non-anonymised datasets



Dear All

We have just been speaking to a researcher who wants to publish a dataset which has a number of different file types, including some interview transcripts. He has two versions of the dataset:

1. Contains personal data within the interview transcripts. This version of the dataset could be shared subject to a contractual non-disclosure agreement.

2. Contains all the same data, but the interview transcripts have been anonymised. This version of the dataset could be shared under a Creative Commons licence.



We are currently considering the following options:



a)      Create two versions of the “dataset description” with two separate DOIs – one with open access, the other requiring contractual terms to be discussed to allow release



b)      Make public only the version of the dataset with the anonymised data, with a note in the description that external researchers should contact the University separately to request access to the version containing personal data and deal with it manually



c)      Come up with some kind of technical solution/change to our system to allow us to give two options to the requestor (and try to find some clever technical way of linking to the different files). However, this might be quite a lot of work for something that might not happen regularly



I wondered whether anyone else had come across this issue and had a good solution for how to manage it?



Many thanks

Nicola



Nicola Dawson

Business Change Manager

Research Data and Information Management

University IT Services

Cardiff University

39 Park Place

Cardiff

CF10 3BB

Tel: +44(0)29 2087 5891

Email: [log in to unmask]<mailto:[log in to unmask]>



Nicola Dawson

Rheolwr Newid Busnes

Rheoli Data a Gwybodaeth Ymchwil

Gwasanaethau TG y Brifysgol

Prifysgol Caerdydd

39 Plas y Parc

Caerdydd

CF10 3BB

Ffôn : +44(0)29 2087 5891

Ebost: [log in to unmask]<mailto:[log in to unmask]>











******************************************************************************************************************

Experience the British Library online at www.bl.uk<http://www.bl.uk/>

The British Library’s latest Annual Report and Accounts : www.bl.uk/aboutus/annrep/index.html<http://www.bl.uk/aboutus/annrep/index.html>

Help the British Library conserve the world's knowledge. Adopt a Book. www.bl.uk/adoptabook<http://www.bl.uk/adoptabook>

The Library's St Pancras site is WiFi - enabled

*****************************************************************************************************************

The information contained in this e-mail is confidential and may be legally privileged. It is intended for the addressee(s) only. If you are not the intended recipient, please delete this e-mail and notify the [log in to unmask]<mailto:[log in to unmask]> : The contents of this e-mail must not be disclosed or copied without the sender's consent.

The statements and opinions expressed in this message are those of the author and do not necessarily reflect those of the British Library. The British Library does not take any responsibility for the views of the author.

*****************************************************************************************************************

Think before you print



------------------------------



Date:    Thu, 13 Aug 2015 15:08:04 +0000

From:    "Johnson, Lucy A" <[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Hi Andrew



I suppose you could argue that perhaps the two datasets should be counted twice, if one has had more value added to it (rather than one having had variables removed)?  But here at the UK Data Service, we do link these sorts of datasets together via their metadata so that any connections are transparent.  The metadata for one will reference the other.
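
The counting concern raised in this thread can be illustrated with a minimal sketch. It assumes each metadata record carries a list of DOIs it references; the `related` field name, the `count_distinct_deposits` helper, and the third DOI are hypothetical and are not taken from any repository's actual schema or from the UK Data Service's systems. Records that reference one another are grouped, and each group is counted once.

```python
from collections import defaultdict

# Hypothetical, simplified metadata records. The first two DOIs are the
# Labour Force Survey examples quoted in this thread; the "related" field
# is an assumed stand-in for a repository's cross-reference field.
records = [
    {"doi": "10.5255/UKDA-SN-7725-1", "related": ["10.5255/UKDA-SN-7726-1"]},
    {"doi": "10.5255/UKDA-SN-7726-1", "related": ["10.5255/UKDA-SN-7725-1"]},
    {"doi": "10.9999/EXAMPLE-1", "related": []},  # made-up, unrelated deposit
]


def count_distinct_deposits(records):
    """Count groups of records that are connected by cross-references."""
    # Build an undirected graph: DOI -> set of DOIs it is linked to.
    neighbours = defaultdict(set)
    for rec in records:
        neighbours[rec["doi"]]  # make sure unlinked records are included
        for other in rec["related"]:
            neighbours[rec["doi"]].add(other)
            neighbours[other].add(rec["doi"])

    seen = set()
    groups = 0
    for doi in list(neighbours):
        if doi in seen:
            continue
        groups += 1
        # Depth-first walk marks every record linked to this one as counted.
        stack = [doi]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(neighbours[current] - seen)
    return groups


print(count_distinct_deposits(records))
```

Under these assumptions the three records form two groups, so an institution reporting on data sharing could choose to report either the raw DOI count or the grouped count.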



A related initiative that you might find interesting is IRUS.  Have you seen this?  There is more information here: http://www.irus.mimas.ac.uk/.  It is a metrics service which counts downloaded content, including datasets, from participating UK institutional repositories.  The caveat is that these two datasets would probably still be counted separately here!



Best



Lucy




------------------------------



Date:    Thu, 13 Aug 2015 15:26:26 +0000

From:    James Davenport <[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Is this more of a problem than the researcher who publishes an "extended abstract" and then the journal paper, or the researcher who publishes the same paper in multiple languages?





James Davenport

National Teaching Fellow 2014

Hebron & Medlock Professor of Information Technology, University of Bath

OpenMath Content Dictionary Editor

Director of Studies EPSRC Doctoral Taught Course Centre for HPC

Chair, IMU Committee on Electronic Information and Communication

Vice-President and Academy Trustee, British Computer Society

________________________________

From: Research Data Management discussion list <[log in to unmask]> on behalf of Andrew MacLellan <[log in to unmask]>

Sent: Thursday, August 13, 2015 3:53 PM

To: [log in to unmask]

Subject: Re: Anonymised and non-anonymised datasets



Thanks to Rachael and Lucy, that’s helpful for me. It makes sense that the ability to cite data unambiguously should be prioritised.



One small follow on query though: would it be problematic if this method of creating separate datasets with separate DOI’s was routinely carried out by a researcher, and then that researcher would appear to have deposited twice as many distinct datasets as they actually have? I can imagine this causing headaches for Universities trying to measure and reward data sharing. Is there an easy work-around for this?



Thanks,

Andrew



Andrew Maclellan

Research Data Support Officer | Research Data Management and Sharing

Research and Knowledge Exchange Services

University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE

Tel: 0141 548 4581

Email: [log in to unmask]<mailto:[log in to unmask]>





From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of datasets

Sent: 12 August 2015 17:18

To: [log in to unmask]

Subject: Re: Anonymised and non-anonymised datasets



Nicola,



I would also recommend the UK Data Service approach here. There is no problem with having two datasets that are separately citable with separate DOIs even if there is a large amount of overlap – the small area without overlap can create a large difference in analysis of the two sets of data.



But if this wasn’t technically possible in your system, and you were only able to assign one DOI for some reason, I think that the DOI, and so the metadata you provide, would ideally describe the full dataset, including the sensitive data. I say that because I would see the more freely available anonymised data as a subset of the full dataset, the full dataset being the available data plus the identifying information. It would then be for citing authors to highlight the subset of the data they actually used (the uncertainty over whether they would do so in practice is the reason two DOIs would be a better approach). It would be trickier for a citing author who used the wider set if it was the other way around.



I would caveat the last paragraph by noting that those are my views, not official DataCite guidance!



Thanks, Rachael.





Rachael Kotarski

Data Services and Content Lead

The British Library, 96 Euston Road, London NW1 2DB



Tel: 020 7412 7167 | Email: [log in to unmask]<mailto:[log in to unmask]>



|  Datasets@BL<http://www.bl.uk/datasets>  |  DataCite<http://www.datacite.org/>  |  Twitter<http://twitter.com/DataCiteUK>  |







From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Johnson, Lucy A

Sent: 12 August 2015 16:57

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Hi Nicola



Hope I can help with this one.



Here in the UK Data Service we do just that – have two DOIs if the dataset has been changed in some way.  Our thinking is that dataset a, which contains the open access content, is different from dataset b, which contains additional, sensitive material.  If a researcher wanted to trace back the data that had been cited in a paper somewhere, they would want to know which of these two datasets it came from.  Hence the need for two DOIs.



Here is an example of this in action:



Quarterly Labour Force Survey, January – March 2015 (http://discover.ukdataservice.ac.uk/catalogue/?sn=7725),  DOI = 10.5255/UKDA-SN-7725-1

Quarterly Labour Force Survey, January – March 2015: Special Licence Access (http://discover.ukdataservice.ac.uk/catalogue/?sn=7726), DOI = 10.5255/UKDA-SN-7726-1



The latter contains extra variables and hence is subject to more restrictive access conditions.  There are other examples of this in our catalogue, moving along the spectrum of access, into secure/controlled as well.
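
As a rough illustration of why the two DOIs help with tracing, the sketch below looks up a cited DOI and returns exactly one version of the data. The two DOIs and titles are the ones quoted above; the `DatasetRecord` class, the `access` labels and the `find_by_doi` helper are hypothetical and do not reflect how the UK Data Service catalogue is implemented. The only external fact relied on is that a DOI can be turned into a link by prefixing it with https://doi.org/.

```python
from dataclasses import dataclass
from typing import Optional

DOI_RESOLVER = "https://doi.org/"


@dataclass(frozen=True)
class DatasetRecord:
    title: str
    doi: str
    access: str  # illustrative label only, not an actual catalogue field

    def url(self) -> str:
        """Return the resolver link for this record's DOI."""
        return DOI_RESOLVER + self.doi


# The two records from the example above, one per DOI.
CATALOGUE = [
    DatasetRecord(
        "Quarterly Labour Force Survey, January – March 2015",
        "10.5255/UKDA-SN-7725-1",
        "standard",
    ),
    DatasetRecord(
        "Quarterly Labour Force Survey, January – March 2015: Special Licence Access",
        "10.5255/UKDA-SN-7726-1",
        "special licence",
    ),
]


def find_by_doi(doi: str) -> Optional[DatasetRecord]:
    """Return the single record a cited DOI refers to, or None."""
    for record in CATALOGUE:
        if record.doi == doi:
            return record
    return None


cited = find_by_doi("10.5255/UKDA-SN-7726-1")
if cited is not None:
    print(cited.title, cited.access, cited.url())
```

With a single shared DOI, the same lookup could not tell a reader whether the cited analysis used the extra variables or not.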



Hope that helps,



Lucy



___________________________________

Lucy Johnson

Functional Director, Data Access

___________________________________

T +44(0) 1206 872008

E [log in to unmask]<mailto:[log in to unmask]>

W ukdataservice.ac.uk

___________________________________

UK Data Service

UK Data Archive

University of Essex

___________________________________

Legal Disclaimer: Any views expressed by the sender of this message

are not necessarily those of the UK Data Service or the UK Data Archive.

This email and any files with it are confidential and intended solely for

the use of the individual(s) or entity to whom they are addressed.







From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson

Sent: 12 August 2015 16:21

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Thanks for the responses – Kate, I particularly liked the way you’ve set out your dataset information, it’s really clear and easy to use.



Does anyone out there have any thoughts on or experience of creating more than one DOI for a dataset, just in case this might be a better way forward (although I currently think option B is the way to go!)?



Regards

Nicola



From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Katherine McNeill

Sent: 11 August 2015 18:26

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Nicola,



I can share an example of model B in action for you.  It might be the same with other repositories, but model B that you described is the one used by the ICPSR social science data archive (essentially the UK Data Service of the U.S.).  For those studies that have restricted sets of data, there’s a note to that effect and instructions for requesting access.  E.g., this study http://doi.org/10.3886/ICPSR34314.v3 has a note near the top entitled Access Notes.



Sincerely,

Kate McNeill

___________________________________

Katherine McNeill<http://libguides.mit.edu/profiles/mcneillh>

Program Head, Data Management Services

Massachusetts Institute of Technology

[log in to unmask]<mailto:[log in to unmask]> | 617-253-0787

Data Management Services<http://libraries.mit.edu/data-management>



From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Andrew MacLellan

Sent: Tuesday, August 11, 2015 12:09 PM

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Re: Anonymised and non-anonymised datasets



Hi Nicola,



Assuming the participants had given consent for personal data to be shared under non-disclosure agreements only, and that there is some kind of significant value to the personal data, I would go with option B. It depends a bit on the dataset, but I think typically, an anonymised dataset is sufficient for most purposes.



If this is a situation where there is clear value in being able to identify the participants or other people discussed in the interviews, and it’s likely that there will be requests to access the personal data, then I suppose it might make sense to go for option A. I’m not a DOI expert, though, so perhaps someone else on the list would have something to say about creating separate DOIs for such similar datasets.



I don’t fully understand option C so won’t comment on that.



Hope that helps,

Andrew



Andrew Maclellan

Research Data Support Officer | Research Data Management and Sharing

Research and Knowledge Exchange Services

University of Strathclyde, Graham Hills Building, 50 George Street, Glasgow, G1 1QE

Tel: 0141 548 4581

Email: [log in to unmask]<mailto:[log in to unmask]>



From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Nicola Dawson

Sent: 11 August 2015 16:26

To: [log in to unmask]<mailto:[log in to unmask]>

Subject: Anonymised and non-anonymised datasets



Dear All

We have just been speaking to a researcher who wants to publish a dataset which has a number of different file-types including some interview transcripts.  He has two versions of the dataset -



1)      contains personal data within the interview transcripts – this version of the dataset could be shared subject to a contractual non-disclosure agreement



2)      contains all the same data, but the interview transcripts have been anonymised – this version of the dataset could be shared under a Creative Commons licence



We are currently considering the following options:



a)      Create two versions of the “dataset description” with two separate DOIs – one with open access, the other requiring contractual terms to be discussed to allow release



b)      Make public only the version of the dataset with the anonymised data, with a note in the description that external researchers should contact the University separately to request access to the version containing personal data; we would then deal with such requests manually



c)       Come up with some kind of technical solution/change to our system to allow us to give two options to the requestor (and try to find some clever technical way of linking to the different files); however, this might be quite a lot of work for something that might not happen regularly



I wondered whether anyone else had come across this issue and had a good solution for how to manage it?



Many thanks

Nicola



Nicola Dawson

Business Change Manager

Research Data and Information Management

University IT Services

Cardiff University

39 Park Place

Cardiff

CF10 3BB

Tel: +44(0)29 2087 5891

Email: [log in to unmask]<mailto:[log in to unmask]>



Nicola Dawson

Rheolwr Newid Busnes

Rheoli Data a Gwybodaeth Ymchwil

Gwasanaethau TG y Brifysgol

Prifysgol Caerdydd

39 Plas y Parc

Caerdydd

CF10 3BB

Ffôn : +44(0)29 2087 5891

Ebost: [log in to unmask]<mailto:[log in to unmask]>














------------------------------



End of RESEARCH-DATAMAN Digest - 12 Aug 2015 to 13 Aug 2015 (#2015-28)

**********************************************************************