Hi Mary, all,
We regularly record details of SI files in the LSHTM data repository. This was motivated by a desire to showcase and maintain an institutional record of researchers' data outputs, including those held and published elsewhere. At first I simply catalogued these resources and directed people to the third-party website. However, to address researchers' criticism that many of our metadata records were empty, I've started to add CC-licensed content where possible.
This is quite labour-intensive at the moment. I review each new publication in our repository for supplementary files and make a decision on whether it should be catalogued. This isn't particularly systematic, but covers factors such as:
1. Content type: Is it a survey, processing script, dataset, software, or other output?
2. Size/extent: Is there a substantial amount of data? There needs to be some cut-off limit for content. I'm not convinced we need to have a separate record for a summary table with fewer than 10 rows, for instance.
3. File type: Is it held in a reusable format (XLS, SPSS, CSV)? PDFs are catalogued, but only if they contain substantial data tables or other data.
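As a rough illustration only, the factors above could be sketched as a small triage function. This is not a description of our actual workflow: the function name `triage`, the extension set, and treating the 10-row example as a hard threshold are all assumptions made for the sketch, and anything it cannot classify is simply flagged for manual review.

```python
from pathlib import Path

# Assumed set of "reusable" formats, based on the examples above
# (XLS, SPSS, CSV); a real list would need to be agreed locally.
REUSABLE_EXTENSIONS = {".xls", ".xlsx", ".sav", ".csv"}

# Assumed cut-off: the 10-row figure is only the example given above.
MIN_ROWS = 10


def triage(filename, row_count=None, pdf_has_data_tables=False):
    """Return 'catalogue', 'skip', or 'review' for one supplementary file."""
    ext = Path(filename).suffix.lower()
    if ext == ".pdf":
        # PDFs qualify only if they contain substantial data tables or other data.
        return "catalogue" if pdf_has_data_tables else "skip"
    if ext in REUSABLE_EXTENSIONS:
        # Very small summary tables do not get a separate record.
        if row_count is not None and row_count < MIN_ROWS:
            return "skip"
        return "catalogue"
    # Scripts, software, surveys, and unknown formats need a human decision.
    return "review"
```

For example, `triage("S1_table.csv", row_count=250)` would suggest cataloguing the file, while `triage("analysis.R")` would leave the decision to a person.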
There are a few questions that I've been struggling with, however:
1. How should we catalogue these files? I'd prefer to describe the SI files as a distinct entity, but reviewing the paper and data takes a long time, and authors are often uncommunicative. Is it sufficient to reproduce the publication abstract or use a blanket "supplementary info for XX" statement?
2. Should we be applying preservation action or enhancing these files?
3. Should we assign a DOI to these files? I've used the publication DOI in most cases, but is this the best approach?
4. Can we assume that the SI licence is the same as the publication?
More generally, it'd be nice to automate the process of identifying and importing SI files relevant to publications.
Gareth
--
Gareth Knight
Research Data Manager,
Library & Archives Service
London School of Hygiene & Tropical Medicine
Keppel Street,
London WC1E 7HT
UK
(+44) 020 7927 2564
[log in to unmask]
http://www.lshtm.ac.uk/research/researchdataman/
-----Original Message-----
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Rzepa, Henry S
Sent: 18 April 2016 13:19
To: [log in to unmask]
Subject: Re: How do you handle supplementary info?
Yes, it’s a complex area. As chemists, around 1994 we set out on a project to define about 50 “media types” as part of what we called a chemical MIME content type. Quite a few of our choices still are in use but things have got far more complex since then. It might be worth taking a complete look at all the currently ratified MIME types for some help http://www.sitepoint.com/web-foundations/mime-types-complete-list/
Dave Martinsen has reviewed this more recently: D. P. Martinsen, Supplemental Journal Article Materials in ACS Symposium Series, Special Issues in Data Management, 2012, Chapter 3, pp 31-45, DOI: http://doi.org7r9 and that might contain some more recent pointers in the physical sciences area.
On 18/04/2016, 13:01, "Research Data Management discussion list on behalf of Mary Donaldson" <[log in to unmask] on behalf of [log in to unmask]> wrote:
>Hello,
>
>At Glasgow, we're starting to look at how we handle data that is included in supplementary information files. We're becoming increasingly aware of the broad range of file types that are being included in SI, beyond the usual PDFs and extra figures. Many of these file types contain representations of data rather than the data themselves, but some could be data.
>
>We're planning on having a discussion soon to develop some internal guidelines for when the SI files should go in our publications repository and when they merit a record in the data repository. Has anyone else already visited this territory? If so, we'd love to know what conclusions you came to. We will also be happy to share our ideas once we've given them some thought and testing.
>
>Best wishes,
>Mary
>
>RDM Service Coordinator,
>University of Glasgow.