Hi,
I’m not sure about the answer to your question, but I would point out that a lot of content can be obtained via LexisNexis:
https://www.lexisnexis.com/uk/legal/search/flap.do?flapID=newsandbusiness&random=0.19486340615746278
Also, The Guardian content is all openly available and they have a great API for getting it. Unfortunately, though, with The Guardian you don’t get to see exactly how the news was laid out in the print editions – for that, the workflow I know about is to go to the British Library.
HTH
Andy
http://www.geog.leeds.ac.uk/people/a.turner/index.html
From: Research Data Management discussion list [mailto:[log in to unmask]] On Behalf Of Jenny O'Neill
Sent: 16 March 2018 14:39
To: [log in to unmask]
Subject: Sharing of newspaper content as research data
Hi all,
I have a query I'm hoping you can help with. Any advice or suggestions would be welcome.
The query relates to the sharing of newspaper content as research data. The researcher has used newspaper articles from 5 different news sites on which to carry out their analysis. They would now like to share their data to support reproducibility and follow-on research. However, as they don't own the copyright of the original articles, we're wondering how, or if, they can share the data.
The data include the full text of the articles, the articles with words stemmed (e.g. charg = charger, charging etc.), together with some metadata, including URL, derived keywords and phrases, headline, source, date etc. Is it possible to share a portion of the data, for example the metadata and derived keywords? Can the headlines be shared? Is there a certain percentage of the text that can be shared without breaching copyright?
Some of the solutions we discussed include releasing only the URLs for the articles, but this causes issues as the URLs may not be stable or the articles may have been updated in the meantime. If a database such as ProQuest were used, then only researchers with institutional access to these databases could access the data. The researcher could of course contact the news organisations to request permission to share, but if even one says no then the resulting dataset is incomplete. This is also not necessarily scalable if the research were broadened to 100 news organisations.
Have a lovely weekend and Happy St. Patrick's Day to any of you who will be celebrating.
Kind regards,
Jenny
--
Jenny O'Neill
Data Manager, Research Services
UCD Library
Level 2, James Joyce Library
t: +353 (01) 716 7857 | e: [log in to unmask]