Thank you to everyone who responded with suggestions for text mining tools. Quite a few people asked me to circulate the responses I received, so they are listed below (I have anonymised them).

In the end we decided to use VOSViewer. It is free, really simple to use, and it allowed us to import CSV files from Scopus and Web of Science. We were then able to export an Excel list of the most frequently used keywords from the articles, showing how many times each keyword appeared and how relevant it was. I would definitely recommend it for simple text mining of articles.

Thanks again to all who responded.

Sally

Responses:

Have you tried contacting Jon Brassey at TRIP? He is doing work around this topic and is very approachable.

I like Termine (http://www.nactem.ac.uk/software/termine/#form), which is very simple to use. I’ve only used it with batches of citations/abstracts though, and it has a 2MB file size limit, so I don’t know if it would be suitable for mining a lot of articles.

You can also make term lists on any other fields in EndNote (e.g. title or abstract) and then do frequency analysis of those fields/term lists, if that helps. Other than EndNote, the package I usually use for word frequency analysis is SimStat with its plugin WordStat.
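
As a rough illustration of the kind of word and phrase frequency analysis described above, here is a minimal Python sketch. It is not how EndNote, SimStat or WordStat work internally; the sample records, the short stop word list and the function names are all assumptions made up for the example.

```python
import re
from collections import Counter

# Small illustrative stop word list (an assumption; real lists are much longer).
STOPWORDS = {"the", "it", "a", "an", "of", "and", "in", "for", "to", "on", "with"}


def tokenize(text):
    """Lowercase the text and return its runs of letters as words."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(texts, n=1, stopwords=STOPWORDS):
    """Count n-word phrases across texts, skipping any phrase that contains a stop word."""
    counts = Counter()
    for text in texts:
        words = tokenize(text)
        for i in range(len(words) - n + 1):
            phrase = words[i:i + n]
            if not any(word in stopwords for word in phrase):
                counts[" ".join(phrase)] += 1
    return counts


# Hypothetical records standing in for an exported title or abstract field.
records = [
    "Text mining for systematic reviews",
    "A review of text mining tools",
    "Text mining of clinical documents",
]

words = term_frequencies(records, n=1)    # single words
phrases = term_frequencies(records, n=2)  # two-word phrases

print(words.most_common(5))
print(phrases.most_common(5))
```

In practice the records would come from the title or abstract column of a reference manager or database export rather than a hard-coded list. The tokeniser here is deliberately crude (it splits hyphenated terms and drops numbers), so a dedicated tool is likely to give cleaner phrase lists.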

We don’t do it ourselves, but I attended an Evidence Synthesis Network event where Prof. Sophia Ananiadou from the National Centre for Text Mining delivered a presentation; see http://www.nactem.ac.uk/people.php

I’ve used the VOSViewer software (http://www.vosviewer.com/) created by CWTS at Leiden (which works with either Web of Science or Scopus data, and could also be adapted to work with output from other databases, I think) for this kind of thing. Its main purpose is to create visualisations, but it also allows the export of terms to a text file at certain points in the process.

Clinithink does NLP and text mining of clinical documents. One of the things they do is take documents and show you which terms in them are SNOMED terms. So it won’t show you common words like ‘the’ or ‘it’, but it will show you all the concepts in the documents that are in SNOMED, with cross maps to ICD etc. You can sign up for free here: http://clinithink.com/clix-plus/

Sally Dalton BA Hons, MSc

Faculty Team Librarian/Research Support Officer (LUCID)

Health Sciences Library

University of Leeds

Level 7, Worsley Building, LS2 9JT

0113 34 36974

Twitter: @lulhealthteam

From: Sally Dalton
Sent: 30 April 2014 10:32
To: [log in to unmask]
Subject: tools for text mining

Hello,

We are currently working on a bibliometrics project where we need to find the common words and phrases used in a set of articles. We are hoping to do this using a text mining tool. Does anyone have experience of text mining a set of articles and, if so, what software did you use?

I know you can use EndNote to find out the subject headings used in a set of articles (using term lists), but we are also interested in the words and phrases used in the title and abstract.

Any advice would be greatly appreciated.

Thanks