C-I-SAID version 3, which is now available, offers a variety of methods for
generating these types of word counts, and also concordances that indicate
how frequently words co-occur in a document. Documents have to be input
as RTF or pasted via the clipboard.
The program generates its output as an interactive grid in which clicking
on a cell brings up the text (and the position) of the paragraphs
containing the word or the combination of words in that cell.
You can generate exclusion lists, which remove words from the grid. You can
also search documents for those paragraphs containing given words and then
generate grids which describe that subset of the documents.
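
For anyone who wants to try the idea outside C-I-SAID, here is a minimal
Python sketch of a word-to-paragraph index with an exclusion list. It is
not how C-I-SAID works internally; the function name paragraph_index, the
sample text and the choice to fold case and strip punctuation are all my
own assumptions for illustration.

```python
import re
from collections import defaultdict


def paragraph_index(text, exclude=()):
    """Map each word to the sorted 1-based numbers of the paragraphs containing it.

    Hypothetical helper: paragraphs are assumed to be separated by blank lines,
    and words are lower-cased with punctuation dropped before counting.
    """
    excluded = {word.lower() for word in exclude}
    index = defaultdict(set)
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    for number, paragraph in enumerate(paragraphs, start=1):
        # Keep runs of letters, allowing one internal apostrophe (e.g. "don't").
        for word in re.findall(r"[a-z]+(?:'[a-z]+)?", paragraph.lower()):
            if word not in excluded:
                index[word].add(number)
    return {word: sorted(numbers) for word, numbers in index.items()}


sample = (
    "The block fell.\n\n"
    "Then the blocks moved.\n\n"
    "There, the block. Then, nothing."
)
index = paragraph_index(sample, exclude=["the"])
print(index["block"])  # paragraphs containing "block"
print(index["then"])   # "Then" and "Then," are counted together
```

Folding case and punctuation merges variants such as "then", "then," and
"Then," into one entry. If the result is going to feed a case-sensitive
step, you would need to keep the original forms as well.
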
C-I-SAID version 3 is not free, but it is currently on special offer; see
the details on my web site.
Regards
Alan Cartwright
At 12:19 08/08/01 +0100, you wrote:
>Randall's method works for a small file, but not well for a large document
>because of the size of the concordance created (even sorting alphabetically
>does not improve things noticeably). I did it with a one page doc pretty
>quickly, but a 22 page doc had produced nothing after 5 minutes so I
>aborted it.
>
>One other problem with this is punctuation and capitalisation (I was using
>a Gutenberg version of "Sons and Lovers" to test it so this may not be a
>problem with all texts) - the following is an extract of what was produced:
>
>block 1, 3
>block. 3
>blocks 1, 3
>
>the 1, 2, 3
>The 1, 2, 3
>THE 1
>then 2, 3
>then, 3
>Then, 1
>there 1, 2
>There 1
>there, 1, 2
>
>You could of course use this method as the first stage in creating a
>comprehensive concordance file - you'd have everything included and could
>then choose how you wanted to index each item before auto marking - and
>standardise.
>
>Automarking is case sensitive, so be careful that you don't get a false
>sense of completeness if you remove variants in respect of case.
>
>Duncan
>
>===================================================
>Duncan Branley
> Applications Officer, Information Services
> Goldsmiths' College, University of London
> New Cross, LONDON SE14 6NW
>Tel: +44 (0)20 7919 7708 Fax: +44 (0)20 7919 7556
>===================================================
Alan Cartwright PhD
Consultant in Interpersonal Research and Training.
Developer Code-A-Text MultiMedia Products
Hon. Senior Lecturer Kent Institute of Medicine and Health Studies.
C-I-SAID: Powerful Multi-Media Software for Analysing Interviews and Dialogues.
CTANKS: Word processing, Recording, Transcription, Searching and Report
Generation in a single user friendly package.
Information at
Code-A-Text Web Page <http://www.code-a-text.co.uk>