Randall's method works for a small file, but not well for a large document
because of the size of the concordance created (even sorting alphabetically
does not improve things noticeably). I ran it on a one-page document fairly
quickly, but a 22-page document had produced nothing after 5 minutes, so I
aborted it.
One other problem with this is punctuation and capitalisation (I was using
a Gutenberg version of "Sons and Lovers" to test it, so this may not be a
problem with all texts). The following is an extract of what was produced:
block 1, 3
block. 3
blocks 1, 3
the 1, 2, 3
The 1, 2, 3
THE 1
then 2, 3
then, 3
Then, 1
there 1, 2
There 1
there, 1, 2
You could of course use this method as the first stage in creating a
comprehensive concordance file: you'd have everything included and could
then choose how you wanted to index each item, and standardise the entries,
before auto marking.
Automarking is case sensitive, so if you remove variants that differ only in
case, be careful that you don't end up with a false sense of completeness.
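
If you want to see which entries are variants of one another before deciding
what to keep, something like the following Python sketch might help. It is
only an illustration and has nothing to do with Word's own auto marking: the
function name group_variants is made up, the word list is copied from the
extract above, and it assumes that variants differ only in case or in
punctuation at the start or end of the word.

```python
import string
from collections import defaultdict


def group_variants(entries):
    """Group entries that differ only in case or leading/trailing punctuation.

    Hypothetical helper for illustration; not part of any Word feature.
    """
    groups = defaultdict(list)
    for entry in entries:
        # Normalise: drop punctuation at either end, then lower-case.
        key = entry.strip(string.punctuation).lower()
        if key:
            groups[key].append(entry)
    return dict(groups)


# Entries taken from the extract above (page numbers left out).
entries = ["block", "block.", "blocks", "the", "The", "THE",
           "then", "then,", "Then,", "there", "There", "there,"]

variants = group_variants(entries)
for key, forms in sorted(variants.items()):
    print(key, "->", " | ".join(forms))
```

Each line of output lists one normalised word followed by every form found
for it, so you can decide which forms to leave in the concordance file. Note
that "block" and "blocks" stay separate, because the sketch does not try to
handle plurals.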
Duncan
===================================================
Duncan Branley
Applications Officer, Information Services
Goldsmiths' College, University of London
New Cross, LONDON SE14 6NW
Tel: +44 (0)20 7919 7708 Fax: +44 (0)20 7919 7556
===================================================