On Sun, Aug 15, 2010 at 7:37 PM, Craig A. Berry <[log in to unmask]> wrote:
> The longer the text or segment of text, the lower the density (bigger number
> on my chart).  The differences between texts of similar length are very small
> compared to the differences between texts of differing length.  This makes
> (common) sense.  The longer you talk, the higher the probability that each
> additional word you utter will be a word you've used before.  The lexically
> innovative powers of a Spenser or a Shakespeare might stave off this
> inevitability just a bit longer than the rest of us could do, but the
> constraints of memory and comprehensibility win out as a text grows in
> length.

I just sorted the Shakespeare data by length, and it shows the same
pattern: with one outlier, the shortest works (V&A, Luc., Tmp., Mac.)
have the highest lexical density.

Your explanation makes sense as well. I would have said that the
reader has more energy to concentrate on a short poem (and can also go
back to reread without breaking the momentum of a story). But the
"Poems and Sonnets" (which Hart's statistics group together) don't
bear that out. "Venus" and "Lucrece," with lengths of 1,194 lines and
1,855 lines, respectively, have densities of 1.75 and 1.52. But the
"Poems and Sonnets" (length 2,981 lines) have a merely average density of
1.05.

The outlier mentioned above is "Comedy of Errors." It's the
second-shortest work (2,037 lines), but has a lexical density of only
1.16; still above the mean, but way under "Venus" and "Lucrece." This
makes me think there might be a generic element after all since,
according to Hart, the comedies tend to have smaller vocabularies. But
then you would expect Tmp. to have a small vocabulary as well -- which
it doesn't, relative to its length (1.27).
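
For anyone who wants to check the ordering, here is a rough sketch in
Python that sorts by length the four works whose lengths are quoted
above. It uses only the figures in this message. Tmp. and Mac. are left
out because I have not given their lengths here, and the variable names
are just illustrative.

```python
# Figures quoted in this message: (title, length in lines, lexical density).
# The Tempest and Macbeth are omitted because their lengths are not given here.
works = [
    ("Poems and Sonnets", 2981, 1.05),
    ("Venus and Adonis", 1194, 1.75),
    ("Comedy of Errors", 2037, 1.16),
    ("Lucrece", 1855, 1.52),
]

# Sort from shortest to longest.
by_length = sorted(works, key=lambda w: w[1])

for title, lines, density in by_length:
    print(f"{title:<20}{lines:>6}{density:>6.2f}")

# Check whether density drops at every step as length increases.
densities = [d for _, _, d in by_length]
falls_with_length = all(a > b for a, b in zip(densities, densities[1:]))
print("density falls as length rises:", falls_with_length)
```

Four data points are far too few to settle anything; the point is only
to make the sorting step easy to repeat with the full table.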

-- 
Dr. David Wilson-Okamura    http://virgil.org          [log in to unmask]
English Department              Virgil reception, discussion, documents, &c
East Carolina University        Sparsa et neglecta coegi. -- Claude Fauchet