hello all-

my name is peter newman and i am currently a doctoral student in the
institute of communications research at the university of illinois.
i am working on my dissertation and wanted the advice of the group on
what software package might be "best" for the type of analysis that i wish
to engage in.

i have become very interested in automated text analysis over the past
six months. i have one year's worth of newsgroup data that i wish to
analyze.

based on my limited understanding of this type of analysis, i
believe i want to do the following:
1) take 12 months' worth of newsgroup data (from a particular hobbyists'
group) and divide it into one-month blocks of text data
2) take each month of data and run it through an automatic indexing and
word co-occurrence program. of course i need to remove words like "if,
that, and..."; plus build a thesaurus for words that should be "lumped"
together and treated the same, for example: "metal halide" = "halide" =
"halides" = "MH", etc.
3) map word co-occurrences in multidimensional space based on what
emerges from the text and on researcher-defined criteria (including or
excluding certain terms)
4) compare the maps from month to month to see how usenet discussion
changes across time with new innovations in the hobby and new product
introductions.
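
to make the steps above concrete, here is a rough python sketch of what i
have in mind for steps 1, 2 and 4. it is only an illustration under my own
assumptions: the stop word list, the thesaurus entries, the (month, text)
input format and all function names are made up for this example, and
co-occurrence is counted simply as two terms appearing in the same message.
the mapping in step 3 (e.g. multidimensional scaling) is left out.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop word list; a real one would be much longer.
STOPWORDS = {"if", "that", "and", "the", "a", "is", "to", "of", "my", "i"}

# Hypothetical thesaurus: each variant is lumped into one canonical term.
THESAURUS = {"metal halide": "halide", "halides": "halide", "mh": "halide"}


def tokenize(text):
    """Lowercase, apply the thesaurus, split into words, drop stop words."""
    text = text.lower()
    # Replace longer variants first so multi-word phrases are handled whole.
    for variant in sorted(THESAURUS, key=len, reverse=True):
        pattern = r"\b" + re.escape(variant) + r"\b"
        text = re.sub(pattern, THESAURUS[variant], text)
    return [w for w in re.findall(r"[a-z0-9']+", text) if w not in STOPWORDS]


def cooccurrences(messages):
    """Count, for each unordered term pair, the messages containing both."""
    counts = Counter()
    for message in messages:
        terms = sorted(set(tokenize(message)))
        counts.update(combinations(terms, 2))
    return counts


def monthly_maps(posts):
    """Group (month, text) pairs by month; build one co-occurrence table each."""
    by_month = {}
    for month, text in posts:
        by_month.setdefault(month, []).append(text)
    return {m: cooccurrences(msgs) for m, msgs in sorted(by_month.items())}


def changed_pairs(before, after):
    """Return the pairs whose count differs between two months, with the change."""
    pairs = set(before) | set(after)
    return {p: after[p] - before[p] for p in pairs if after[p] != before[p]}


# Tiny made-up example, not real newsgroup data.
posts = [
    ("01", "My metal halide lamp is bright"),
    ("01", "MH lamp and ballast"),
    ("02", "new halides ballast"),
]
maps = monthly_maps(posts)
changes = changed_pairs(maps["01"], maps["02"])
```

the per-month tables in `maps` are what i would hand to a mapping program
for step 3, and `changes` is a crude stand-in for the month-to-month
comparison in step 4.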

it seems that several programs, including CATPAC, Word Stat, MCCA, and
Code-A-Text, might let me do some or all of these items.
Does anyone have experience with these programs? Could you share your thoughts
on their suitability for my proposed project?

thanks in advance for your time,
sincerely,
peter newman




