Peter,
I am aware of several projects which are using Code-A-Text for this form of
analysis and on the surface it would appear that the programme would be
appropriate. Be aware, though, that you can generate huge amounts of data if
you pursue these analyses in an unfocussed fashion.
From the way you describe the project you would create a "text" from the
postings for each month in which each posting would constitute a "speech
unit". Code-A-Text will add the required designators which convert the text
into the dialogue format it uses internally.
It would certainly provide basic statistics for you (total number of
words, total number of unique words) in your monthly texts. You can group
words together, so for analytic purposes every occurrence of a word in the
group is treated as a single word. You can then get frequency counts for
word groups within the text. You could also code each posting yourself.
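To make the word-grouping idea concrete, here is a minimal sketch in plain Python (the group name and its members are invented for illustration; this is not Code-A-Text's own mechanism):

```python
from collections import Counter
import re

# Hypothetical word group: every member is counted under one label.
GROUPS = {
    "halide": "metal_halide",
    "halides": "metal_halide",
    "mh": "metal_halide",
}

def word_frequencies(text):
    """Tokenise a text, map grouped words to a single label,
    and return frequency counts for all words/groups."""
    words = re.findall(r"[a-z']+", text.lower())
    words = [GROUPS.get(w, w) for w in words]
    return Counter(words)

freqs = word_frequencies(
    "Halides and MH lamps: a halide is a metal halide lamp.")
# freqs["metal_halide"] == 4; total words == sum(freqs.values())
```

Frequency counts for a group are then simply the count of its shared label, and the "total unique words" statistic is the number of distinct keys.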
It would be important to decide upon your "unit of analysis". Are you
concerned with the text as a whole or the individual postings within the
text? Code-A-Text can manipulate texts to provide different units of
analysis, for instance speech units (postings), segments (paragraphs within
postings) or sentences. When calculating co-occurrences the unit of
analysis is critical. Obviously, if you work with sentences there will be
lower rates of co-occurrence than if you work with postings.
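As a rough illustration of how the unit of analysis changes co-occurrence rates (plain Python with an invented example posting, not Code-A-Text itself):

```python
import re

def cooccurrence_rate(units, w1, w2):
    """Fraction of units in which both words appear together."""
    hits = sum(1 for u in units if w1 in u and w2 in u)
    return hits / len(units)

posting = "The new lamp arrived. The ballast hums. A lamp needs a ballast."

# Sentence-level units: each sentence becomes a set of words.
sentences = [set(re.findall(r"[a-z]+", s.lower()))
             for s in re.split(r"[.!?]", posting) if s.strip()]
# Posting-level unit: the whole posting as one set of words.
whole_posting = [set(re.findall(r"[a-z]+", posting.lower()))]

rate_sentences = cooccurrence_rate(sentences, "lamp", "ballast")   # 1/3
rate_posting = cooccurrence_rate(whole_posting, "lamp", "ballast") # 1/1
```

Here "lamp" and "ballast" co-occur in only one of three sentences, but always co-occur at the posting level, which is the granularity effect described above.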
From your point of view the main limitation of Code-A-Text is that it will
not generate a complete matrix of word co-occurrences. You have to
specify the word, word group or other form of code in which you are
interested, and the programme will then calculate its co-occurrence with all
other words, word groups or codes in the text.
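In effect you get one row of the full matrix per query. A sketch of that row calculation in plain Python (the example postings are invented):

```python
from collections import Counter
import re

def cooccurrence_with(target, units):
    """Given one target word, count how often each other word shares
    a unit of analysis with it (one row of the full matrix)."""
    counts = Counter()
    for unit in units:
        words = set(re.findall(r"[a-z]+", unit.lower()))
        if target in words:
            counts.update(words - {target})
    return counts

postings = [
    "New lamp for the tank",
    "The lamp and the ballast failed",
    "Ballast wiring question",
]
row = cooccurrence_with("lamp", postings)
# row["ballast"] == 1; row["the"] == 2
```

A complete matrix would require repeating this query once per word of interest.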
A second limitation is that these analyses only work with one text at a
time. However, you can merge each of the monthly texts to form a "super
text" and analyse that.
Hope this helps
Regards
Alan Cartwright
At 07:51 09/11/98 -0600, you wrote:
>hello all-
>
>my name is peter newman and i am currently a doctoral student in the
>institute of communications research at the university of illinois.
>i am working on my dissertation and wanted the advice of the group on
>what software package might be "best" for the type of analysis that i wish
>to engage in.
>
>i have become very interested in automated text analysis over the past
>six months. i have one year's worth of newsgroup data that i wish to
>analyze.
>
>based on my limited understanding of this type of analysis, i
>believe i want to do the following:
>1) take 12 months worth of newsgroup data (from a particular hobbyists
>group) and divide it into one month blocks of text data
>2) take each month of data and run it through an automatic indexing and
>word co-occurrence program. of course i need to remove words like "if,
>that, and..."; plus build a thesaurus for words that should be "lumped"
>together and treated the same, for example: "metal halide" = "halide" =
>"halides" = "MH", etc.
>3) map word co-occurrences in multi-dimensional space based on what
>emerges from the text and on researcher defined criteria (including or
>excluding certain terms)
>4) compare the maps from month to month to see how usenet discussion
>changes across time with new innovations in the hobby and new product
>introductions.
>
>it seems that several programs including CATPAC, Word Stat, MCCA, and
>Code-A-Text might let me do some or all of these items?
>Does anyone have experience with these programs? Could you share your
>thoughts on their suitability for my proposed project?
>
>thanks in advance for your time,
>sincerely,
>peter newman
>
Alan Cartwright PhD
Code-A-Text Developer
Email [log in to unmask]
web page http://www.codeatext.u-net.com
Also
Senior Lecturer In Psychotherapy
Kent Institute of Medicine and Health Sciences.
University of Kent. UK.