I am a novice in stats. I am trying to figure out the best way to design
an experiment.
In simplified terms, I have a few categories of things. Suppose I have
square-shaped things and triangle-shaped things. However, I don't want
to make any prior assumption that I know anything descriptive about my
categories. Rather, I want to ask experimental subjects to attempt to
describe my categories. Hence, the experiment is exploratory.
I am willing to limit descriptions to single words.
It occurs to me that I could use the binomial distribution. For example,
if 4 out of 10 participants describe a category as "triangular" then I
could characterize the probability. However, I have a doubt about this
procedure.
There are at least 30000 English words. If every word were equally
likely, the probability of any single word occurring more than once
would be very small for a sample size less than 50. Is it sensible to
report such results (20 in 50 tested against a binomial with
p=1/30000)? Or is this too obvious?
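To make the size of that probability concrete, here is a minimal sketch of the binomial tail calculation. It assumes a vocabulary of exactly 30000 words, each equally likely to be chosen, which is only the simplification used above and not a claim about how people actually pick words. The names `binomial_tail`, `vocab_size`, and `p_uniform` are illustrative.

```python
from math import comb

def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption: 30000 equally likely words (the figure from the question).
vocab_size = 30000
p_uniform = 1 / vocab_size

# Chance that one given word is used by at least 20 of 50 participants.
tail_20_of_50 = binomial_tail(20, 50, p_uniform)

# Chance that one given word is used by at least 2 of 50 participants.
tail_2_of_50 = binomial_tail(2, 50, p_uniform)

print(f"P(X >= 20 | n=50) = {tail_20_of_50:.3e}")
print(f"P(X >= 2  | n=50) = {tail_2_of_50:.3e}")
```

Under this uniform assumption even two matching answers would be very unlikely, so the test mostly shows that word choice is not uniform.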
Since the experiment is exploratory, is it enough to report only the
frequency of each descriptive word? For example, with N=10:
triangular (4), shapes (3), solid-color (2), etc.
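A frequency report like this could be tabulated as in the sketch below. The response list is made up to match the example counts; the tenth answer ("other") is a placeholder, since the example does not say what it was. Lowercasing and stripping whitespace before counting is an assumption about how answers would be cleaned.

```python
from collections import Counter

# Hypothetical responses for one category, matching the example counts.
# "other" is a placeholder for the unspecified tenth answer.
responses = (
    ["triangular"] * 4
    + ["shapes"] * 3
    + ["solid-color"] * 2
    + ["other"]
)

# Normalize case and whitespace so "Triangular " and "triangular" match.
cleaned = [r.strip().lower() for r in responses]

counts = Counter(cleaned)
n = len(cleaned)

# One row per word: (word, count, proportion), most frequent first.
table = [(word, c, c / n) for word, c in counts.most_common()]

print(f"N={n}")
for word, c, prop in table:
    print(f"{word:12s} {c:2d}  {prop:.2f}")
```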
But then how do I decide on a reasonable sample size?
Am I even looking at this in a sensible way?
Can anybody suggest relevant references?