Another way, depending on the numbers involved, would be to award them the mean score of that group. A way to check reliability would then be to take those participants out and run it again. If there is a huge difference, you are stuck! I would go with a multilevel model.

Debbie

From: Research of postgraduate psychologists, on behalf of Jeremy Miles
Sent: 24 October 2013 19:24
Subject: Re: Unequal number of observations across experimental groups
 

On 24 October 2013 08:18, Cat Davies wrote:
Some data I’m working on contain unequal numbers of observations per participant. The data come from an open-ended writing task and we want to compare the number of times participants across 4 groups use different types of articles (a, the, etc). The writing samples are of differing lengths and so contain different numbers of articles.

What would be the best way of coming up with a comparable score for each type of article per participant and later per group? We could calculate the percentage of, say, ‘the’ use out of the total number of articles produced, but that feels unsatisfactory as the percentage score would be more accurate for those participants who produced longer writing samples.


Yes, you're right, it would affect your reliability.  Most people don't realize that.
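
To put a rough number on that: if each article is treated as an independent draw (a simplifying assumption, not something established for this dataset), the standard error of an observed proportion p from n articles is sqrt(p(1-p)/n). A minimal Python sketch, with made-up sample sizes:

```python
import math

def proportion_se(p, n):
    """Binomial standard error of a proportion p observed over n articles."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical writers who both use 'the' for 60% of their articles.
short_sample = proportion_se(0.6, 10)    # 10 articles in total
long_sample = proportion_se(0.6, 160)    # 160 articles in total

print(round(short_sample, 3), round(long_sample, 3))
```

Under that assumption, the same 60% is estimated four times more precisely from the 160-article sample than from the 10-article one, since precision scales with the square root of n.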

 
Then I suppose this would have implications for the statistical test employed.



I might need to understand your data better, but here are a few thoughts.

If you can estimate the reliability of each participant's score, you can use some form of weighted least squares regression, where a weight variable rates the "importance" of each row in the dataset. The more important (more reliable) rows are weighted higher (counted more) than the less important rows.
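
As a minimal sketch of the weighting idea in Python: for a single group with no other predictors, weighted least squares reduces to a weighted mean. Here each participant's proportion of 'the' is weighted by the number of articles they produced, which assumes that the precision of a proportion grows with the length of the sample. The figures are invented for illustration.

```python
def weighted_mean(values, weights):
    """Intercept-only weighted least squares estimate: a weighted mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical group: (count of 'the', total articles) per participant.
participants = [(2, 4), (30, 40), (45, 60)]

proportions = [k / n for k, n in participants]
n_articles = [n for _, n in participants]

unweighted = sum(proportions) / len(proportions)
weighted = weighted_mean(proportions, n_articles)

print(round(unweighted, 3), round(weighted, 3))
```

With these weights the estimate equals the pooled proportion (77 of 104 articles), so the participant with only 4 articles pulls the group score far less than in the unweighted average.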

Another possibility is to use an offset in a Poisson regression. If your variables are counts, Poisson regression is often appropriate. An offset adds in a predictor variable (usually the log of the exposure, such as the total number of articles produced), but fixes its coefficient to be 1. So it might say "Cat hit the target 4 times, and Jeremy hit the target 5 times, but Cat had 5 shots and Jeremy had 20". It takes into account the 5 and 20.
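
With counts and an exposure, a Poisson model with a log link uses log(exposure) as the offset. When group membership is the only predictor, the maximum-likelihood rate for each group has a closed form: total count divided by total exposure. A minimal Python sketch using the shots example above (the `rates_by_group` helper is hypothetical; with further predictors you would fit the model in a statistics package instead):

```python
import math
from collections import defaultdict

def rates_by_group(rows):
    """Poisson MLE of the rate per group when log(exposure) is the offset
    and group is the only predictor: sum of counts / sum of exposures."""
    counts = defaultdict(int)
    exposure = defaultdict(int)
    for group, count, shots in rows:
        counts[group] += count
        exposure[group] += shots
    return {g: counts[g] / exposure[g] for g in counts}

# (person, hits, shots), the figures from the example above.
rows = [("Cat", 4, 5), ("Jeremy", 5, 20)]
rates = rates_by_group(rows)

# On the log scale, the model's coefficient is the log of the rate ratio.
log_rate_ratio = math.log(rates["Cat"] / rates["Jeremy"])
print(rates, round(log_rate_ratio, 3))
```

Jeremy has more hits in raw terms, but once the number of shots is taken into account Cat's hit rate is 0.8 against Jeremy's 0.25.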

Third (if I've understood correctly) you could use a multilevel model.  A regular multilevel model is used when you have (say) kids in classrooms, and you want to know how a characteristic of the teacher is predictive of the outcomes at the kid level. You've got different numbers of kids in each classroom though. Same deal here, but instead of kids in classrooms, you've got tasks in people.   

(It might be that you need a combination of Poisson regression with an offset AND a multilevel model, in which case you might consider (a) crying, or (b) finding someone knowledgeable to help you out.)

Also, in response to Takao's later comment, RM ANOVA is equivalent to a multilevel model when you have the same measurements from everyone, but when one person misses one measure, RM ANOVA will throw them out. A multilevel model won't.
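
The difference in how missing data are handled can be sketched in Python without fitting anything. Assuming a hypothetical wide table with one row per participant and one column per measure, a complete-case analysis (what RM ANOVA does) drops any participant with a gap, while the long format used for multilevel models keeps every observation that exists.

```python
# Hypothetical wide data: participant -> scores on three measures (None = missing).
wide = {
    "p1": [3, 5, 4],
    "p2": [2, None, 6],
    "p3": [4, 4, 5],
}

# Complete-case (listwise) deletion: any participant with a gap is dropped.
complete_cases = {p: s for p, s in wide.items() if None not in s}

# Long format: one row per observed score, so partial data are retained.
long_rows = [
    (p, measure, score)
    for p, scores in wide.items()
    for measure, score in enumerate(scores)
    if score is not None
]

print(len(complete_cases), "participants kept;", len(long_rows), "observations kept")
```

In this made-up example the complete-case route loses p2 entirely, whereas the long format still carries p2's two observed scores into the model.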

Jeremy