Hi,
This may not solve your specific stats problem, but I was recently
questioned by reviewers about the p-values I had used to make
inferences in a submitted paper. So, alongside the p-values, we
calculated effect sizes (we used Cohen's d - your examiner mentions
effect sizes, and they are really simple once you get your head around
them; you can do them by hand!) and used these as a guide to where we
were seeing group differences. For one result where a non-significant
p-value appeared to miss significance because of a larger SD in the
placebo group, we were still able to discuss the result in positive
terms. It depends on the nature of your studies, but for us the size
of the differences was as important as the statistical significance.
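
For example, a minimal sketch of Cohen's d in R - the numbers here are
invented purely for illustration, not from any real study:

  # Cohen's d by hand: difference in means over the pooled SD
  m1 <- 12.4; s1 <- 3.1; n1 <- 20   # "treatment" group (invented)
  m2 <- 10.1; s2 <- 4.0; n2 <- 22   # "placebo" group (invented)
  sd_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
  (m1 - m2) / sd_pooled   # ~0.64, medium-to-large by Cohen's benchmarks
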
The other issue that could add to the discussion is the relative
implications of making a Type I versus a Type II error - you could
argue it's worse to mistakenly discard a potentially effective cancer
treatment than to continue developing one that turns out to be no
good.
It may be that reporting in your thesis that you've considered the issues
is enough to satisfy the examiner without having to redo all your
analyses.
Good luck,
Brian Saxby
The article we used to support our approach (of not making family-wise
p-value adjustments) is available from:
http://www.biomedcentral.com/1471-2288/2/8
> Goodness me, there is a lot to say here.
>
> First, congrats on getting to and getting through your viva. This
> sounds like a fairly minor problem to me.
>
> My first thought is that, for at least some of this, you might be
> able to make the examiner happy by discussing the limitations of your
> results - saying things like the p-values should be treated more
> descriptively than inferentially might help. In other words, tone
> down your certainty about your results.
>
> In general I am pretty averse to Bonferroni correction of p-values -
> I have written about that in a book I wrote with Phil Banyard (Using
> Statistics in Your Psychology Degree); if you look on Amazon you
> might be able to search inside and find it.
>
> There are better alternatives to Bonferroni available - things like
> the Benjamini-Hochberg false discovery rate procedure. SAS has a
> procedure (which I forget the name of) which will correct p-values
> using that, or several other procedures. But they aren't hard to do
> by hand - see the sketch below.
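>
> In R the whole thing is one call to p.adjust - here with made-up
> p-values, just to show the shape of it:
>
>   p_raw <- c(0.001, 0.008, 0.020, 0.041, 0.300)  # invented p-values
>   p.adjust(p_raw, method = "BH")  # Benjamini-Hochberg adjustment
>   # By hand: sort ascending, multiply the i-th value by m/i (for m
>   # tests), then enforce monotonicity from the largest down.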
>
> There are several possible solutions to the transformation problem.
> One is to carry out a logit (log-odds) transformation. If p is the
> proportion of correct responses then you could try transforming using
> log(p/(1-p)) - that sometimes works (sketch below).
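>
> A quick sketch of that in R, with invented counts ('correct' and
> 'trials' are placeholder names, not anything from your data). One
> caveat: p = 0 or p = 1 maps to -Inf/+Inf, so if some people get
> everything right or wrong, a small 'empirical logit' nudge is a
> common dodge:
>
>   correct <- c(0, 3, 7, 10)            # invented counts correct
>   trials  <- rep(10, 4)                # out of 10 trials each
>   p <- (correct + 0.5) / (trials + 1)  # keeps p strictly in (0, 1)
>   log(p / (1 - p))                     # the logit transform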
>
> The most appropriate type of regression for proportions is called
> beta regression. I don't think you can do that in SPSS, but it can be
> done in R (using betareg), SAS (using proc genmod, I think) or Stata
> (I forget the function name).
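>
> A minimal betareg sketch - the data frame 'd' and its columns 'prop'
> and 'group' are placeholder names, and the response must lie strictly
> inside (0, 1):
>
>   library(betareg)   # install.packages("betareg") if needed
>   # a common squeeze for exact 0s and 1s: (y * (n - 1) + 0.5) / n
>   m <- betareg(prop ~ group, data = d)
>   summary(m)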
>
> Another approach you might try is to bootstrap your parameter
> estimates. That is a little fiddly in SPSS - look at spsstools.net -
> but it is easier in R or Stata.
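>
> In R the boot package does most of the work. A sketch, assuming a
> data frame 'd' with placeholder columns 'score' and 'group':
>
>   library(boot)
>   diff_means <- function(data, idx) {   # statistic to bootstrap
>     b <- data[idx, ]                    # the resampled rows
>     mean(b$score[b$group == "treat"]) - mean(b$score[b$group == "ctrl"])
>   }
>   bt <- boot(d, diff_means, R = 2000)   # 2000 resamples
>   boot.ci(bt, type = "bca")             # bias-corrected CI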
>
> Another way of thinking about it would be to examine whether it is
> actually a problem, by using a Monte Carlo simulation. Here you
> generate lots of samples with the same distribution as your data but
> where the true differences are zero, and see how often you get a
> significant result. If it is about 5% of the time then you haven't
> got a false-positive problem. You might still have a power problem,
> but power problems are a lot less serious, and power analysis is a
> bit of a black art anyway.
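>
> A small R sketch of that idea - two groups drawn from the same
> skewed, ceiling-heavy distribution (so the true difference is zero),
> tested at the 5% level; all the numbers are invented:
>
>   set.seed(1)
>   pvals <- replicate(5000, {
>     g1 <- rbinom(20, 10, 0.9) / 10   # both groups from the same
>     g2 <- rbinom(20, 10, 0.9) / 10   # skewed distribution
>     t.test(g1, g2, var.equal = TRUE)$p.value  # = 2-group one-way ANOVA
>   })
>   mean(pvals < 0.05)   # near 0.05 means no false-positive problem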
>
> Another way might be to use a logistic approach with a sandwich
> estimator. Here you treat individual trials as the unit of analysis
> and do a logistic regression to find the probability of a correct
> answer. On its own that is wrong, because you have violated the
> independence assumption, but that can be fixed through the use of a
> sandwich estimator. Search Google for 'Huber-White SPSS' and you'll
> find a blog entry of mine that will tell you how to do it. It's a bit
> of a hack in SPSS, really easy in Stata, not hard in R and a bit
> weird in SAS. A similar approach would be to use a binomial
> regression - this is used when you have trials and successes, e.g. I
> tried to do something 4 times and succeeded twice, and you tried 10
> times and succeeded 5 times. There is a smaller standard error on
> your 50% than on my 50%, and binomial regression takes that into
> account. (There are a lot of circumstances where the binomial and the
> logistic sandwich regression will give the same answer.)
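>
> A rough R sketch of both ideas - every data frame and column name
> here ('trials_d', 'person_d', etc.) is a placeholder, not anything
> from your thesis:
>
>   library(sandwich); library(lmtest)
>   # trial-level logistic regression, one row per trial, with
>   # cluster-robust (sandwich) SEs for the clustering within person
>   m1 <- glm(correct ~ group, family = binomial, data = trials_d)
>   coeftest(m1, vcov = vcovCL(m1, cluster = trials_d$id))
>
>   # binomial regression on per-person counts: cbind(successes,
>   # failures) weights each proportion by its number of trials
>   m2 <- glm(cbind(n_correct, n_trials - n_correct) ~ group,
>             family = binomial, data = person_d)
>   summary(m2)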
>
> The first thing I would do is talk to your supervisor, and maybe see
> whether it is appropriate for them to contact the examiner and ask
> what solutions might be acceptable. Something else to consider: you
> might try a more complex approach on one or two of your analyses, and
> if you get substantively similar answers then you can use that as an
> argument that the others are OK.
>
> Hope that helps. Remember that examiners are human too, and they
> aren't out to kill you, or even fail you - if they had wanted to do
> that they could have done it straight away. Apologies for any typos,
> I am on my cellphone.
>
> Jeremy
>
>
>
> On 7/4/08, Jo Fludder <[log in to unmask]> wrote:
>> Hi guys
>>
>> I have just had my viva, and had major corrections. One which is
>> going to cause sleepless nights concerns using ANOVAs on
>> non-normally distributed data.
>>
>> Throughout my thesis I have used many 2 x 2 ANOVAs (plus post-hoc
>> tests on interactions). However, some of my data is non-normally
>> distributed. Also, in some experiments I have unequal sample sizes
>> in conditions (I tried to overcome this by using Games-Howell
>> post-hoc tests). At the time, I tried many different
>> transformations, but none of them were able to alter the data
>> sufficiently. However, I was advised by my supervisor to continue to
>> use ANOVAs anyway, and just put in a disclaimer saying that I had
>> tried all these transformations and none worked.
>>
>> My examiner quickly pointed out that if none of the transformations
>> worked, then the data must be non-normal, and so I should not have
>> gone ahead with using ANOVAs!
>>
>> Could anyone give me some advice on either a) justifying the use of
>> ANOVAs, or b) an alternative that I could use? I have 9 experiments,
>> with many analyses in each one, so any help to ease my suffering
>> would be appreciated.
>>
>> For any of you who are interested in knowing what the examiner said
>> (so that you have a better idea of the problem), I include the
>> comment below.
>>
>> Many thanks in advance.
>>
>> Jo
>>
>> comment from examiner:
>>
>> b. Statistical analyses. First, the nature of the data needs to be
>> more carefully considered. By using the percentage of correct target
>> behaviours as a dependent variable, one assumes that each target
>> trial is equally likely to be responded to, within and between
>> persons. However, it is more likely that once a person has not acted
>> upon three consecutive trials, she would also not respond to the
>> next. The means and standard deviations in fact suggest that in most
>> experiments and conditions a substantial number of persons missed
>> none or very few items, and some missed many or all. This means that
>> the data are not normally distributed and the ANOVA is not
>> appropriate. It also means that the average of individuals within
>> groups is not representative of any individual in the group, and
>> that it does not represent the degree to which each individual is
>> likely to be successful. Thus, the differences in findings between
>> experiments could simply be an artefact caused by slightly varying
>> numbers of individuals not performing the task. At the minimum this
>> needs to be acknowledged and discussed, and more appropriate
>> analyses considered.
>>
>> A second problem is that of family-wise error. Although the
>> Bonferroni correction is briefly mentioned in connection with the
>> data of Experiment 2, there is in general no correction for the
>> large number of analyses conducted when testing for significance. In
>> the first studies there are probably many more tests than
>> individuals examined. The power estimates provided are not
>> informative, and effect sizes would be preferable. The problem here
>> is that significant findings could be due to random sampling effects
>> (which is suggested by the non-replicability of findings across
>> experiments), and non-significant findings could in fact reflect
>> real differences that cannot be detected due to lack of statistical
>> power. Thus, one wants to know, before the analyses, how much power
>> the sample size provides (i.e., effects of what size can be reliably
>> detected), and effect sizes say something about the relative size of
>> the effect given the spread of the data. However, the non-normal
>> distribution of the dependent variables poses a problem here too.
>> Again, this needs careful discussion, and acknowledgement of the
>> limitations of the study.
>>
>
>
> --
> Jeremy Miles
> Learning statistics blog: www.jeremymiles.co.uk/learningstats
> Psychology Research Methods Wiki: www.researchmethodsinpsychology.com
>