Very nice demonstration Jeremy - thanks. There are times when I have felt forced to use nonparametric tests due to having 'unfixable' distributions, when I would have much preferred to use parametric tests. However, your demonstration makes me think that perhaps there is a valid argument for having less strict criteria for the use of parametric tests e.g. using a Kolmogorov-Smirnov threshold of p<.01 instead of .05. Would you agree? Or would taking out the Kolmogorov and leaving just the Smirnov lead to statistical disinhibition, anarchy, and the downfall of SPSS as we know it?
________________________________________
From: Jeremy Miles [[log in to unmask]]
Sent: 13 February 2013 19:07
To: [log in to unmask]
Subject: Re: Continuous to Categorical
On 13 February 2013 02:31, James Alvarez <[log in to unmask]> wrote:
Nice demonstration! I mean what I say strictly generally though - reducing information is super common, e.g. taking medians from reaction time distributions, filtering EEG signal, taking yearly over monthly averages.
Yep, and those should be avoided where possible, although it's not always possible. Also, taking the total score from a questionnaire - why not use the items? It will be more powerful if you do. (One reason not to use the items is that the model won't run, because when you use the items, you test a bunch of assumptions, and those assumptions are almost always wrong. If you use the totals, you just ignore the items - and those assumptions along with them.)
Even the t-test can be seen as reducing information: it takes a bunch of numbers and spits out a single p value.
I don't think that's the same thing, because you're testing a hypothesis, not summarizing data.
I would also wonder if your demonstration would stand for highly asymmetric / bimodal data - such as what you might expect from time to time in Likert scales, which get lumped in as continuous data. Also, whilst it may lose power, as you show, when people want to do it there's generally a reason which can't be captured adequately by a simple and clear analytic process such as your demonstration (e.g. a priori knowledge of effect size and the location of an effect).
The cool thing about questions like this is that it's an empirical question. And we can answer it. (Another cool thing is that we might learn something along the way. And another is that we can set the simulation up so it looks like the data we have, and then find out what happens with data like ours.)
Let's generate some highly skewed data and find out.
I'll generate data using a binomial distribution with n = 10 and p = 0.1 in one group (0.15 in the other).
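The generation step isn't in the code listing at the bottom (that listing is the later small-sample run), but as a minimal sketch of what I mean - assuming 500 per group, with p = 0.1 in one group and 0.15 in the other - it's just:

y0 <- rbinom(n = 500, size = 10, prob = 0.10)   # group 0
y1 <- rbinom(n = 500, size = 10, prob = 0.15)   # group 1 (assumed to be the 0.15 group)
table(c(y0, y1))                                # look at the shape of the outcome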
Here's my table of data the first time I generated it:
  0   1   2   3   4   5
448 372 140  31   7   2
Pretty skewed, I think we'll agree.
Let's analyze it 1000 times and see what happens.
I'm going to analyze it 4 ways:
Poisson regression: This is the best for this type of data, as it's a count.
Linear regression (=t test)
Ordinal logistic regression - also pretty good, given that it's ordinal and skewed.
Logistic regression - I'm going to make it into zero / not zero and categorize it. (I'll put the R code at the bottom).
I'm going to have samples of 1000. This may not generalize to other sample sizes so if you want to know how it relates to your data, you should run it with the sample sizes you have.
Here are my results:
Each of these is the power - the probability of getting a significant result, given that there really is a difference between the groups.
> mean(p.Reg < 0.05)
[1] 0.921
> mean(p.Poisson < 0.05)
[1] 0.914
> mean(p.Logistic < 0.05)
[1] 0.797
> mean(p.Polr < 0.05)
[1] 0.908
So regular regression / t-test was the best (that surprises me), Poisson and ordinal logistic were close, and logistic was the worst.
But power isn't the only issue. A second issue is the type I error rate - how often we get a false positive when the null hypothesis is really true.
Let's rerun the analysis but setting the null hypothesis to be true, so we DON'T want significant results.
We expect these values to all be (roughly) 0.05.
> mean(p.Reg < 0.05)
[1] 0.052
> mean(p.Poisson < 0.05)
[1] 0.038
> mean(p.Logistic < 0.05)
[1] 0.056
> mean(p.Polr < 0.05)
[1] 0.051
They're all pretty close - probably so close we don't need to worry.
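As a rough yardstick for "close enough": with 1000 replications, the Monte Carlo standard error of an estimated rate whose true value is 0.05 is about

sqrt(0.05 * 0.95 / 1000)   # roughly 0.007

so all four of the values above are within a couple of standard errors of the nominal 0.05.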
But what if our sample were smaller (and the null hypothesis were still true)? I'll set the total sample size to 50 (25 per group) and skew the data more:
Here's one of the distributions.
> table(x$y)
 0  1  2  4
35 12  2  1
That's pretty horribly skewed. How do we do?
> mean(p.Reg < 0.05)
[1] 0.06
> mean(p.Poisson < 0.05)
[1] 0.044
> mean(p.Logistic < 0.05)
[1] 0.06
> mean(p.Polr < 0.05)
[1] 0.054
Regression/t-test and logistic are both a touch high, but only a touch. In terms of type I errors, it's not really a problem.
Lesson: We worry too much about normal distributions.
and so I am still saying that stating it's always a bad idea to reduce information is too Procrustean for my liking!
I think we agree on that. Just to prove it, here's a paper where I dichotomized (and had to argue with reviewers about it).
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3353018/pdf/nihms346858.pdf
Jeremy
Here's the R code that I used:
require(MASS)
require(Hmisc)
require(ordinal)
# Predictor: two groups of 25, coded 0 and 1
x <- as.data.frame(c(rep(0, 25), rep(1, 25)))
names(x) <- 'x'
# Storage for the p values from the four analyses, one per simulated dataset
p.Poisson <- array(rep(NA, 1000))
p.Reg <- array(rep(NA, 1000))
p.Logistic <- array(rep(NA, 1000))
p.Polr <- array(rep(NA, 1000))
for(loop in c(1:1000)){
  # Generate the skewed outcome (as posted, this is the small-sample null scenario:
  # the same probability in both groups)
  y0 <- rbinom(n=25, size = 20, prob=0.02)
  y1 <- rbinom(n=25, size = 20, prob=0.02)
  x$y <- c(y0, y1)
  # [8] picks out the p value for the x coefficient from the 2 x 4 coefficient table
  p.Reg[loop] <- summary(glm(y ~ x, data=x))$coef[8]
  p.Poisson[loop] <- summary(glm(y ~ x, data=x, family="poisson"))$coef[8]
  p.Logistic[loop] <- summary(glm(y > 0 ~ x, data=x, family="binomial"))$coef[8]
  # For the ordinal model, the last entry of the coefficient table is the p value for x
  p.Polr[loop] <- summary(clm(as.factor(y) ~ x, data=x))$coefficients[length(summary(clm(as.factor(y) ~ x, data=x))$coefficients)]
}
mean(p.Reg < 0.05)
mean(p.Poisson < 0.05)
mean(p.Logistic < 0.05)
mean(p.Polr < 0.05)
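A note on the code: as posted it's the last scenario (null hypothesis true, 25 per group, prob = 0.02 in both groups). The earlier runs are the same loop with different group sizes and rbinom() settings - as a sketch, assuming 500 per group and p = 0.1 vs 0.15 for the power run:

x <- as.data.frame(c(rep(0, 500), rep(1, 500)))   # 500 per group instead of 25
names(x) <- 'x'
# and inside the loop:
y0 <- rbinom(n = 500, size = 10, prob = 0.10)
y1 <- rbinom(n = 500, size = 10, prob = 0.15)     # make both probs equal for the type I error (null) runs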