Thank you very much for the responses to the following question:
>When I compare observed and estimated frequencies, the chi-square test is
>not independent of the unit of scale. The frequencies represent the
>interregional migration flow and the values are large. This implies that the
>test is always significant (this means that I cannot accept the model).
>Is there another test? What can I do?
I would like to share all the information; hopefully it can be as
useful to someone else.
>Here's a summary:
>1) Basically, what is happening here is that on very large numbers, the
>chi-square test (and indeed any test) is sensitive enough to detect very
>small effects. Rather than putting the main emphasis on hypothesis
>testing - which asks the question "Is there evidence for a
>difference?" - it's worth considering characterising your association
>or difference by some relevant quantity, and putting a confidence
>interval around this. For a simple 2 by 2 table (cells a, b in the first
>row and c, d in the second, one column per group)
>the most usual measures are:
>Rate difference a/(a+c) - b/(b+d)
>Rate ratio (a/(a+c))/(b/(b+d))
>Odds ratio ad/bc.
>The rate difference and rate ratio are appropriate when you are
>contrasting two groups, whose sizes (a+c and b+d) are given. The
>odds ratio is for when the issue is association rather than
>difference. Confidence interval methods are available for all of
>these - though not as widely available in software as they should be. If
>the hypothesis test is highly significant, the confidence interval
>will be well away from the null hypothesis value (0 for the rate
>difference, 1 for the rate ratio or odds ratio).
>Hope this helps.
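As an illustration of reply 1, here is a minimal Python sketch of the three measures with approximate 95% intervals. The interval formulas (a Wald interval for the rate difference, log-scale intervals for the two ratios) are my own choice of simple large-sample methods, not something the reply specified. The function name and the counts are hypothetical.

```python
import math

def two_by_two_measures(a, b, c, d, z=1.96):
    """Return (estimate, (lower, upper)) for three 2 by 2 table measures.

    Assumed layout: group 1 has a events and c non-events,
    group 2 has b events and d non-events.
    """
    n1, n2 = a + c, b + d
    p1, p2 = a / n1, b / n2

    # Rate difference with a simple Wald interval.
    rd = p1 - p2
    se_rd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)

    # Rate ratio: interval built on the log scale, then back-transformed.
    rr = p1 / p2
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    rr_ci = (rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr))

    # Odds ratio: log-scale interval using the sum of reciprocal cell counts.
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))

    return {
        "rate_difference": (rd, rd_ci),
        "rate_ratio": (rr, rr_ci),
        "odds_ratio": (odds_ratio, or_ci),
    }

# Hypothetical counts: two groups of one million each.
for name, (estimate, (low, high)) in two_by_two_measures(
        51_000, 50_000, 949_000, 950_000).items():
    print(f"{name}: {estimate:.5f} (95% CI {low:.5f} to {high:.5f})")
```

With counts this large the intervals are narrow and exclude the null values, yet the estimates themselves show how small the difference is, which is the point of reporting them.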
>2) This seems to me to be the old story that when one has a very large
>number of observations then almost every null hypothesis will be
>rejected. It comes down to deciding whether the differences are practically
>important rather than statistically significant. For example if you are
>dealing with migration flows of hundreds of thousands or more and your
>model is only a thousand or so out in predicting the observed values
>then I would say that it was pretty good whatever a chi-squared test
>says. In my main field of interest which is medical statistics we
>distinguish carefully between statistical significance and medical
>importance on the grounds that a researcher by conducting a large enough
>study will always come up with a significant difference but the
>difference may not be medically important.
>I hope this helps. If not, you might want to tell us more about your
>problem so that we are able to give better advice.
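To make reply 2 concrete, the sketch below compares a Pearson chi-square statistic with the largest relative prediction error for three made-up flows. The numbers are hypothetical. The 2 degrees of freedom assume three categories with a fixed total and no other fitted parameters, which would not hold for a model with estimated parameters.

```python
import math

def pearson_chi2(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical flows; both lists sum to 1,000,000.
expected = [200_000, 300_000, 500_000]   # model predictions
observed = [201_000, 299_500, 499_500]   # observed counts

chi2 = pearson_chi2(observed, expected)

# For 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Practical measure: the worst relative error over the three flows.
max_rel_error = max(abs(o - e) / e for o, e in zip(observed, expected))

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
print(f"largest relative error = {max_rel_error:.2%}")
```

The test rejects at the 5% level even though no flow is predicted worse than half a percent out.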
>3) It is a fact of life that null hypotheses are (almost?)
>never exactly true, and with large enough data sets you
>will inevitably reject the null hypothesis. You certainly
>should not look for another test which fails to reject the
>NH. This would simply indicate poor power for the test,
>not a correct NH.
>However, what *is* acceptable is to say that although you
>know that the NH is not true, the observed data are not
>so different as to be of practical (rather than statistical)
>significance. Hence you are prepared to work with the NH
>as a reasonable approximation to reality.
>4) The chi-square test in a contingency table of counts examines the null
>hypothesis of proportionality. The counts are assumed to behave as Poisson
>variates. As is reasonable, the power of the test increases with the amount
>of information, i.e. with the size of the counts. The only correct scale
>for the test is that on which the items in the table were counted. If you
>convert these to rates, (say per million) and then treat these as though
>they were counts, you will be artificially inflating the significance, so
>you are getting the wrong answer. If the data come from unequal sized
>populations and it is necessary to treat them as rates, you can still derive
>a correct test, which essentially treats your population sizes as weights.
>Look in a statistics book for a test comparing proportions or rates.
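The scale point in reply 4 can be sketched as follows. The event counts and population sizes are hypothetical. The "correct" test used here is the ordinary pooled two-proportion test, which is one standard choice and not necessarily the one the reply had in mind.

```python
import math

def chi2_df1_pvalue(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

def two_proportion_chi2(x1, n1, x2, n2):
    """Squared z statistic of the pooled two-proportion test (1 df)."""
    p_pooled = (x1 + x2) / (n1 + n2)
    variance = p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2)
    return (x1 / n1 - x2 / n2) ** 2 / variance

# Hypothetical data: events counted in two populations of unequal size.
events = (30, 40)
populations = (50_000, 80_000)

# Wrong: convert to rates per million and treat the rates as if they were counts.
rates_per_million = [1e6 * x / n for x, n in zip(events, populations)]
mean_rate = sum(rates_per_million) / len(rates_per_million)
naive_chi2 = sum((r - mean_rate) ** 2 / mean_rate for r in rates_per_million)

# Better: test on the scale the data were counted on, using the population sizes.
correct_chi2 = two_proportion_chi2(events[0], populations[0],
                                   events[1], populations[1])

print(f"rates as counts: chi-square = {naive_chi2:.2f}, "
      f"p = {chi2_df1_pvalue(naive_chi2):.4f}")
print(f"counts and sizes: chi-square = {correct_chi2:.2f}, "
      f"p = {chi2_df1_pvalue(correct_chi2):.4f}")
```

The rescaled version looks highly significant only because multiplying up to "per million" pretends there is far more information than 70 events actually carry.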
>A good general way to treat rates, particularly if you have a complex
>structure in the data set, is to model them within the GLM framework. The
>deviance-based tests are equivalent to chi-square tests. You can do this in
>GLIM, Genstat, S-Plus at least, and probably many other packages.
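As a small check on the remark about deviance, the sketch below computes both the Pearson statistic and the deviance for the independence model on a made-up 2 by 3 table of counts. It does not call a GLM routine: it relies on the fact that the fitted values of the Poisson log-linear model with row and column main effects are row total times column total over grand total. The table and function names are hypothetical.

```python
import math

def independence_fit(table):
    """Fitted counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def pearson_and_deviance(table):
    """Return (Pearson chi-square, deviance) for the independence model."""
    fitted = independence_fit(table)
    pairs = [(o, e)
             for obs_row, fit_row in zip(table, fitted)
             for o, e in zip(obs_row, fit_row)]
    pearson = sum((o - e) ** 2 / e for o, e in pairs)
    # Poisson deviance; the sum of (o - e) terms vanishes because the fitted
    # and observed totals agree, so only the o * log(o / e) part remains.
    deviance = 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    return pearson, deviance

# Hypothetical 2 by 3 table of counts.
flows = [[1200, 800, 500],
         [1100, 900, 450]]

pearson, deviance = pearson_and_deviance(flows)
print(f"Pearson chi-square = {pearson:.3f}, deviance = {deviance:.3f}")
```

The two statistics agree closely here, as expected when the cell counts are large. For models with more structure a GLM package would do the fitting.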
>Finally, don't get too hung up about significance. If you compare two very
>large populations, you can find that a very small difference in rates is
>statistically significant. But if it is very small, is it important?
>Depends, of course, on what you are looking at, and in what context.
>Dr Brian G Miller