Dear all,
The uncertainty coefficient (or Theil's U) is a popular measure of association of
categorical variables derived from the field of information theory. I am looking for a
proper reference for this measure.
Many "recent" websites, books and statistical software packages describe it in its symmetrical or asymmetrical form (see for example:
http://books.google.com/books?id=1aAOdzK3FegC&printsec=frontcover&dq=numerical+recipes#PPA761,M1 or http://faculty.chass.ncsu.edu/garson/PA765/assocnominal.htm or http://www.jmu.edu/docs/sasdoc/sashtml/stat/chap28/sect20.htm).
They usually refer to a number of "old" works, including Shannon's initial work in
information theory [1], McGill's original article on using this theory as a tool for measures
of association [2], Theil's 1972 book, which goes as far as introducing the "expected
mutual information" J(X,Y) = H(X)+H(Y)-H(X,Y) [3], and Goodman and Kruskal's 1979
classic (which in fact cites McGill's work only very briefly, in their 1959 paper) [4].
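To make clear which equations I mean, here is a minimal Python sketch of the forms as I understand them from the "recent" sources: the asymmetric versions divide J(X,Y) by the entropy of the variable being predicted, and the symmetric version divides 2*J(X,Y) by H(X)+H(Y). The function names, the dictionary keys and the choice to return 0 when an entropy is zero are my own assumptions for illustration, not taken from any of the references.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of an iterable of probabilities; zeros are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def uncertainty_coefficients(table):
    """Compute J(X,Y) and the uncertainty coefficients from a contingency table.

    `table` is a list of rows of counts; rows index X, columns index Y.
    """
    n = sum(sum(row) for row in table)
    joint = [[count / n for count in row] for row in table]
    px = [sum(row) for row in joint]           # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]     # marginal distribution of Y

    hx = entropy(px)
    hy = entropy(py)
    hxy = entropy(p for row in joint for p in row)
    j = hx + hy - hxy                          # J(X,Y) = H(X) + H(Y) - H(X,Y)

    # Assumption: a coefficient is reported as 0.0 when its denominator is zero.
    return {
        "J": j,
        "U(X|Y)": j / hx if hx > 0 else 0.0,
        "U(Y|X)": j / hy if hy > 0 else 0.0,
        "U_sym": 2 * j / (hx + hy) if hx + hy > 0 else 0.0,
    }
```

For example, `uncertainty_coefficients([[10, 0, 0], [0, 5, 5]])` describes a table in which Y determines X completely but X only partly determines Y, so the two asymmetric values differ.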
The funny thing is that none of these "old" references actually uses the name "uncertainty
coefficient" or the letters "U" or "UC". None of them, not even [3], actually provides the
equations described in the "recent" works, in either the symmetrical or the asymmetrical
version. After spending a lot of time trying to find who originated the UC
name and equations, I am about to give up and decide, like everyone else, to
cite Theil's 1972 book as the one and only source. This message is my last effort to
find the elusive scientist who actually formalized the "Uncertainty Coefficient U". Could
anyone help me?
Best regards,
Alex
[1] Shannon C. and Weaver W., 1949. "The Mathematical Theory of Communication".
Urbana, IL: University of Illinois Press.
[2] McGill W.J., 1954. "Multivariate information transmission". Psychometrika, 19, 97-116.
[3] Theil H., 1972. "Statistical Decomposition Analysis". Amsterdam: North-Holland
Publishing Company.
[4] Goodman L. and Kruskal W., 1979. "Measures of Association for Cross Classifications".
Springer-Verlag. 146 p.