Dave,
 
This is very interesting.  I'm an experimental psychologist by training and came to statistics rather late (age 36) in my career as a major academic interest.  I would be most interested in any more of these "historical notes" you may have.  My colleague P. Michial Politano and I are writing a beginner's text, "Statistics and Experimental Design," for Allyn and Bacon.  I have found in over 35 years of teaching that students remember more when they know the historical context for a discovery that is now a "point and click".  I would greatly appreciate any reference you have to this historical material, even if that reference is you!
 
However, for my personal interest, I wonder if you could tell me why 6 and (less often) 12 show up so often as constants in non-parametric statistics?
 
Looking Forward,

Dennis
[In the beginning there was the Sponse...

[Dennis L. Edinger, Ph.D.]

-----Original Message-----
From: Concerned with the initial learning and teaching of statistics [mailto:[log in to unmask]]On Behalf Of Saville, Dave
Sent: Monday, May 28, 2001 10:13 PM
To: [log in to unmask]
Subject: R A Fisher 'discovered' degrees of freedom !

Hi Pedro, Erich and all!  My understanding is that the concept of 'degrees of freedom' became necessary in the 1915-1935 period when statistical methods were being introduced which took account of the effect of sample size (methods such as paired and independent samples t tests, regression and analysis of variance).  Prior to that, methods were "large sample size" methods. 
 
The first "small sample size" method was Student's t test, introduced by Gosset (pseudonym Student).  Sir Ronald Fisher produced an elegant proof for this (paired samples) t test using n-dimensional geometry.  In n-space, one direction is associated with the mean, and the remaining (n-1) perpendicular directions are associated with the variance.  To estimate the mean, you project the "data vector" onto the first direction.  To estimate the variance, you project the "data vector" onto each of these (n-1) directions, square the projection lengths, and average (or sum and divide by n-1).  This is the concept - the arithmetic can be rearranged to the formula given by Erich.  Anyway, the (n-1) is EXACT, not an approximate entity.  This is very obvious when you look at the geometry, but not obvious other ways.....
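The projection argument above can be checked numerically. The sketch below (my own illustration with made-up data, using NumPy; it is not from Fisher or the books mentioned later) projects a data vector onto the equal-coordinates "mean direction" and shows that the squared length of what remains, divided by n-1, is exactly the usual sample variance.

```python
import numpy as np

# A hypothetical data vector in n-space (n = 5 observations).
y = np.array([4.0, 7.0, 5.0, 6.0, 8.0])
n = len(y)

# Unit vector in the "mean direction" (all coordinates equal).
u = np.ones(n) / np.sqrt(n)

# Project the data vector onto the mean direction.
mean_vector = (y @ u) * u        # has xbar in every coordinate
error_vector = y - mean_vector   # lies in the (n-1)-dimensional complement

xbar = mean_vector[0]
# Total squared projection length onto the n-1 error directions,
# divided by n-1: exactly the usual sample variance.
s2 = error_vector @ error_vector / (n - 1)

print(xbar, s2)  # 6.0 2.5, matching np.mean(y) and np.var(y, ddof=1)
```

The point of the geometry is visible here: the error vector has only n-1 directions in which it is free to vary, which is why n-1, not n, is the natural divisor.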
 
Fisher found that his geometric proofs were not universally understood, so he introduced the words "degrees of freedom," "sums of squares" and so on, and gave algebraic formulae for entities which were really dimensions of subspaces, sums of squared projection lengths, and so on. 
 
The idea of thinking in n-space may sound daunting, but it's not too bad really.  I start off in 2-space, then 3-space, to get the basic ideas, then n-space follows OK.  All the geometry you need is taught in the first few weeks of linear algebra at university first year level (or I teach it to agriculturalists in one 50-minute session).
 
Each year I run a non-geometric, heuristically-based introductory "Basic Stats" workshop which lasts for 3 whole days, for workers in agricultural research.  When I cover estimation of the variance, I too use the idea that Erich mentioned, that:
"sum(xi-xbar)^2 < sum(xi-m)^2 except when xbar = m"
and hence the left side needs a smaller divisor (n-1 instead of n).  However, I also mention that the n-1 comes from the geometry, and draw a right-angled triangle depicting data vector (hypotenuse), mean vector and error vector.  I mention that Pythagoras' Theorem a^2 = b^2 + c^2 gives the various "sums of squares," and the fact that the n-1 is the dimension of the subspace in which the "error vector" can vary.  I say that further explanation is outside the scope of the workshop - nevertheless, people really like to know that there is some decent maths behind all the methods that I teach them in an intuitive sort of way, and I always get good feedback on this aspect.
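Both facts mentioned here, Erich's inequality and the right-angled-triangle decomposition, are easy to verify numerically. This is a minimal sketch with arbitrary simulated data (my own illustration, not from the workshop materials):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=10)
xbar = y.mean()

# Erich's inequality: sum(xi - xbar)^2 < sum(xi - m)^2
# for any m != xbar (equality only when m = xbar).
for m in (xbar - 1.0, xbar + 0.5):
    assert np.sum((y - xbar) ** 2) < np.sum((y - m) ** 2)

# Pythagoras for the triangle: data vector (hypotenuse),
# mean vector, error vector, i.e. a^2 = b^2 + c^2.
mean_vec = np.full_like(y, xbar)
error_vec = y - mean_vec
assert np.isclose(y @ y, mean_vec @ mean_vec + error_vec @ error_vec)
```

The second assertion is the "sums of squares" decomposition in vector form: the squared length of the data vector splits exactly into the mean and error pieces because the two vectors are perpendicular.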
 
In my view, the basic maths to which I refer has not been properly described in a down-to-earth practical manner.  So my friend Graham Wood and I wrote a couple of books about it.  The introductory one is as follows:
Saville, D. J.; Wood, G. R. (1996).  Statistical Methods: A Geometric Primer.  New York: Springer-Verlag, 268 pp.  ISBN 0-387-94705-1.
It describes the maths behind paired and independent samples t tests, regression and analysis of variance, starting in 2-space and building up.  If anyone is interested further, email me if you like.

Dave Saville
Biometrician        Phone: +64-3-983 3978
AgResearch        Fax:     +64-3-983 3946
Gerald Street      Email:   [log in to unmask]
P O Box 60
Lincoln 8152, Canterbury
New Zealand