I think this is a theoretical question as much as a practical preprocessing question.
I am not a mathematician, so apply the appropriate amount of salt to this answer.
The implicit model is:
Y=Cx + Qe.
where Y is the normalized signal from a few adjacent voxels, x is an underlying signal of interest, C is a mixing vector, and e is a matrix of additive error mixed by Q. We're assuming we're looking for a single signal.
If you take the mean of Y as an estimate of x, you are assuming that there is a single x represented in each y to exactly the same degree (C is a constant vector), and that the e's are normally distributed, uncorrelated in time, uncorrelated across observations (Q is diagonal), uncorrelated with x, and of equal variance for every y. If the Y's are not normalized, the mean assumes the elements of C are scaled by the variances of Y, but the errors still have the same variance.
If you take the data projected on the first eigenvector, you are assuming the remaining eigenvectors with non-zero eigenvalues are noise. The signal x is not represented equally across voxels (C is not a constant vector), but x is the largest unique source of variance in Y. The remaining e's are all uncorrelated with equal variance, though Q is not constant and the different noise sources can be shared across voxels at different levels.
If you take the first factor (from factor analysis), you are assuming that the e's are normal and uncorrelated, but with potentially different variances. Q would be diagonal, I think.
Other options would be ICA, which relaxes the normality assumption (or equivalently changes Cx to a nonlinear function), or some sort of linear dynamic system to relax the uncorrelated-through-time assumption (partitioning the state space into signal and noise).
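To make the comparison concrete, here is a rough simulation sketch of the model Y = Cx + Qe, comparing the mean estimate with the first-eigenvector estimate. Everything numeric in it is an assumption for illustration only (6 voxels, 200 time points, the particular loadings in C, a diagonal Q with equal noise scale); it is not taken from any real data set, and factor analysis is left out to keep it short.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vox, n_time = 6, 200                      # hypothetical ROI size and run length

# Assumed ground truth: one signal x, unequal loadings C, diagonal Q.
x = rng.standard_normal(n_time)
C = np.array([1.0, 0.9, 0.8, 0.5, 0.3, 0.2])
Q = np.diag(np.full(n_vox, 0.5))
e = rng.standard_normal((n_vox, n_time))

Y = np.outer(C, x) + Q @ e                  # voxels x time
# Normalize each voxel's time series (zero mean, unit variance).
Y = (Y - Y.mean(axis=1, keepdims=True)) / Y.std(axis=1, keepdims=True)

# Estimate 1: plain mean across voxels (treats C as a constant vector).
x_mean = Y.mean(axis=0)

# Estimate 2: projection on the first eigenvector (voxels weighted unequally).
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
w = U[:, 0]
if w.sum() < 0:                             # the sign of an eigenvector is arbitrary
    w = -w
x_pca = w @ Y

r_mean = np.corrcoef(x, x_mean)[0, 1]
r_pca = np.corrcoef(x, x_pca)[0, 1]
r_both = np.corrcoef(x_mean, x_pca)[0, 1]
print("corr(x, mean estimate):", round(r_mean, 3))
print("corr(x, PCA estimate): ", round(r_pca, 3))
print("corr(mean, PCA):       ", round(r_both, 3))
print("first eigenvector:     ", np.round(w, 3))
```

Changing C to a constant vector, or giving Q unequal diagonal entries, is a quick way to see which assumption each estimate leans on.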
If you smooth, there is a second mixing matrix G such that
Y=G(Cx + Qe)
which will likely have the effect of making the PCA, FA, etc. models look a lot more like the first (mean) model. Still, the notion that all voxels represent the signal x to an exactly equal degree seems a bit dodgy to me. Since the PCA and mean solutions are often correlated, the PCA solution is likely better theoretically. Check the actual eigenvector to ensure that all voxels have positive values (I can't think of a case I've ever seen where they don't). If they are all positive, the "different voxels for different people" problem is avoided.
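A rough sketch of the eigenvector check, and of what a smoothing matrix G does to the loadings, might look like the following. The helper names (first_eigenvector, gaussian_mixing_matrix, uniformity), the 1-D Gaussian kernel standing in for G, and all the simulated numbers are my own assumptions for illustration; a real smoothing kernel acts in 3-D and your ROI will differ.

```python
import numpy as np

def first_eigenvector(Y):
    """Leading eigenvector of the voxel-by-voxel covariance of Y (voxels x time)."""
    Yc = Y - Y.mean(axis=1, keepdims=True)
    cov = Yc @ Yc.T / (Y.shape[1] - 1)
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    w = vecs[:, -1]
    return w if w.sum() >= 0 else -w        # fix the arbitrary sign

def gaussian_mixing_matrix(n_vox, sigma):
    """Hypothetical 1-D stand-in for G: row-normalized Gaussian weights."""
    idx = np.arange(n_vox)
    G = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    return G / G.sum(axis=1, keepdims=True)

def uniformity(w):
    """Cosine between w and a constant vector; 1.0 means equal loadings."""
    return float(abs(w.sum()) / (np.linalg.norm(w) * np.sqrt(w.size)))

rng = np.random.default_rng(1)
n_vox, n_time = 8, 500                      # assumed sizes
x = rng.standard_normal(n_time)
C = np.linspace(1.0, 0.3, n_vox)            # assumed unequal loadings
Y = np.outer(C, x) + 0.5 * rng.standard_normal((n_vox, n_time))

G = gaussian_mixing_matrix(n_vox, sigma=2.0)
w_raw = first_eigenvector(Y)
w_smooth = first_eigenvector(G @ Y)         # Y = G(Cx + Qe)

print("all loadings positive (raw):     ", bool(np.all(w_raw > 0)))
print("all loadings positive (smoothed):", bool(np.all(w_smooth > 0)))
print("uniformity raw vs smoothed:      ", uniformity(w_raw), uniformity(w_smooth))
```

A uniformity value near 1 means the first eigenvector is close to a constant vector, i.e. the PCA estimate is close to the plain mean.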
Take a look at Roweis and Ghahramani (1999), "A Unifying Review of Linear Gaussian Models", in Neural Computation, for a better discussion of the above.
Just my $0.02.
Jason F. Smith, Ph.D.
Brain Imaging and Modeling Section
National Institutes of Health
Rm 8S235B, 10 Center Dr.
Bethesda MD 20892-1407