Jesper Andersson wrote:

> Dear Torben and Alexandre,
>
> > I found an error in my previous mail, sorry. So once more:
> >
> > I have been looking at how similar columns in a multiple regression
> > model need to be, in order not to be estimable.
> >
> > The design matrix I have looked at is: X=[c1 c2 c3 mu]
> >
> > c1 and c2 were regressors with std=1 and mean=0, c3=c2+e*r (r is a
> > random vector with mean=0 and std=1), and I have then
> > looked at the degrees of freedom as a function of "e". When "e" gets
> > below ~ 10^-13 the columns c2 and c3 become inestimable, and the number
> > of degrees of freedom increases by one.
> > That is a bit late, I think, especially because multiple regression could
> > be used for designs having low degrees of freedom. Wouldn't one expect
> > that the degrees of freedom available for estimating the contrast
> > belonging to c1 should be the same no matter if e=0 or e=10^-13  ?
> >
>
> My understanding is that you would like to see some "soft transition" from
> df=n-4 to df=n-3 as e goes toward zero, right?
> I think the crucial point is that the space that is spanned by the design
> matrix will be given by [c1 c2 r mu] for any design of the form [c1 c2 c2+e*r
> mu] for as long as e is above the floating point tolerance of Matlab. It
> really doesn't matter how "large" (or small) a regressor is, it will still
> contribute one dimension to the design space.  Not until the regressor
> "vanishes completely" will the dimensionality of the design space decrease.
> However, when the regressor (or its variance) gets very small, the
> corresponding parameter estimates will get very large.
> If you look at the error with which the corresponding parameters are
> estimated you will see that as they get more and more collinear the
> corresponding errors will increase (as seen by the second and third element on
> the diagonal of inv(X'*X)). When e (in your example) is very small the
> covariance matrix of your parameter estimates will look something like
> [sn sn sn sn; sn BN -BN sn; sn -BN BN sn; sn sn sn sn]
> where sn denotes small number and BN denotes BIG NUMBER , i.e. the error in
> the estimates of the parameters corresponding to c2 and c3 will be very
> large, and negatively correlated. So, although they are "in principle"
> estimable, they will have very little meaning separately.
> I guess you are right though, it would be nice to have some sort of "warning"
> not only when columns are unestimable, but also when the error of the
> estimate of the corresponding parameter is so large as to render it "silly"
> or "meaningless". Still, that would also have to involve some "secret number"
> as a threshold for when the warning is to be issued. I guess common sense
> will still be the best guard against silly designs.
>
> Good luck Jesper

Dear Jesper

Thank you very much for your reply; you seem to have got my point. So now for
the explanation of why I would like non-integer df's. In second-level analyses
one uses contrast images from different persons, who are likely to have
different degrees of paradigm-related motion. If one of the motion parameters
has a correlation with the paradigm above some threshold (say 0.4), I
consider the contrast image belonging to the paradigm too uncertain, and exclude
it from the second-level analysis. If the correlation of one of the motion
parameters with the paradigm is close to this threshold, I would like to let
this information go into the second-level analysis, as a covariate. Clearly the
six motion parameters are not orthogonal, but in the current setup of the df
calculation they will take 6 degrees of freedom. If I were to include a single
correlation coefficient in the second-level analysis I would need some a priori
knowledge of how to weight the different parameters, and here my common sense is
not good enough.
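
As a footnote to the collinearity argument quoted above, here is a minimal
sketch of how the error on the c3 parameter grows as e shrinks. It is written
in plain Python rather than Matlab, and the function names, the sample size
n=50 and the fixed random seed are all illustrative assumptions, not anything
taken from SPM. It relies on the standard identity that the diagonal element of
inv(X'*X) belonging to a column equals one over the squared norm of what is
left of that column after the other columns have been regressed out.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def residual_after_projection(target, others):
    """Part of `target` orthogonal to span(others), by modified Gram-Schmidt.

    Assumes the columns in `others` are linearly independent.
    """
    basis = []
    for col in others:
        w = list(col)
        for q in basis:
            k = dot(w, q)
            w = [wi - k * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    res = list(target)
    for q in basis:
        k = dot(res, q)
        res = [ri - k * qi for ri, qi in zip(res, q)]
    return res


def variance_factor(e, n=50, seed=0):
    """Diagonal element of inv(X'*X) for c3 in X = [c1 c2 c2+e*r mu].

    Hypothetical helper: c1, c2 and r are drawn from a standard normal
    (so mean and std are only approximately 0 and 1), mu is a constant column.
    """
    rng = random.Random(seed)
    c1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    r = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c3 = [a + e * b for a, b in zip(c2, r)]
    mu = [1.0] * n
    res = residual_after_projection(c3, [c1, c2, mu])
    return 1.0 / dot(res, res)


if __name__ == "__main__":
    for e in (1.0, 1e-2, 1e-4):
        print(e, variance_factor(e))
```

For c3 = c2 + e*r the leftover part is e times the part of r that is orthogonal
to [c1 c2 mu], so the factor scales as 1/e^2. It stays finite (the column still
contributes a dimension), but it becomes enormous long before e reaches the
floating point tolerance.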

Torben