"The material I had read before (e.g., www.utdallas.edu/~herve/abdi-contrasts2010-pretty.pdf) just said that a linear combination of data groups is a contrast if the mean of the coefficients is zero. According to Ch7, which I'm inclined to trust more, a vector c is a contrast vector if it is a linear combination of the rows of the design matrix X and is invariant under re-parameterization of the model. Mathematically, the second criterion is that c is unchanged when post-multiplied by (MATLAB notation) pinv(X'X)X'X."
First, a contrast is a linear combination of the parameters (i.e., the betas), not of the rows of the design matrix. The requirement you quote, that the weight vector c be a linear combination of the rows of X, is an estimability condition on c; it is equivalent to the one derived below, since the row space of X is the orthogonal complement of N(X).
Second, I think there's a simpler way to look at estimable contrasts. If the design matrix isn't "degenerate," then all contrasts are estimable. What's degenerate? The design matrix is degenerate if it has a nontrivial null space N(X). That is, there is some beta, not all zeros, such that X*beta = 0. (NB: a more "official" definition is in Chapter 7 of HBF2.)
If the design is overparameterized, the null space will be nontrivial. (I say "nontrivial" and not empty, because the beta of all zeros is always in the null space.) That means no beta is unique, because you can always add something from the null space and get the same predicted data back. If the original regression is
Y = X*beta + e
and we have one estimated beta, beta^,
Y^ = X*beta^
then if beta_0 is in the null space of X, (beta^ + beta_0) produces the same Y^, since X*(beta^ + beta_0) = X*beta^ + X*beta_0 = X*beta^.
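To make the non-uniqueness concrete, here is a minimal sketch in plain Python. The design matrix, the parameter values, and the helper name matvec are all made up for illustration: X has two group-indicator columns plus a constant column, so the two indicator columns add up to the constant column and (1, 1, -1) lies in N(X).

```python
# Hypothetical toy design (an assumption for illustration only):
# columns are [group 1 indicator, group 2 indicator, constant].
X = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [0, 1, 1],
]

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

beta_hat = [2.0, 5.0, 1.0]   # one possible estimate (illustrative numbers)
beta_0 = [1.0, 1.0, -1.0]    # a nonzero vector in N(X)

# Shift the estimate by the null-space vector.
shifted = [b + n for b, n in zip(beta_hat, beta_0)]

print(matvec(X, beta_0))     # X*beta_0: every entry is zero
print(matvec(X, beta_hat))   # fitted values from beta^
print(matvec(X, shifted))    # same fitted values from beta^ + beta_0
```

The two sets of fitted values agree even though the parameter vectors differ, which is exactly why beta^ cannot be unique for this X.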
How does that relate to contrasts? The contrast itself is c'*beta^ (c' is the transpose of a column of contrast weights, c). Even if beta^ isn't unique when the design is overparameterized, we insist that c'*beta^ be unique (for a given Y). What does that entail? Adding in _any_ beta_0 from the null space should give the same contrast:
c'*(beta^ + beta_0) = c'*beta^
Subtracting c'*beta^ from both sides shows that, for _any_ beta_0 in N(X),
c'*beta_0 = 0
Thus, a necessary and sufficient condition for a contrast to be estimable is that c is "in the orthogonal complement to N(X)." If you try a contrast other than those, then it's not well-defined, because there's an infinite number of allowed betas for a given Y, and c'*beta will depend on which beta is chosen.
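As a rough illustration of that condition, the sketch below checks c against a basis of N(X) in plain Python. Everything here is assumed for the example: the null-space basis is written down by hand for a toy design with two group-indicator columns plus a constant, and is_estimable is a hypothetical helper, not an SPM function.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def is_estimable(c, null_basis, tol=1e-12):
    """True if c is orthogonal to every basis vector of N(X)."""
    return all(abs(dot(c, n)) <= tol for n in null_basis)

# Assumed basis of N(X) for columns [group 1, group 2, constant].
null_basis = [[1.0, 1.0, -1.0]]

# Two parameter vectors that differ by the null vector, so they
# give identical fitted values for that design.
beta_a = [2.0, 5.0, 1.0]
beta_b = [3.0, 6.0, 0.0]

c_diff = [1.0, -1.0, 0.0]   # group 1 minus group 2
c_g1 = [1.0, 0.0, 0.0]      # group 1 parameter alone

# Estimable: the contrast value does not depend on which beta is used.
print(is_estimable(c_diff, null_basis), dot(c_diff, beta_a), dot(c_diff, beta_b))
# Not estimable: the value changes with the choice of beta.
print(is_estimable(c_g1, null_basis), dot(c_g1, beta_a), dot(c_g1, beta_b))
```

For a real design you would compute the null-space basis numerically rather than by hand; the hand-written basis just keeps the example short.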
Why does sum(c) = 0 come up so much? If 1 denotes the column vector of all 1's, then sum(c) = 0 if and only if c'*1 = 0. So the estimability condition reduces to sum(c) = 0 exactly when N(X) consists of the constant vectors (the multiples of 1). What kind of designs are those? Designs for which X*1 = 0, i.e., the columns of X add up to the zero vector. Chapter 7, section 2.5.3, of HBF2 implies this comes up in ANOVA factorial designs.
"But what does SPM do if the answer is 'no'? Notify the user in some way? Give the user a chance to try again? Offer some sort of guidance in specifying a contrast that is proper? Automatically 'repair' the invalid contrast in some way (with or without notifying the user of the fact)?"
I'm not quite sure about doing things in SPM8 batch mode, but in earlier versions of SPM in interactive mode, it wouldn't let you create the contrast, and a warning would be placed into a textbox near the bottom of the contrast manager.