Dear Angela,

I have two variables of interest that are potentially confounded.  One is
RT and the other is an attentional variable.  That is, the attentional
areas might be expected to light up either because the attentional variable
requires subjects to attend more, or because they're attending longer for
whatever reason.  Now it turns out that when the attentional variable
requires them to attend more, the RT is on average slower as well.

Hence the potential confound.

What I tried first was to enter two regressors, the first for RT, and the
second for the attention variable.  When you look at the orthogonality of
the DM, the correlation coefficient is around 0.1, so not too high.  And I
see the same set of areas lighting up for both contrasts.  Should I
be worried?

No; this is perfectly OK and suggests that there are attentional effects that
cannot be explained by RT and RT effects that cannot be explained by
attention. Interestingly these effects co-localize in the areas that are
common to both contrasts.  Using both explanatory variables in
the same linear model means that any test for the effect of one (in
the presence of the other) is testing only for effects that can be explained
uniquely.
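
To make the "explained uniquely" point concrete, here is a minimal sketch in plain Python (not SPM code; the trial values, effect sizes and variable names are all made up for illustration). It shows that the estimate for attention in a model containing both variables equals the estimate obtained after removing from attention everything that RT can explain.

```python
import random

def centre(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

random.seed(1)
n = 200
# Hypothetical, mildly correlated trial-wise variables (mean-centred).
rt = centre([random.gauss(0, 1) for _ in range(n)])
attention = centre([0.3 * r + random.gauss(0, 1) for r in rt])
# Made-up response with contributions from both variables plus noise.
y = centre([1.0 * r + 0.5 * a + random.gauss(0, 1)
            for r, a in zip(rt, attention)])

# Joint model: solve the 2x2 normal equations for [rt, attention].
a11, a12, a22 = dot(rt, rt), dot(rt, attention), dot(attention, attention)
b1, b2 = dot(rt, y), dot(attention, y)
det = a11 * a22 - a12 * a12
beta_attention_joint = (a11 * b2 - a12 * b1) / det

# Unique part of attention: remove what RT can explain, then regress on it.
attention_unique = [a - r * a12 / a11 for a, r in zip(attention, rt)]
beta_attention_unique = (dot(attention_unique, y)
                         / dot(attention_unique, attention_unique))

print(beta_attention_joint, beta_attention_unique)
```

The two printed numbers should agree to rounding error, and the same holds for RT with the roles swapped.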

I am assuming here that you specified one event-type and used two
regressors, RT and attention, as parametric modulators in a first-level design.
SPM will orthogonalise RT w.r.t. attention.  This means the results for
attention will cover any redundancy between the two variables.  The reason
you have a small correlation between the regressors is that the orthogonalisation
is implemented before convolution with the HRF.
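
A rough sketch of why a small correlation can survive, in plain Python. This is an illustration only: the gamma-shaped response is a toy stand-in rather than SPM's canonical HRF, and the onsets, values and names are invented. The modulators are exactly orthogonal trial by trial, but once each trial's response is convolved and overlaps with its neighbours, the resulting regressors are generally no longer exactly orthogonal.

```python
import math
import random

def toy_hrf(dt, length=32.0):
    """Gamma-shaped toy response peaking near 5 s (not SPM's canonical HRF)."""
    h = [(i * dt) ** 5 * math.exp(-i * dt) for i in range(int(length / dt))]
    total = sum(h)
    return [v / total for v in h]

def centre(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def corr(a, b):
    a, b = centre(a), centre(b)
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def convolved_regressor(onsets, weights, dt, hrf):
    """Place a weighted stick at each onset and convolve with the HRF."""
    n = int((onsets[-1] + 40.0) / dt)
    out = [0.0] * n
    for onset, w in zip(onsets, weights):
        start = round(onset / dt)
        for k, h in enumerate(hrf):
            if start + k < n:
                out[start + k] += w * h
    return out

random.seed(0)
dt = 0.1
onsets, t = [], 0.0
for _ in range(40):                      # hypothetical jittered onsets (s)
    t += random.uniform(2.0, 6.0)
    onsets.append(t)

attention = centre([random.gauss(0, 1) for _ in onsets])
rt = centre([0.5 * a + random.gauss(0, 1) for a in attention])

# Orthogonalise RT with respect to attention at the trial level.
scale = dot(attention, rt) / dot(attention, attention)
rt_orth = [r - scale * a for r, a in zip(rt, attention)]

hrf = toy_hrf(dt)
r_trials = corr(attention, rt_orth)
r_convolved = corr(convolved_regressor(onsets, attention, dt, hrf),
                   convolved_regressor(onsets, rt_orth, dt, hrf))
print(f"trial level: {r_trials:.3f}, after convolution: {r_convolved:.3f}")
```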

Well, I also tried doing it a different way, which is to make the basic
variable (target onset) have a duration of RT for that trial instead of
0, and then use only one regressor for the attention variable, and there again
I see the attentional areas light up for the attention variable contrast.
These two methods should be fairly similar, right?

Yes, they should give very similar results because parametric
modulation of a fixed event-related response by RT is almost identical
to convolving a variable duration response.  This is because the RT is
small in relation to the length of the hemodynamic response function.
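
As a quick numerical check of this, the following plain-Python sketch compares the two codings for a single event. It assumes a toy gamma-shaped response in place of SPM's canonical HRF, and the RT values (0.5 s and 1.0 s) and function names are hypothetical.

```python
import math

DT = 0.1  # seconds per sample

def toy_hrf(length=32.0):
    """Gamma-shaped toy response peaking near 5 s (not SPM's canonical HRF)."""
    h = [(i * DT) ** 5 * math.exp(-i * DT) for i in range(int(length / DT))]
    total = sum(h)
    return [v / total for v in h]

def convolve(signal, kernel):
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        if s:
            for k, h in enumerate(kernel):
                out[i + k] += s * h
    return out

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    a, b = [v - ma for v in a], [v - mb for v in b]
    num = sum(u * v for u, v in zip(a, b))
    return num / math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))

hrf = toy_hrf()
n = 100  # 10 s of stimulus time

def variable_duration(rt):
    """Boxcar lasting `rt` seconds, convolved with the HRF."""
    box = [1.0 if i < round(rt / DT) else 0.0 for i in range(n)]
    return convolve(box, hrf)

def modulated_event(rt):
    """Zero-duration event whose height is scaled by `rt`."""
    stick = [rt / DT] + [0.0] * (n - 1)
    return convolve(stick, hrf)

# Same response shape under both codings?
shape_match = corr(variable_duration(1.0), modulated_event(1.0))
# Does doubling the duration roughly double the response?
peak_ratio = max(variable_duration(1.0)) / max(variable_duration(0.5))
print(f"shape correlation: {shape_match:.3f}, peak ratio: {peak_ratio:.3f}")
```

With durations this short, the shape correlation should be close to 1 and the peak ratio close to 2. The approximation would be expected to degrade as durations approach the width of the response.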

However, if you modulated this variable-duration stimulus function with
attention, you are effectively modeling the interaction between RT
and attention.  Because the RT was not mean corrected (when specifying
the durations) this interaction will contain the main effect of attention.
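
The reasoning is that RT = mean(RT) + deviation, so RT x attention = mean(RT) x attention + deviation x attention, and the first term is just a scaled copy of attention. A small plain-Python sketch with invented numbers (RT around 0.8 s, independent of attention purely to keep the example simple):

```python
import math
import random

def centre(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def corr(a, b):
    a, b = centre(a), centre(b)
    num = sum(u * v for u, v in zip(a, b))
    return num / math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))

random.seed(3)
n = 500
attention = [random.gauss(0, 1) for _ in range(n)]
rt = [0.8 + 0.15 * random.gauss(0, 1) for _ in range(n)]  # hypothetical, in s

# Interaction built from raw RT (as when RT is used as the duration).
raw_product = [r * a for r, a in zip(rt, attention)]
# Interaction built from mean-corrected RT.
centred_product = [r * a for r, a in zip(centre(rt), attention)]

r_raw = corr(raw_product, attention)
r_centred = corr(centred_product, attention)
print(f"raw RT: {r_raw:.3f}, mean-corrected RT: {r_centred:.3f}")
```

With raw RT the product is almost indistinguishable from attention itself. With mean-corrected RT the correlation with attention should be much smaller.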

Then I tried a third way (just to convince myself), which is to swap the
order of the regressors (now with the attentional variable first, and RT
second), and now although I still get the same general areas lighting up
for both, the p-values for both contrasts are much worse!  How bizarre.
Why should that have happened?  If the p-values got better for one, and
worse for the other, then I might be able to understand -- since
presumably the order of orthogonalization matters.  But how can they both
get worse??

This should not happen.  As you point out, the order can only affect the orthogonalization.
You can check this by performing an F-test across both effects.  You should get
the same results, irrespective of the order.  I am guessing there is some other
error in model specification that has crept in here.
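
The F-test check works because orthogonalising one regressor with respect to the other changes the individual columns but not the space they span together. A minimal plain-Python sketch at the trial level (made-up data and names; not SPM code):

```python
import random

def centre(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def orthogonalise(x, wrt):
    """Remove from x the part that wrt can explain."""
    scale = dot(wrt, x) / dot(wrt, wrt)
    return [u - scale * v for u, v in zip(x, wrt)]

def f_statistic(y, x1, x2):
    """F for both regressors against a mean-only model (inputs centred)."""
    a11, a12, a22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    b1, b2 = dot(x1, y), dot(x2, y)
    det = a11 * a22 - a12 * a12
    beta1 = (a22 * b1 - a12 * b2) / det
    beta2 = (a11 * b2 - a12 * b1) / det
    resid = [yi - beta1 * u - beta2 * v for yi, u, v in zip(y, x1, x2)]
    rss = dot(resid, resid)
    # 2 regressors tested; n - 3 residual degrees of freedom (mean + 2).
    return ((dot(y, y) - rss) / 2) / (rss / (len(y) - 3))

random.seed(2)
n = 200
rt = centre([random.gauss(0, 1) for _ in range(n)])
attention = centre([0.3 * r + random.gauss(0, 1) for r in rt])
y = centre([1.0 * r + 0.5 * a + random.gauss(0, 1)
            for r, a in zip(rt, attention)])

# Order 1: RT first, attention orthogonalised with respect to RT.
f_rt_first = f_statistic(y, rt, orthogonalise(attention, rt))
# Order 2: attention first, RT orthogonalised with respect to attention.
f_attention_first = f_statistic(y, attention, orthogonalise(rt, attention))
print(f_rt_first, f_attention_first)
```

The two F values should agree to rounding error. Because convolution is linear, the same argument carries over to the convolved regressors.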

I would use the first analysis because it is the simplest.

I hope this helps - Karl