Thanks for pointing to this paper. I had already come across it when searching for literature on this issue but forgot to mention it in the thread. There are several critical issues:
1) The high-pass filtered data is linearly detrended first to avoid wrap-around effects (this is not done in SPM). This can be assumed to introduce a bias, since the other strategies do not include an initial linear detrending.
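For illustration, the wrap-around issue arises because an FFT-based filter treats the time series as periodic: a linear trend creates a jump between the last and the first sample, and the resulting ringing spreads through the whole filtered record. A minimal numpy sketch of the detrend-then-filter order (the paper's actual filter implementation is not specified here, so the FFT filter is my assumption):

```python
import numpy as np

def fft_highpass(y, tr, cutoff_hz):
    """High-pass filter by zeroing all frequency bins below the cutoff."""
    Y = np.fft.rfft(y)
    freqs = np.fft.rfftfreq(len(y), d=tr)
    Y[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(Y, n=len(y))

def detrend_then_filter(y, tr, cutoff_hz):
    """Remove the linear trend first so the start/end discontinuity
    does not wrap around into the filtered series."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return fft_highpass(y - (slope * t + intercept), tr, cutoff_hz)
```

On a pure linear trend, detrend_then_filter returns essentially zero everywhere, whereas fft_highpass alone leaves large ringing near the edges of the run.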
2) The cut-off value of 1/60 Hz for the HPF is more aggressive than the other strategies. It removes low frequencies that their FOURIER and POLY models can't really account for. If their HPF reflects the implemented version in SPM and we set up a model based on a duration of 16.5 min, this would result in a DCT with ~33 functions (up to 16.5 cycles).
3) They state that their POLY strategy outperforms an HPF of 1/500 Hz by 2.8% (the implementation in SPM would result in a DCT with 4 functions, thus matching the no. of polynomial regressors). Now, how much of this advantage is due to possibly adverse effects of the initial detrending?
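For reference, the DCT basis sizes quoted in points 2 and 3 can be checked against SPM's spm_filter convention, where the total number of DCT columns is fix(2*T/cutoff + 1) and the first (constant) column is dropped from the drift set. A minimal sketch assuming that formula:

```python
def spm_dct_size(duration_s, cutoff_s):
    """Total DCT columns per SPM's spm_filter: fix(2*T/cutoff + 1).
    This count includes the constant column; spm_filter drops it,
    so the number of drift regressors is this value minus one."""
    return int(2.0 * duration_s / cutoff_s + 1)

T = 16.5 * 60  # 16.5-minute run, in seconds

n60 = spm_dct_size(T, 60)    # 1/60 Hz cutoff: 34 columns, i.e. 33 drift terms
n500 = spm_dct_size(T, 500)  # 1/500 Hz cutoff: 4 columns
print(n60, n500)
```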
4) The HPF is applied to the data during preprocessing, but what about the predictors / the FIR model? Looking at Fig. 8, the amplitude is lower for the filtered data. An HPF applied to the FIR predictors would result in temporal blurring and lower peak values. If the predictors are not filtered, the resulting estimates would underestimate the peak amplitude relative to the other models, as the predictors would be "too high". This indeed seems to be the case; see page 146: "On the filtered data, we fit the simple version of the FIR model in which nuisance matrix S consists of a constant term." Thus no detrending and no HPF were applied to the design matrix. This might also explain the negative estimates for the yellow events before event onset.
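The shrinkage from filtering the data but not the design can be demonstrated with a toy simulation (all numbers hypothetical; a DCT projection stands in for SPM's HPF, and a simple boxcar regressor stands in for the FIR set):

```python
import numpy as np

rng = np.random.default_rng(0)
nscan, TR = 990, 1.0           # 16.5 min at TR = 1 s (hypothetical)
t = np.arange(nscan) * TR

# DCT high-pass filter at 1/60 Hz, implemented as "regress out
# low-frequency cosines" (the same idea as SPM's spm_filter).
n_cols = int(2 * nscan * TR / 60 + 1)
k = np.arange(n_cols)
X0 = np.cos(np.pi * np.outer(2 * np.arange(nscan) + 1, k) / (2 * nscan))

def hpf(y):
    return y - X0 @ np.linalg.lstsq(X0, y, rcond=None)[0]

# Toy boxcar regressor (100 s period) and data with drift; true amplitude = 2
x = ((t % 100) < 20).astype(float)
y = 2.0 * x + 0.01 * t + rng.normal(0.0, 0.5, nscan)

def amp(xr, yr):
    X = np.column_stack([xr, np.ones(nscan)])
    return np.linalg.lstsq(X, yr, rcond=None)[0][0]

b_both = amp(hpf(x), hpf(y))   # filter data AND predictor: ~unbiased
b_mismatch = amp(x, hpf(y))    # filter data only: amplitude shrinks
print(b_both, b_mismatch)
```

Because part of the unfiltered predictor's variance lies below the cutoff, it no longer matches the filtered data, and the amplitude estimate is pulled toward zero.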
In summary, a re-analysis would definitely be useful.