> I can't find any more recent work on this. Is this a common method?
Well, the concept of event-related designs is indeed a rather old one. The articles in the NeuroImage issue "20 YEARS OF fMRI" by Clark (2012), Courtney (2012), Huettel (2012), Liu (2012), Petersen & Dubis (2012) http://www.sciencedirect.com/science/journal/10538119/62/2 might give a good overview. Since then several additional articles have been published, e.g. Kao et al. (2014, World J Radiol, https://dx.doi.org/10.4329/wjr.v6.i7.437 ), Maus et al. (2012, Hum Brain Mapp, https://dx.doi.org/10.1002/hbm.21289 ).
> Geometric or exponential distributions of ITI
There are always two parts, the methodological-statistical perspective (efficient design) and psychological aspects. An efficient design won't be of any help
1) if it interferes badly with the subject.
1a) The long(er) intervals of the jitter are often implemented via "null trials" with no stimuli or just a fixation cross. If null trials are presented infrequently, subjects might think they've missed a stimulus, turning the null trial into something accompanied by cognitive processes. E.g. Busse & Woldorff (2003, Neuroimage, https://dx.doi.org/10.1016/S1053-8119(03)00012-0 ) suggest using something like 25 - 33% null trials, although this relates to fast event-related designs. Given your design, violation of expectation should be less of an issue.
1b) For some tasks you might prefer fixed intervals, even if this makes the design less efficient (think of a working memory task with another stimulus every few seconds: adding jitter likely increases attentional demands). You could still optimize the design by just varying the number of repetitions of trials of the same type. However, this doesn't make sense in your case for differentiating between cue and outcome, as one cue is always followed by one outcome. Actually there might be an additional bias in your case, as one cue seems to be associated three times as frequently with one of the outcomes compared to the other.
2) if the predictors are bad.
2a) Findings in fast event-related designs don't necessarily transfer to designs in which you want to separate trial components, especially if this means sequential dependence.
2b) If cue and outcome were completely unrelated, it does make sense to rely on jitter and go with separate regressors based on fixed durations, thus e.g. modelling the visual input for each. But for cues this is quite unlikely (or, if it were the case, i.e. the cue resulted only in a brief and then decaying activation, it would probably be an ineffective cue). In other words, you might have predictors which can be separated nicely, but possibly they are very bad predictors for the neural processes you want to look at. There's a review by Ruge et al. (2013, Hum Brain Mapp, https://dx.doi.org/10.1002/hbm.21420 ) on that issue.
3) if you analyze the data differently from how the experiment was designed. Maybe you want to go with a duration based on reaction times, which would introduce lots of trial-to-trial variance to the outcome (which usually cannot be predicted in advance) but not to the cue.
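To make the null-trial proportion from 1a) a bit more concrete, here is a minimal sketch of a shuffled sequence with 30% null trials. The function names, the 2 s SOA and the 30% value are just assumptions for illustration, not taken from any of the papers above. Randomly interspersed null trials give gaps between stimuli that are multiples of the SOA, with short gaps most frequent (roughly geometric).

```python
import numpy as np

def make_sequence(n_events, null_fraction=0.3, seed=0):
    """Shuffled sequence of 'stim' and 'null' trials (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_null = int(round(null_fraction * n_events))
    labels = np.array(["stim"] * (n_events - n_null) + ["null"] * n_null)
    rng.shuffle(labels)  # in-place random order
    return labels

def stimulus_gaps(labels, soa=2.0):
    """Onset-to-onset gaps (in seconds) between successive real stimuli."""
    stim_idx = np.flatnonzero(labels == "stim")
    return np.diff(stim_idx) * soa

seq = make_sequence(200, null_fraction=0.3)
gaps = stimulus_gaps(seq, soa=2.0)  # assumed SOA of 2 s

print("proportion of null trials:", np.mean(seq == "null"))
values, counts = np.unique(gaps, return_counts=True)
for v, c in zip(values, counts):
    print(f"gap {v:4.1f} s: {c} times")
```

Whether 30% is appropriate for a given task is a psychological question as much as a statistical one, so treat the number as a starting point only.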
In fast event-related designs the condition predictors will never/hardly ever return to baseline, thus you will not be able to properly estimate activation levels for conditions A, B relative to baseline. Even null events can be misleading if there's still some ongoing activation (e.g. see Stark & Squire, 2001, PNAS https://dx.doi.org/10.1073/pnas.221462998 ). But this should not affect differential contrasts like A - B. In your case it might be a drawback, as the contrast "cue - outcome" is probably not that relevant, while outcome activations as such (vs. baseline) might be interesting. However, you still have the interval between the trials, so you should be on the safe side as long as the interval doesn't become too short.
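The point about baseline vs. differential contrasts can be illustrated with a rough sketch. It assumes a fast design with one stimulus every 2 s, no null events, TR = SOA = 2 s, a simple double-gamma HRF (shape values 6 and 16 are my assumption) and the commonly used efficiency measure 1 / (c (X'X)^-1 c'). All function names are made up for this example.

```python
import math
import numpy as np

def hrf(tr=2.0, length=32.0):
    """Simple double-gamma HRF sampled every `tr` seconds (assumed shape)."""
    t = np.arange(0.0, length, tr)
    peak = t**5 * np.exp(-t) / math.gamma(6)
    undershoot = t**15 * np.exp(-t) / math.gamma(16)
    h = peak - undershoot / 6.0
    return h / h.sum()

def design_matrix(labels, tr=2.0):
    """Columns: convolved regressor for A, for B, and a constant."""
    labels = np.asarray(labels)
    h = hrf(tr)
    columns = []
    for cond in ("A", "B"):
        sticks = (labels == cond).astype(float)  # one event per scan
        columns.append(np.convolve(sticks, h)[: labels.size])
    columns.append(np.ones(labels.size))
    return np.column_stack(columns)

def efficiency(X, contrast):
    """Efficiency 1 / (c (X'X)^-1 c') for a single contrast vector."""
    c = np.asarray(contrast, dtype=float)
    return 1.0 / (c @ np.linalg.pinv(X.T @ X) @ c)

rng = np.random.default_rng(0)
labels = rng.permutation(["A"] * 100 + ["B"] * 100)  # no null events
X = design_matrix(labels)

eff_diff = efficiency(X, [1, -1, 0])  # A - B
eff_base = efficiency(X, [1, 0, 0])   # A vs. implicit baseline
print("efficiency A - B       :", eff_diff)
print("efficiency A vs. base. :", eff_base)
```

In this setup the A and B regressors add up to an almost constant signal, so A vs. baseline is hard to tell apart from the constant term, while A - B is not affected. The actual numbers depend on the HRF and timing assumed here.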
Hope you'll find some more useful information in these papers.