In my humble opinion there's not much you can do about this. (I've seen the same issue in my own lab.)
You _can_ meaningfully distinguish between A1 and A2, for example. But it's really not possible to distinguish between A and A1, given the levels of noise in fMRI and the amount of temporal "smearing" created by brain hemodynamics.
If there's a very long time gap between the two phases of the trial (in your case, between looking and reacting), then it might not be an issue. But it would have to be pretty long (a few seconds).
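To make the "smearing" point concrete, here's a rough sketch (not a real design-efficiency analysis) that builds one regressor for the looking phase and one for the reacting phase, convolves both with a response function, and reports how correlated they are for different gaps. Everything in it is an assumption on my part: the double-gamma shape in `hrf`, the 20 s trial spacing, the 0.5 s sampling step, and all the function names are made up for illustration.

```python
import math

def hrf(t):
    """Double-gamma response (assumed shape: peak near 5 s, undershoot near 15 s)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0

def regressor(onsets, n_samples, dt, hrf_len=32.0):
    """Sum one HRF-shaped response per brief event at `onsets` (in seconds)."""
    kernel = [hrf(i * dt) for i in range(int(hrf_len / dt))]
    out = [0.0] * n_samples
    for onset in onsets:
        start = int(round(onset / dt))
        for k, h in enumerate(kernel):
            if start + k < n_samples:
                out[start + k] += h
    return out

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def phase_correlation(gap, n_trials=20, iti=20.0, dt=0.5):
    """Correlation between 'look' and 'react' regressors for a fixed gap (s)."""
    n_samples = int((n_trials * iti + 32.0) / dt)
    look = [i * iti for i in range(n_trials)]   # hypothetical trial onsets
    react = [t + gap for t in look]             # second phase follows by `gap`
    return pearson(regressor(look, n_samples, dt),
                   regressor(react, n_samples, dt))

for gap in (1.0, 2.0, 4.0, 8.0):
    print(f"gap = {gap:4.1f} s   r = {phase_correlation(gap):+.2f}")
```

Under these assumptions the two regressors should be almost collinear when the gap is around a second, and the correlation should drop as the gap grows. The exact numbers depend heavily on the assumed response shape and trial timing, so treat them as illustrative only.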
One solution I've seen in the literature is to have "incomplete" trials, i.e., a random subset of trials that have a first phase but not a second phase. E.g., looking A isn't followed by any instruction for reacting (subjects get a cue to not react). But I don't think that's a good solution either, because by not having a second phase you're inducing a new brain state, and the hemodynamics of that new state will "contaminate" the first phase's hemodynamics. So you won't be able to claim these trials are "pure A, followed by no reaction"; they're really "A followed by a third type of reaction."
So at least with fMRI I think you're stuck.