Andrzej, I had the same thought earlier this year.  There are several problems:

1) There are no good non-proprietary, truly cross-platform solutions yet.  CUDA is very specific to NVidia's hardware, obviously.  Apple claims to be adding generic support for GPU acceleration in their next OS release - they call it "OpenCL":

http://en.wikipedia.org/wiki/OpenCL

Good luck finding out anything more, but personally, I'm waiting until this is available before I invest any of my free time learning CUDA.

2) Most GPUs only deal with single-precision floating-point calculations.  I've seen many claims that this doesn't matter as much as people think, but it's still a huge barrier to acceptance.  (I believe this will change later this year in the high-end NVidia cards.)
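
To illustrate the concern (a toy numpy example, not taken from any crystallographic code): in single precision, a small term added to a large accumulator can vanish entirely, which is exactly the sort of thing that makes people nervous about long summations.

import numpy as np

big, small = np.float32(1.0e8), np.float32(1.0)
print((big + small) - big)      # 0.0 -- the small term is lost in single precision
print((1.0e8 + 1.0) - 1.0e8)    # 1.0 -- Python floats are double precision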

3) Like Tassos says, most crystallographic apps are far more complex than a pure MD simulation.  In many cases there's more to be gained from process-level parallelism (running Phaser on several dozen or several hundred search models, or trying to autobuild into ten different datasets collected remotely on a synchrotron beamline) than from calculation-level parallelism.  The actual run time for most of these processes is also orders of magnitude less than that of most MD simulations.
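
To make the distinction concrete: process-level parallelism is usually just launching many copies of an ordinary serial program at once, something like the rough sketch below (the "run_search.sh" script and the model list are made-up placeholders):

import subprocess
from multiprocessing import Pool

models = ["model_%02d.pdb" % i for i in range(48)]  # hypothetical independent inputs

def run_one(model):
    # each job is an ordinary serial program working on one model
    return subprocess.call(["./run_search.sh", model])

if __name__ == "__main__":
    pool = Pool(processes=8)            # one worker per core
    results = pool.map(run_one, models)
    pool.close()
    pool.join()
    print("finished %d independent jobs" % len(results))

No GPU code, no MPI - just as many independent processes as you have cores (or cluster nodes).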

 I am dubious
that averaged over a 5-year period the investment in specialized code
development and expensive specialized hardware was a net win.

In this case, the hardware is actually extremely cheap relative to the performance gain.  It doesn't even necessarily require additional specialized hardware, since the NVidia chip in the MacBook Pro is capable of running CUDA apps.  NVidia will happily sell you cards with graphical output disabled for running calculations, but this isn't a requirement.
 
A similar alternation has been seen historically in the tradeoff between
self-contained platforms and external computational resources accessed
by relatively dumb terminals.  For some years now computer use has
favored self-contained laptops or graphics workstations.  But now the
pendulum is swinging back again.

Really?  When you can buy an 8-core workstation with a powerful (albeit non-stereo) graphics card for as little as $2500?  And an external terabyte hard drive for $200?  Some labs could buy a couple of these and be totally satisfied.  Even the now-obsolete laptop I bought at the start of grad school would have been enough for my own crystallographic needs if it weren't for the tiny (40GB) hard drive.

This time around the external resource is distributed ("cloud computing")
rather than centralized, but in essence it's "déjà vu all over again".
Whether the cloud computing fad will extend to crystallography remains
to be seen.  Note that distributed ("cloud") data storage has been
seriously proposed as a possible solution to the problem of archiving raw
diffraction images.

In that case it's clearly a technological need.  But I think most of the movement towards distributed computing will be driven by logistics and finances rather than technology - massive numbers of CPUs are relatively cheap, but housing and administering them is not.