Excellent post, if I may say so!
A comment or two:
>>2. The current MTZ file format, both unmerged and merged, has no slot for these error estimates, and despite some moans from me and others over the years, nobody really wants to change the format as it might break lots of things. However, DIALS has a pretty complete (and extensible) model of the diffraction experiment and results, including time-dependent cell dimensions if required, and there is at least preliminary work in representing this model in a Nexus/HDF5 file as a replacement for the unmerged MTZ file, and I think this should be adopted for downstream use of unmerged data. Use of unmerged data in refinement would allow proper allowance for time(dose)-dependent structural changes, partly overlapped multiple lattices and twin fractions which vary with crystal rotation.
The use of unmerged data would be really, really excellent for many things! I would push further, however, maybe even skipping refinement on unmerged data altogether: we should try refining against *images*! The hardware is ready and willing now, I think. Imagine: indexing, integration, scaling and merging could all be tuned directly while refining the model. Compared to the number of parameters in a protein structure model, the number of extra parameters would be relatively small (they are there anyway, but somehow off the radar in the way we currently do things). And of course, further crystal properties like mosaicity, and the diffuse scattering between the spots, could be thrown in later for good measure. Refinement would again take long enough to get a cup of coffee! Wouldn't it be awesome? (Not just the coffee part).
>>4. In the past when the wavelength of some synchrotron beamlines was less reliably known than it is now, I did try refining a multiplier on the cell lengths as a proxy for wavelength, in a script, and observed a very flat minimum in Rfree, trading X-ray fit for geometric fit. This suggested to me that with care it might be possible to refine at least one cell parameter, but that refining 6 parameters of a triclinic cell is likely to be dangerous, except possibly at very high resolution. However at very high resolution the cell dimensions from data processing are likely to be more accurate anyway.
Maybe the trade-off between X-ray fit and geometry was in fact due to errors in the wavelength. Perhaps what could be done in cases like this is to adjust the wavelength until there is no longer any trade-off between geometry and X-ray fit?
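To spell out why a wavelength error should look like a uniform multiplier on the cell lengths: by Bragg's law, d = lambda / (2 sin(theta)), so at fixed diffraction angles every d-spacing scales linearly with the assumed wavelength. Here is a minimal sketch of that arithmetic in Python. The function names are mine and purely illustrative, and it assumes the detector geometry (and hence theta) is otherwise accurate, which may well not hold in practice.

```python
import math


def d_spacing(wavelength, two_theta_deg):
    """Bragg's law: d = lambda / (2 sin(theta)), with theta = two_theta / 2."""
    theta = math.radians(two_theta_deg) / 2.0
    return wavelength / (2.0 * math.sin(theta))


def apparent_cell_scale(true_wavelength, assumed_wavelength):
    """Factor by which all processed cell lengths are off if the wrong
    wavelength is assumed (angles fixed, so d scales with lambda)."""
    return assumed_wavelength / true_wavelength


def corrected_wavelength(assumed_wavelength, refined_cell_scale):
    """Wavelength that would bring a refined cell multiplier back to 1.

    refined_cell_scale is the (hypothetical) multiplier on the processed
    cell lengths that gave the best geometry during refinement.
    """
    return assumed_wavelength * refined_cell_scale


# Example: the beamline says 1.0020 A, but the true value is 1.0000 A.
scale = apparent_cell_scale(1.0000, 1.0020)       # cell comes out 0.2% too long
best_multiplier = 1.0 / scale                     # what refinement would prefer
print(corrected_wavelength(1.0020, best_multiplier))
```

So if refinement prefers a multiplier different from 1, multiplying the nominal wavelength by that factor is one way to remove the trade-off, under the assumptions above.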
Also, the danger in cell refinement could be averted by restraining the maximum change to something reasonable, or by using whatever restraints/algorithms the various processing packages already use.
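As a rough illustration of what "restraining the maximum change" could mean for a single cell multiplier, here is a sketch in Python: a golden-section search that is only allowed to look inside [1 - max_change, 1 + max_change]. Everything here is hypothetical. The target function stands in for whatever combined X-ray plus geometry score a refinement program would report, the 0.5% limit is an arbitrary choice, and this is not how any existing package does it.

```python
import math


def refine_cell_scale(target, max_change=0.005, tol=1e-6):
    """Minimise target(multiplier) with the multiplier confined to
    [1 - max_change, 1 + max_change], using golden-section search.

    Assumes target is unimodal on that interval. If the true minimum
    lies outside the interval, the search ends up at the nearest bound.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = 1.0 - max_change, 1.0 + max_change
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = target(c), target(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum is in [a, d]: shrink from the right.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = target(c)
        else:
            # Minimum is in [c, b]: shrink from the left.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = target(d)
    return 0.5 * (a + b)


# Toy targets (made up): quadratic wells centred on a preferred multiplier.
inside = refine_cell_scale(lambda m: (m - 1.002) ** 2)
outside = refine_cell_scale(lambda m: (m - 1.010) ** 2)
print(inside, outside)
```

In the second toy case the preferred multiplier lies outside the allowed range, so the result sits at the upper bound. That in itself would be a useful warning that something other than the cell is wrong.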
Thank you for the panoramic picture informed by years of experience,
Jacob