Dear Colleagues,
The main problem with lossy compression that suppresses weak
spots is that those spots may be a tip-off to a misidentified
symmetry, so you may wish to keep a faithful copy of the
original diffraction image until you are very certain of having
the symmetry right.
That being said, such a large compression ratio sounds very useful,
and I would be happy to add it as an option in CBFlib for people
to play with, once the code is reasonably stable and available
and is not tied up in patents or licenses that conflict
with the LGPL.
Regards,
Herbert
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
=====================================================
On Sun, 9 May 2010, James Holton wrote:
> Frank von Delft wrote:
>> Just looked at the algorithm and how it stores the average "non-spot"
>> through all the images.
>>
>> What happens with a dataset where the "non-spot" (e.g. background) changes
>> systematically through the dataset, e.g. anisotropic datasets or thin
>> crystals lying flat in a thin loop? How much worse is the compression in
>> that case?
>> Cheers
>> phx
> Well, what will happen in that case (with the current "algorithm") is that
> once a background pixel deviates from the median level by more than 4
> "sigmas", it will start to get stored losslessly. Essentially, such pixels
> will be treated as "spots", and the overall compression ratio will start to
> approach that of bzip2.
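The thresholding behaviour James describes could be sketched along the
following lines. This is a hypothetical illustration in numpy, not the actual
compression code; the function name and the MAD-based sigma estimate are my
assumptions about how a per-pixel "sigma" might be obtained:

```python
import numpy as np

def split_spots_from_background(images, nsigma=4.0):
    """Hypothetical sketch: split a stack of diffraction images into a
    shared median background plus a sparse mask of "spot" pixels that
    deviate from that background by more than nsigma."""
    stack = np.asarray(images, dtype=float)   # shape (n_images, ny, nx)
    median = np.median(stack, axis=0)         # per-pixel median background
    # crude per-pixel noise estimate from the median absolute deviation
    mad = np.median(np.abs(stack - median), axis=0)
    sigma = 1.4826 * mad + 1e-9               # MAD -> sigma for Gaussian noise
    spots = np.abs(stack - median) > nsigma * sigma
    # the median image compresses once for the whole data set; only the
    # pixels flagged in `spots` would need lossless storage per image
    return median, spots

# toy usage: flat background of ~10 counts with one bright "spot"
imgs = np.full((5, 8, 8), 10.0)
imgs += np.random.default_rng(0).normal(0, 1, imgs.shape)
imgs[2, 4, 4] += 100.0
med, spots = split_spots_from_background(imgs)
print(spots.sum())  # a handful of pixels flagged for lossless storage
```

In this sketch a systematic background drift would push more and more
background pixels past the threshold, which is the failure mode Frank asks
about.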
>
> A "workaround" for this is simply to store the data set in "chunks" where the
> background level is similar, but I suppose a more intelligent thing to do
> would be to simply "scale" each image to the median background image, and
> store the scale factors (a list of 100 numbers for a 100-image data set)
> along with the other ancillary data. I haven't done that yet. Didn't want
> to spend too much time on this in case I incited some kind of revolt.
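The proposed per-image scaling could look something like this. Again a
hypothetical numpy sketch; the least-squares scale estimate is my assumption
about what "scale each image to the median background image" would mean in
practice:

```python
import numpy as np

def scale_to_median_background(images, spot_mask=None):
    """Hypothetical sketch: rather than storing background drift
    losslessly, fit one scale factor per image against the median
    background image and store only that list of scale factors."""
    stack = np.asarray(images, dtype=float)
    median = np.median(stack, axis=0)
    # only background pixels should drive the fit
    bg = np.ones(median.shape, dtype=bool) if spot_mask is None else ~spot_mask
    # least-squares k_i minimizing ||image_i - k_i * median|| over background
    scales = np.array([
        (img[bg] * median[bg]).sum() / (median[bg] ** 2).sum()
        for img in stack
    ])
    return median, scales

# toy usage: background ramps smoothly from 1.0x to 1.5x across the data set
base = np.full((8, 8), 20.0)
imgs = np.stack([base * k for k in np.linspace(1.0, 1.5, 100)])
med, scales = scale_to_median_background(imgs)
print(scales[0], scales[-1])  # 0.8 and 1.2 relative to the median image
```

For a 100-image data set this stores just 100 scale factors alongside the
median image, instead of flagging the drifting background as "spots".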
>
> -James Holton
> MAD Scientist
>
>
>>
>>
>> On 07/05/2010 06:07, James Holton wrote:
>>> Ian Tickle wrote:
>>>> I found an old e-mail from James Holton where he suggested lossy
>>>> compression for diffraction images (as long as it didn't change the
>>>> F's significantly!) - I'm not sure whether anything came of that!
>>>
>>> Well, yes, something did come of this.... But I don't think Gerard
>>> Bricogne is going to like it.
>>>
>>> Details are here:
>>> http://bl831.als.lbl.gov/~jamesh/lossy_compression/
>>>
>>> Short version is that I found a way to compress a test lysozyme dataset by
>>> a factor of ~33 with no apparent ill effects on the data. In fact,
>>> anomalous differences were completely unaffected, and Rfree dropped from
>>> 0.287 for the original data to 0.275 when refined against Fs from the
>>> compressed images. This is no doubt a fluke of the excess noise added by
>>> compression, but I think it highlights how the errors in crystallography
>>> are dominated by the inadequacies of the electron density models we use,
>>> and not the quality of our data.
>>>
>>> The page above lists two data sets: "A" and "B", and I am interested to
>>> know if and how anyone can "tell" which one of these data sets was
>>> compressed. The first image of each data set can be found here:
>>> http://bl831.als.lbl.gov/~jamesh/lossy_compression/firstimage.tar.bz2
>>>
>>> -James Holton
>>> MAD Scientist
>