Dear Robert,
Thanks for sharing your test data.
It is interesting to see that the compression ratio is similar from uint8
to uint16. You did not list uint4 directly, but said that the compression
ratio would be similar between uint4 and uint8. Given this, there is
little gain from using uint4, and it is better to let DM take care of the
software gain reference. The 270-degree rotation and Y-flipping, as
suggested online, would be taken into account behind the scenes. We will
test these in our new system.
As to the dose range: from the measurements reported by Li et al. (2013),
the highest counting efficiency is ~0.97 at 5 electrons per physical
pixel (ppx) per second. This number goes down significantly at 10
e/ppx/s, but is still high (~0.94) at 7.5 e/ppx/s. It is interesting that
your data show that going below 5 e/ppx/s does not lift the signal any
further out of the noise. The missed events may come from genuine misses
or from two-electron coincidence events. Given the low error, there might
still be room to improve the counting efficiency toward 1.0.
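A simple Poisson coincidence model makes the dose-rate dependence concrete. This is a sketch only: the 400 fps internal counting rate of the K2 is an assumed parameter, and real detectors have additional loss mechanisms, which is why the measured efficiencies from Li et al. are lower than this model predicts.

```python
import math

FPS = 400.0  # assumed K2 internal counting frame rate

def counting_efficiency(dose_rate):
    """Fraction of electrons counted if multiple arrivals in the same
    pixel within one internal frame register as a single event
    (simple Poisson coincidence-loss model)."""
    lam = dose_rate / FPS  # mean electrons per pixel per internal frame
    return (1.0 - math.exp(-lam)) / lam

for rate in (4.0, 5.0, 7.5, 10.0):
    print(f"{rate:5.1f} e/ppx/s -> model efficiency {counting_efficiency(rate):.3f}")
```

The model shows the qualitative trend (efficiency falls as the dose rate rises) but only captures the coincidence part of the loss.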
Thanks again for sharing.
Qiu-Xing
On 3/16/16, 8:58 AM, "Robert McLeod" <[log in to unmask]> wrote:
>Dear Qiu-Xing,
>
>We don't round our data to integers. We let Gatan do the gain reference
>and don't try to keep track of it. This lessens the overhead in managing
>data a bit. We have both GMS3 and GMS2 though.
>
>Here are some compression numbers for a representative stack (using
>lbzip2):
>
>raw DM4: 3.41 GB
>compressed DM4: 1.45 GB
>compressed uint8: 306 MB
>compressed uint16: 316 MB
>compressed float32, rounded to 0.1 decimals: 647 MB
>compressed float32, rounded to 0.01 decimals: 989 MB
>
>As you can see, uint8 doesn't compress much better than uint16. I doubt
>uint4 would compress better than uint8 with most compression algorithms.
>uint4 isn't a machine data type; it's actually two numbers shoved into
>a uint8 by bit-shifting. Compression algorithms can do a better job of
>this packing themselves. I think I mentioned previously: we use lbzip2,
>pigz, or 7-zip for
>compressing data.
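The point that lossless compressors care about the value distribution rather than the container width can be checked with a quick sketch (simulated low-dose counts; the stdlib bz2 module stands in for lbzip2, and the 90/9/1 value mix is illustrative):

```python
import bz2
import random
from array import array

random.seed(0)
n = 1 << 20  # one megapixel of simulated counted data
# Low-dose counting data: mostly zeros, a few ones, rare twos
counts = random.choices([0, 1, 2], weights=[90, 9, 1], k=n)

raw_u8 = array("B", counts).tobytes()   # 1 byte per pixel
raw_u16 = array("H", counts).tobytes()  # 2 bytes per pixel, high byte always 0

c8 = len(bz2.compress(raw_u8))
c16 = len(bz2.compress(raw_u16))
print(f"uint8:  {len(raw_u8)} -> {c8} bytes")
print(f"uint16: {len(raw_u16)} -> {c16} bytes")
```

The compressed sizes come out close despite the raw uint16 buffer being twice as large, because the extra high bytes carry no information.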
>
>If code was written with the idea of applying the gain reference while
>the data was in the CPU cache then there might be a performance advantage
>to keeping the gain reference separate. However, Frealign, Relion, etc.
>convert integer data to floating-point on loading from disk, so I don't
>a significant advantage.
>
>The gain reference's 'gain factor' (literally the number of images used
>to build the counting gain reference) changes the permissible dose rate.
>For a gain factor of 100, a dose rate of 4 electrons per pixel per second
>is indistinguishable from Poisson noise, i.e. I can't detect correlated
>noise. Y. Cheng has recommended in manuscripts less than 8 electrons per
>pixel per second to avoid coincidence loss (double-counting problems).
>Personally, since I know the gain reference changes very quickly, I
>recommend a more conservative 4-5.
>
>For uniformly distributed error, rounding to 0.01 electrons should not
>negatively impact an image stack taken with GF100.
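For a uniform quantization step of 0.01 electron, the error introduced per pixel is bounded by half a step, 0.005 e. A minimal sketch (the `quantize` helper is illustrative, not an existing routine):

```python
def quantize(x, step=0.01):
    """Round a gain-normalized electron value to the nearest multiple
    of `step` (here 0.01 e); the rounding error is at most step / 2."""
    return round(x / step) * step

vals = [0.9713, 1.0342, 2.0005]
quantized = [quantize(v) for v in vals]
# Each value moves by at most half the quantization step
assert all(abs(a - b) <= 0.005 + 1e-12 for a, b in zip(vals, quantized))
print(quantized)
```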
>
>Robert
>
>--
>Robert McLeod, Ph.D.
>Center for Cellular Imaging and Nano Analytics (C-CINA)
>Biozentrum der Universität Basel
>Mattenstrasse 26, 4058 Basel
>Office: +41.061.387.3225
>[log in to unmask]
>[log in to unmask]
>[log in to unmask]
>
>________________________________________
>From: Jiang,Qiu-Xing [[log in to unmask]]
>Sent: Saturday, March 12, 2016 3:10 AM
>To: Robert McLeod; Collaborative Computational Project in Electron
>cryo-Microscopy
>Subject: Re: [ccpem] 16 bit vs 32 bit
>
>Dear Robert and others interested in this thread,
>During our recent installation of a new K2 summit, Gatan pointed us to
>the info at the serialEM site.
>http://bio3d.colorado.edu/SerialEM/hlp/html/about_camera.htm#directDetectors
>From the info online and what you explained, we are planning to collect
>super-resolution / dose-fractionation data in 40 frames per exposure,
>with 40 electrons/pixel spread over 8 seconds. The magnification we use
>will give 5-10 electrons per physical pixel per second. The packed
>unnormalized super-resolution data, which carry the hardware corrections
>(subtraction of the dark reference plus gain normalization), contain the
>raw electron counts. At this point the frame data are neither rotated
>nor flipped, each pixel is 4-bit, and the frames can be compressed with
>a good ratio (say, LZW TIFFs). The size of the output data per exposure
>will be ~1/10 of the ~9 GB delivered by the default operation. In
>post-processing, unpacking, frame rotation / Y-flipping, normalization
>with the software gain reference (16-bit), and correction of defects
>will yield the normal 16-bit data for analysis.
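The unpacking step in the plan above can be sketched as follows. The nibble order (low nibble first) is an assumption; check it against the vendor's packing convention before relying on it:

```python
def unpack_uint4(packed: bytes) -> list:
    """Split each byte of a packed 4-bit frame into two pixel values.
    Assumes the low nibble holds the first pixel of each pair."""
    out = []
    for b in packed:
        out.append(b & 0x0F)  # low nibble  -> first pixel
        out.append(b >> 4)    # high nibble -> second pixel
    return out

# 0x21 packs pixel values 1 (low nibble) and 2 (high nibble)
print(unpack_uint4(bytes([0x21, 0x0F])))  # [1, 2, 15, 0]
```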
>As all of those I have talked to are using their K2 Summit in counting
>mode, I wonder if anyone on this broad list is doing similar operations
>and can help us verify the sequence of these steps before our
>implementation.
>
>Thanks.
>
>Qiu-Xing
>
>
>
>
>From: Collaborative Computational Project in Electron cryo-Microscopy
><[log in to unmask]<mailto:[log in to unmask]>> on behalf of Robert
>McLeod <[log in to unmask]<mailto:[log in to unmask]>>
>Reply-To: Robert McLeod
><[log in to unmask]<mailto:[log in to unmask]>>
>Date: Thursday, March 10, 2016 at 3:26 AM
>To: Collaborative Computational Project in Electron cryo-Microscopy
><[log in to unmask]<mailto:[log in to unmask]>>
>Subject: Re: [ccpem] 16 bit vs 32 bit
>
>All,
>
>Gatan outputs the data as 32-bit floating point because it applies two
>gain references, one before counting and one after. The second gain
>reference scales each counted electron by a per-pixel factor (over a
>range of about 0.9-1.1).
>
>The process as I understand it is as follows:
>
>1.) Apply hardware dark reference subtraction
>2.) Apply hardware gain reference normalization (this also has a hot/dead
>pixel mask step)
>3.) Count (threshold) electrons (with center of mass fitting)
>4.) Integrate fast-frames to desired exposure time (uint16)
>5.) Move data to Gatan PC
>6.) Apply (super-resolution) gain reference (float32)
>
>If you round your data to the nearest integer, you must keep the software
>gain reference and apply it later before any processing. The .m2
>reference is the right one for 4k, and .m3 for 8k. Otherwise you will
>have substantial correlated noise in your summed stacks.
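A minimal sketch of the retained-gain-reference approach described above (the values are illustrative only; the real reference is a per-pixel float32 image produced by Gatan):

```python
from array import array

# Integer electron counts as stored on disk (uint16), plus the retained
# software gain reference (per-pixel factors, roughly 0.9-1.1)
counts = array("H", [3, 0, 2, 1])
gain_ref = [1.02, 0.95, 1.08, 0.99]  # illustrative values only

# Apply the software gain reference before any further processing,
# which avoids correlated noise in the summed stacks
normalized = [c * g for c, g in zip(counts, gain_ref)]
print(normalized)
```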
>
>FYI, there's no difference between uint-4, uint-8, uint-16, or uint-32
>after it's been compressed losslessly (e.g. by pigz or lbzip2). All that
>matters is the number of unique values (i.e. the histogram). Similarly,
>floating-point data that's been rounded to the nearest 1/100th of an
>electron compresses by about 3.5:1, compared to about 1.5:1 for a typical
>.DM4 image stack. In comparison, integer data will compress at about 4.5 -
>6.0:1, depending on the dose rate.
>
>Robert
>
>--
>Robert McLeod, Ph.D.
>Center for Cellular Imaging and Nano Analytics (C-CINA)
>Biozentrum der Universität Basel
>Mattenstrasse 26, 4058 Basel
>Office: +41.061.387.3225
>[log in to unmask]<mailto:[log in to unmask]>
>[log in to unmask]<mailto:[log in to unmask]>
>[log in to unmask]<mailto:[log in to unmask]>
>________________________________
>From: Collaborative Computational Project in Electron cryo-Microscopy
>[[log in to unmask]<mailto:[log in to unmask]>] on behalf of Benoît
>Zuber [[log in to unmask]<mailto:[log in to unmask]>]
>Sent: Wednesday, March 09, 2016 6:56 PM
>To: [log in to unmask]<mailto:[log in to unmask]>
>Subject: Re: [ccpem] 16 bit vs 32 bit
>
>Hi Edoardo,
>
>If you work in counting mode and you save frames, in most cases 8 bit
>(256 levels of gray) should be more than enough; probably even 4 bit
>(16 levels) would be fine if you have small pixels (i.e. high
>magnification). I think there was a discussion about that on the 3DEM
>mailing list. You could search the archive there.
>
>Best
>Benoît
>
>Le 9 mars 2016 à 18:45, HugoMH
><[log in to unmask]<mailto:[log in to unmask]>> a écrit :
>
>Hi Edoardo,
>
>Very interesting question. I guess that storing data in 16 bits is not
>the limiting step for high resolution. 16 bits means 65,536 different
>grey levels; on the other hand, 2^32 is a huge number. Hopefully 16 bits
>is enough for high resolution.
>
>In this scenario I just wonder whether single-particle programs can
>really process the data at 32 bits.
>
>Best wishes
>
>Hugo
>
>--
>Hugo Muñoz Hernández, PhD student
>
>Centro de Investigaciones Biológicas CIB-CSIC
>Consejo Superior de Investigaciones Científicas
>C/ Ramiro de Maeztu, 9
>28040 Madrid (Spain)
>
>http://www.cib.csic.es/es/grupo.php?idgrupo=47
>
>Tel: +34 91 8373112 ext 4436, Lab.B-47
>
>
>
>On 09/03/16 18:01, Edoardo D'Imprima wrote:
>
>Dear all,
>
>I have a general question about cryo-data collection with direct
>detectors: does it make any difference in terms of high resolution image
>processing to collect movies in 16 bit vs 32 bit? In principle one can
>save quite a lot of storage space, but I'm not so sure that this
>procedure will be harmless. Are the high-resolution components somehow
>damaged during normalisation or particle alignment?
>
>Is there any experience regarding this topic?
>
>Any suggestion will be very appreciated, many thanks in advance.
>
>Edoardo
>---------------------------------------------------
>Edoardo D'Imprima
>PhD Student
>Max Planck Institute of Biophysics
>Structural Biology Department
>Max-von-Laue Straße 3
>60438 Frankfurt am Main
>Germany
>
>Tel: +49 (0) 69 6303 3015
>
>
>
>