JISCMail - CCPEM Archives
CCPEM@JISCMAIL.AC.UK
Subject:
Re: 16 bit vs 32 bit
From:
Robert McLeod <[log in to unmask]>
Reply-To:
Robert McLeod <[log in to unmask]>
Date:
Wed, 16 Mar 2016 12:58:12 +0000
Dear Qiu-Xing,

We don't round our data to integers.  We let Gatan apply the gain reference and don't try to keep track of it.  This lessens the overhead of managing data a bit.  We have both GMS3 and GMS2 though.

Here are some compression numbers for a representative stack (using lbzip2):

raw DM4:  3.41 GB
compressed DM4:   1.45 GB
compressed uint8:   306 MB
compressed uint16: 316 MB
compressed float32, rounded to the nearest 0.1 electron: 647 MB
compressed float32, rounded to the nearest 0.01 electron: 989 MB

As you can see, uint8 doesn't compress much better than uint16.  I doubt that uint4 would compress better than uint8 with most compression algorithms.  uint4 isn't a machine data type; it's actually two numbers shoved into a uint8 by bit-shifting.  Compression algorithms can do a better job of this.  As I think I mentioned previously, we use lbzip2, pigz, or 7-zip for compressing data.
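The comparison above can be tried on synthetic data. The following is only a minimal sketch, not the procedure used for the numbers in this message: it assumes Poisson-distributed counts at roughly 1 electron per pixel, uses Python's built-in bz2 module as a stand-in for lbzip2, and the frame size and variable names are made up for illustration. It stores the same counts as packed 4-bit, uint8 and uint16 and prints the compressed size of each.

```python
import bz2
import math
import random
from array import array


def poisson(rng, mean):
    """Draw one Poisson-distributed count (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(0)
# Hypothetical frame: 200,000 pixels at ~1 electron per pixel, clipped to fit 4 bits.
counts = [min(poisson(rng, 1.0), 15) for _ in range(200_000)]

as_uint8 = array("B", counts).tobytes()
as_uint16 = array("H", counts).tobytes()
# "uint4": two 4-bit values bit-shifted into each byte.
as_uint4 = bytes((counts[i] << 4) | counts[i + 1] for i in range(0, len(counts), 2))

sizes = {
    name: len(bz2.compress(raw))
    for name, raw in [("uint4", as_uint4), ("uint8", as_uint8), ("uint16", as_uint16)]
}
for name, size in sizes.items():
    print(f"{name:7s} compressed to {size} bytes")
```

The raw buffers differ in size by factors of two, so any difference that remains after compression shows how much of the padding the compressor failed to remove. Real detector data will behave differently from this synthetic frame.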

If code were written with the idea of applying the gain reference while the data was in the CPU cache, then there might be a performance advantage to keeping the gain reference separate.  However, Frealign, Relion, etc. convert integer data to floating-point on loading from disk, so I don't see a significant advantage.

The gain reference's 'gain factor' (literally the number of images used to build the counting gain reference) changes the permissible dose rate. For a gain factor of 100, the noise at a dose rate of 4 electrons per pixel per second is indistinguishable from Poisson noise, i.e. I can't detect correlated noise.  Y. Cheng has recommended in manuscripts less than 8 electrons per pixel per second to avoid coincidence loss (double counting problems).  Personally, since I know the gain reference changes very quickly, I recommend a more conservative 4-5.

For uniformly distributed error, rounding to 0.01 electrons should not negatively impact an image stack taken with a gain factor of 100 (GF100).
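To put a number on the rounding error: for a uniformly distributed error with step q, the standard result is an RMS error of q/sqrt(12), which is about 0.0029 electrons for q = 0.01. The short sketch below checks this numerically. The input values are random numbers chosen for illustration, not detector data, and the variable names are hypothetical.

```python
import math
import random

STEP = 0.01  # rounding step in electrons

rng = random.Random(1)
# Illustrative gain-normalized pixel values between 0 and 10 electrons.
values = [rng.uniform(0.0, 10.0) for _ in range(100_000)]

# Error introduced by rounding each value to the nearest multiple of STEP.
errors = [round(v / STEP) * STEP - v for v in values]
rms_error = math.sqrt(sum(e * e for e in errors) / len(errors))

# Textbook RMS of a uniform error spread over one rounding step.
expected = STEP / math.sqrt(12.0)

print(f"measured RMS rounding error: {rms_error:.5f} electrons")
print(f"q / sqrt(12):                {expected:.5f} electrons")
```

Whether an error of this size matters depends on the other noise sources in the stack, which this sketch does not model.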

Robert

--
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Office: +41.061.387.3225

________________________________________
From: Jiang,Qiu-Xing [[log in to unmask]]
Sent: Saturday, March 12, 2016 3:10 AM
To: Robert McLeod; Collaborative Computational Project in Electron cryo-Microscopy
Subject: Re: [ccpem] 16 bit vs 32 bit

Dear Robert and others interested in this thread,
During our recent installation of a new K2 Summit, Gatan pointed us to the information on the SerialEM site: http://bio3d.colorado.edu/SerialEM/hlp/html/about_camera.htm#directDetectors
Based on the information online and what you explained, we are planning to collect super-resolution, dose-fractionated data as 40 frames per exposure, with 40 electrons/pixel spread over 8 seconds. The magnification we use will give 5-10 electrons per physical pixel per second. The packed, unnormalized super-resolution data, which have the hardware corrections applied (subtraction of the dark reference plus gain normalization), contain the raw electron counts. At this point the frame data are neither rotated nor flipped, each pixel is 4-bit, and the data can be compressed with a good ratio (say, as LZW TIFFs). The size of the output data per exposure will be ~1/10 of that (~9 GB) delivered by default operation. In post-processing, unpacking, frame rotation / y-flipping, normalization with the software gain reference (16-bit) and correction of defects will be used to yield the normal 16-bit data for analysis.
As all of those I have talked to are using their K2 Summit in counting mode, I wonder if anyone on this broad list is doing similar operations and can help us verify the sequence of these steps before our implementation.
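The figures in this plan can be checked with some quick arithmetic. This is a rough sketch only: the super-resolution frame size of 7676 x 7420 pixels is an assumption (it is not stated in the thread), as is reading "40 electrons/pixel" as a dose per physical pixel. The constant names are made up for illustration.

```python
# Assumed K2 Summit super-resolution frame size (not stated in the thread).
WIDTH, HEIGHT = 7676, 7420
FRAMES = 40          # frames per exposure, from the plan above
TOTAL_DOSE = 40.0    # electrons per physical pixel per exposure (assumed reading)
EXPOSURE_S = 8.0     # exposure time in seconds

pixels = WIDTH * HEIGHT * FRAMES
float32_bytes = pixels * 4    # default 32-bit floating-point output
packed_bytes = pixels // 2    # 4-bit data, two pixels per byte, before compression

dose_rate = TOTAL_DOSE / EXPOSURE_S     # electrons per physical pixel per second
dose_per_frame = TOTAL_DOSE / FRAMES    # electrons per physical pixel per frame

print(f"float32 stack:  {float32_bytes / 1e9:.2f} GB")
print(f"packed 4-bit:   {packed_bytes / 1e9:.2f} GB before compression")
print(f"dose rate:      {dose_rate:.1f} e/pixel/s")
print(f"dose per frame: {dose_per_frame:.2f} e/pixel")
```

Under these assumptions the float32 stack comes to about 9 GB and the packed data to one eighth of that before any compression is applied, which is in line with the ~9 GB and ~1/10 figures quoted above.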

Thanks.

Qiu-Xing




From: Collaborative Computational Project in Electron cryo-Microscopy on behalf of Robert McLeod
Reply-To: Robert McLeod
Date: Thursday, March 10, 2016 at 3:26 AM
To: Collaborative Computational Project in Electron cryo-Microscopy
Subject: Re: [ccpem] 16 bit vs 32 bit

All,

Gatan outputs the data as 32-bit floating point because it applies two gain references, one before counting and one after. The second gain reference reduces electrons to some fraction (over a range of about 0.9-1.1).

The process as I understand it is as follows:

1.) Apply hardware dark reference subtraction
2.) Apply hardware gain reference normalization (this also has a hot/dead pixel mask step)
3.) Count (threshold) electrons (with center of mass fitting)
4.) Integrate fast-frames to desired exposure time (uint16)
5.) Move data to Gatan PC
6.) Apply (super-resolution) gain reference (float32)

If you round your data to the nearest integer, you must keep the software gain reference and apply it later, before any processing; otherwise you will have substantial correlated noise in your summed stacks.  The .m2 reference is the right one for 4k, and .m3 for 8k.
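Applying the stored software gain reference to integer counts might look like the sketch below. It assumes the reference is a per-pixel multiplicative factor with the same shape as the frame. The function name and the toy arrays are hypothetical, and reading the .m2/.m3 files themselves is not shown.

```python
def apply_gain(counts, gain):
    """Multiply integer counts by a per-pixel gain factor, returning float rows.

    Assumes `counts` and `gain` are lists of rows with identical shapes.
    """
    if len(counts) != len(gain) or any(
        len(crow) != len(grow) for crow, grow in zip(counts, gain)
    ):
        raise ValueError("frame and gain reference must have the same shape")
    return [
        [c * g for c, g in zip(crow, grow)]
        for crow, grow in zip(counts, gain)
    ]


# Toy 2 x 3 frame of electron counts and a gain reference in the ~0.9-1.1 range.
frame = [[1, 0, 2], [3, 1, 0]]
gain_ref = [[0.95, 1.02, 1.10], [0.90, 1.00, 1.05]]

normalized = apply_gain(frame, gain_ref)
print(normalized)
```

The output is floating-point even though the stored counts were integers, which is why the reference has to travel with the data if the stack is kept in integer form.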

FYI, there's no difference between uint-4, uint-8, uint-16, or uint-32 data after it's been compressed losslessly (e.g. by pigz or lbzip2).  All that matters is the number of unique values (i.e. the histogram).  Similarly, floating-point data that's been rounded to the nearest 1/100th of an electron compresses by about 3.5:1, compared to about 1.5:1 for a typical .DM4 image stack.  In comparison, integer data will compress by about 4.5 - 6.0:1, depending on the dose rate.
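One way to see why the histogram matters is to estimate the Shannon entropy of the pixel values, which depends only on how often each value occurs and not on the width of the type used to store it. The sketch below is a zero-order estimate that ignores correlations between neighbouring pixels, so it is only a rough guide to what a real compressor achieves. The function name is hypothetical.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Zero-order Shannon entropy of a sequence, in bits per value."""
    histogram = Counter(values)
    total = len(values)
    return -sum(
        (n / total) * math.log2(n / total) for n in histogram.values()
    )


# The same counts give the same entropy whatever type holds them.
as_ints = [0, 1, 1, 2]
as_floats = [0.0, 1.0, 1.0, 2.0]
print(entropy_bits(as_ints), entropy_bits(as_floats))

# Rounding to a finer step leaves more unique values, hence more bits per pixel.
coarse = [round(v, 1) for v in (0.93, 1.04, 0.97, 1.12)]
fine = [round(v, 2) for v in (0.93, 1.04, 0.97, 1.12)]
print(entropy_bits(coarse), entropy_bits(fine))
```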

Robert

________________________________
From: Collaborative Computational Project in Electron cryo-Microscopy on behalf of Benoît Zuber
Sent: Wednesday, March 09, 2016 6:56 PM
To: [log in to unmask]
Subject: Re: [ccpem] 16 bit vs 32 bit

Hi Edoardo,

If you work in counting mode and you save frames, in most cases 8 bit (256 levels of gray) should be more than enough; probably even 4 bit (16 levels) should be good if you have small pixels (i.e. high magnification). I think there was a discussion about that on the 3dem mailing list. You could search the archive there.

Best
Benoît

On 9 March 2016 at 18:45, HugoMH wrote:

Hi Edoardo,

Very interesting question. I guess that storing data in 16 bits is not the limiting step for high resolution. 16 bits means 65536 different grey levels; on the other hand, 2^32 is a huge number. Hopefully 16 bits is enough for high resolution.

In this scenario I just wonder if single particle programmes can really process the data at 32 bits or not.

Best wishes

Hugo

--
Hugo Muñoz Hernández, PhD student

Centro de Investigaciones Biológicas CIB-CSIC
Consejo Superior de Investigaciones Científicas
C/ Ramiro de Maeztu, 9
28040 Madrid (Spain)

http://www.cib.csic.es/es/grupo.php?idgrupo=47

Tel: +34 91 8373112 ext 4436, Lab.B-47



On 09/03/16 18:01, Edoardo D'Imprima wrote:

Dear all,

I have a general question about cryo-data collection with direct detectors: does it make any difference in terms of high resolution image processing to collect movies in 16 bit vs 32 bit? In principle one can save quite a lot of storage space but I’m not so sure that this procedure will be harmless. Are the high resolution components somehow damaged during the normalisation or particle alignment?

Is there any experience regarding this topic?

Any suggestions would be much appreciated, many thanks in advance.

Edoardo
---------------------------------------------------
Edoardo D'Imprima
PhD Student
Max Planck Institute of Biophysics
Structural Biology Department
Max-von-Laue Straße 3
60438 Frankfurt am Main
Germany

Tel: +49 (0) 69 6303 3015


