Hi Cristina,
The observations about CTF effects spreading signal outside the box are
correct and should be heeded.
About your observations: smaller box sizes have relatively more signal
in the box and less noise. Therefore, if you don't use
--solvent_correct_fsc, then the unmasked FSC curve that is used inside
relion_refine will underestimate the signal more for larger boxes. This
can then lead to less accurate alignments and thus lower resolutions.
This was one of the reasons to introduce --solvent_correct_fsc, which
basically applies a post-processing-like correction to the FSC curve
during refinement to take into account that the signal only lives in a
small part of the box. Could it be that your observations were made
without this argument?
HTH,
Sjors
On 03/20/2018 04:31 PM, Cristina Paulino wrote:
> Dear all,
> I am actually very glad that this topic was brought up as I have been
> puzzled for a while looking at the box sizes of published structures.
> My impression is that the box sizes are decreasing per publication, and
> it became almost normal in my eyes to see box sizes where the
> particles almost touch the box edges. (Unfortunately more and more
> publications don’t specify their box size in the materials
> and methods or in the cryo-EM table.)
> My group and I have been playing around with this for a while and I
> always was more on the big-box-size-side, exactly with the intention
> of including all delocalised high resolution information.
> However, I can confirm that the resolution given from an FSC plot
> often improves quite a bit when using a smaller box size. And this
> intrigues me even more. I thought it might be because a smaller box
> size masks more noise out and this might be of greater benefit than
> including high-resolution information from highly defocused particles in the
> dataset. But shouldn't the masking out of noise be covered by the
> final mask in the post-processing step? And again I am puzzled.
> Even if the FSC resolution gets better, and the map looks OK, one clear
> consequence of going to a smaller final box size is that while the
> center of your particle will be less affected, you will exclude more
> high-resolution information for protein features at the edges of your
> particle, as their delocalised information will be further away. This
> also explains (partially) the typical appearance of local resolution
> maps with a highly resolved protein core and a less resolved protein edge.
> So how should we proceed here? I am afraid most people would always go
> for the approach that gives the better FSC resolution, even though
> this might sometimes not be the best approach to treat your data.
>
> Best,
> Cristina
>
>
>
>
>> On 20 Mar 2018, at 4:48 PM, Leonid Sazanov wrote:
>>
>> Hi, this is from Rosenthal and Henderson JMB 2003 paper, page 726:
>>
>> Minimal box size to achieve resolution d, for a particle with diameter
>> D, with defocus dF and wavelength L (~0.02 A for 300 kV):
>>
>> Box = D + ( 2 x L x dF / d ),
>>
>> i.e. for a 200 A particle, a maximum defocus of 2.0 um in the dataset,
>> and a target resolution of 3.5 A => 200 + ~230 A = 430 A box.
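
A minimal sketch of the box-size estimate quoted above, assuming all lengths are given in Angstrom (so 2.0 um of defocus is 20000 A). The function names and the rounding to an even pixel count are illustrative choices, not something taken from the paper or from RELION.

```python
import math


def min_box_size_angstrom(diameter_a, defocus_a, resolution_a, wavelength_a=0.02):
    """Box = D + 2 * L * dF / d, with every length in Angstrom.

    wavelength_a defaults to ~0.02 A, the approximate value quoted for 300 kV.
    """
    if resolution_a <= 0:
        raise ValueError("resolution must be positive")
    return diameter_a + 2.0 * wavelength_a * defocus_a / resolution_a


def box_in_pixels(box_a, pixel_size_a):
    """Convert a box size in Angstrom to pixels, rounded up to an even number.

    Rounding up to an even count is an assumption made here for convenience.
    """
    n = math.ceil(box_a / pixel_size_a)
    return n + (n % 2)


# Worked example from the message: D = 200 A, dF = 2.0 um = 20000 A, d = 3.5 A
box_a = min_box_size_angstrom(200.0, 2.0e4, 3.5)
print(f"{box_a:.1f} A, {box_in_pixels(box_a, 1.0)} px at 1.0 A/px")
```

For the numbers in the message this gives about 428.6 A, in line with the ~430 A quoted; the 1.0 A/px pixel size is only a placeholder.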
>>
>> Best,
>>
>> Leonid
>>
>> Prof. Leonid Sazanov
>> IST Austria
>> Am Campus 1
>> A-3400 Klosterneuburg
>> Austria
>>
>> Phone: +43 2243 9000 3026
>> E-mail: [log in to unmask]
>> On 20/03/18 12:41, Teige Matthews-Palmer wrote:
>>> Dear Bum-Han,
>>>
>>> I agree, I think the background circle is just for normalisation. I
>>> never tried changing the default - do many people?
>>> So to answer Joshua’s question, making the circle smaller shouldn’t
>>> help with the issue of duplicates. There is also the diameter of the
>>> circle/spherical mask during classification as you say, but I
>>> /think/ the mask isn’t fixed to the particle image during
>>> translation so again it won’t help. (Please correct me if wrong!)
>>>
>>> As for box-size, the previous discussion
>>> https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=CCPEM;2b838c96.1802 highlighted
>>> that the ability to make a larger translational search in a large
>>> box is a *disadvantage* because (especially with poor autopicking) the
>>> boxes contain more than one particle and the search creates
>>> duplicate particles in your dataset. Undetected duplicate particles
>>> erroneously increase the FSC between random half-sets.
>>> The advantage of a large box is that you can be confident that the
>>> box contains all the high-spatial-frequency information of your
>>> particle (that is spread out in real-space by the
>>> point-spread-function); so when you perform CTF correction, all the
>>> high resolution info is available.
>>>
>>> I think the ideal box-size is a balance of: 1) large enough to
>>> contain all the PSF convolution of your particle; but 2) small
>>> enough that your dataset is practical to compute classifications.
>>> (So downsampling is helpful.)
>>> Does anyone know of a paper or webpage about estimating the size of
>>> the PSF to help make an informed choice when extracting a larger
>>> box? Bad picking and greater underfocus would need larger boxes.
>>>
>>> Personally, I don’t think that creating duplicate particles should
>>> be a driving reason to decrease box size; it’s surely better to
>>> check for duplicates as many have suggested. (P.S. scipion’s
>>> consensus pick might have the underlying ability, but I couldn’t get it
>>> to remove duplicate coordinates from a single coordinate set.)
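
One way to check a single coordinate set for duplicates, as a rough sketch only: keep a pick only if it is at least some minimum distance from every pick already kept. The function name and the threshold are hypothetical, the coordinates are assumed to already include any refined translations, and this is not how scipion or RELION implement it.

```python
import math


def remove_duplicate_picks(coords, min_dist):
    """Greedy filter: keep the first pick of any group closer than min_dist.

    coords is a list of (x, y) positions on one micrograph, in the same units
    as min_dist. The O(n^2) loop is fine for a few thousand picks per micrograph.
    """
    kept = []
    for x, y in coords:
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept


# Two picks 5 px apart collapse to one; the distant pick is kept.
picks = [(100.0, 100.0), (103.0, 104.0), (400.0, 250.0)]
print(remove_duplicate_picks(picks, min_dist=20.0))
```

A sensible threshold would probably be some fraction of the particle diameter, but that choice depends on how crowded the sample is.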
>>>
>>> All the best, and grateful for any corrections,
>>> Teige
>>>
>>> 1) https://www.youtube.com/watch?v=GYDLhg49UQA&index=27&list=PL8_xPU5epJdctoHdQjpfHmd_z9WvGxK8-
>>> The micrographs are an image of the specimen, convolved with the PSF, and
>>> since the contrast-transfer-function (CTF) is the Fourier transform
>>> of the PSF, when we CTF correct in Fourier space we hope to
>>> deconvolve the image from its PSF. If some of the spread-out signal
>>> of our particle is outside the box, then the CTF correction won’t
>>> restore the highest (most spread) spatial frequencies.
>>>
>>> 2) Information to help predict RAM usage for a given boxsize &
>>> N: https://www2.mrc-lmb.cam.ac.uk/relion/index.php?title=Calculate_2D_class_averages
>>>
>>>> On 19 Mar 2018, at 15:34, 류범한 wrote:
>>>>
>>>> Dear Sjors,
>>>>
>>>> I also wonder about the question Joshua brought up.
>>>>
>>>> As I understand it (I am not sure I am thinking about this in the
>>>> right way), the background diameter circle defines the reference
>>>> region for normalization, which makes signals stand out (?) and
>>>> therefore makes it easier to find meaningful signals or particles
>>>> against background noise. In other words, the background diameter
>>>> should be related to background quality (or SNR). When you work with
>>>> low-SNR micrographs or micrographs with a dirty background, I think
>>>> that adjusting the background diameter may give better results.
>>>>
>>>> Meanwhile, I am not sure why a bigger box is advantageous (see
>>>> Ch. 6 in Methods in Enzymology, Volume 579).
>>>> As I see it, if a bigger box is better than a smaller one, it is
>>>> because a bigger box (twice the largest particle dimension is
>>>> suggested) gives a better chance of finding and aligning the signal
>>>> corresponding to a particle in the box (= conservative
>>>> TRANSLATION?), especially when the center of mass is hard to
>>>> define because of improper autopicking or noisy micrographs.
>>>>
>>>> These are just my views. I am not sure how close my ideas are
>>>> to the facts. Am I thinking about the above in the right way?
>>>>
>>>> Any comments are welcome!
>>>>
>>>>
>>>> Best,
>>>> Han
>>>> *Bum Han Ryu*
>>>>
>>>> Postdoctoral Researcher
>>>> Korea Basic Science Institute (KBSI)
>>>> 161 Yeongudanji-ro, Ochang-eup, Cheongwon-gu, Cheongju-si
>>>> Chungcheongbuk-do, Republic of Korea
>>>>
>>>> Office) 043-240-5329
>>>>
>>>>
>>>>> On 3 Mar 2018, at 7:02 AM, Joshua Lobo wrote:
>>>>>
>>>>> Hi CCPEM
>>>>>
>>>>> I'm sorry to bring this discussion up again as in the link
>>>>>
>>>>> https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=CCPEM;2b838c96.1802
>>>>>
>>>>> and yes, I definitely wholeheartedly agree with Bjorn's excellent
>>>>> points. And I also agree that the ideal case would be extracting a
>>>>> single particle per box. But in a sample with crowded
>>>>> particles, would changing the diameter of the background circle
>>>>> during extraction be a better approach than extracting with a bigger
>>>>> box and masking out only the single particle of interest in the
>>>>> 2D classification step?
>>>>>
>>>>>
>>>>> Sincerely
>>>>> Joshua Lobo
>>>>
>>>
>>>
>>
>
--
Sjors Scheres
MRC Laboratory of Molecular Biology
Francis Crick Avenue, Cambridge Biomedical Campus
Cambridge CB2 0QH, U.K.
tel: +44 (0)1223 267061
http://www2.mrc-lmb.cam.ac.uk/groups/scheres