Wow, that is quite a lecture! It is very much appreciated.
I admit that some (most?) of my statements were questionable. For instance, I did not know how sigI would be calculated in the case of multiple observations, and indeed its proper handling should make <sigI/I> similar to Rmerge. Consequently, <I/sigI> substitutes for Rmerge fairly well.
Now, where did the Rmerge=0.5 criterion come from? If I remember correctly, it was proposed here at ccp4bb, and one reviewer also suggested using it. I admit that this is quite an arbitrary value, but when everyone follows it, structures become comparable by this metric. If there is a better approach to estimating the resolution, let's use it, but the common rule should be enforced; otherwise the resolution becomes another venue for cheating.
Once again, I was talking about a metric for the resolution; it does not need to be the same as the metric for the data cutoff.
On Jun 3, 2012, at 2:55 PM, Ian Tickle wrote:
> Hi Alex
> On 3 June 2012 07:00, aaleshin <[log in to unmask]> wrote:
>> I was also taught that under "normal conditions" this would occur when the data are collected up to the shell, in which Rmerge = 0.5.
> Do you have a reference for that? I have not seen a demonstration of
> such an exact relationship between Rmerge and resolution, even for
> 'normal' data, and I don't think everyone uses 0.5 as the cut-off
> anyway (e.g. some people use 0.4, some 0.8 etc - though I agree with
> Phil that we shouldn't get too hung up about the exact number!).
> Certainly having used the other suggested criteria for resolution
> cut-off (I/sigma(I) & CC(1/2)), the corresponding Rmerge (and Rpim
> etc) seems to vary a lot (or maybe my data weren't 'normal').
>> One can collect more data (up to Rmerge=1.0 or even 100) but the resolution of the electron density map will not change significantly.
> I think we are all at least agreed that beyond some resolution
> cut-off, adding further higher resolution 'data' will not result in
> any further improvement in the map (because the weights will become
> negligible). So it would appear prudent at least to err on the high
> resolution side!
>> I solved several structures of my own, and this simple rule worked every time.
> In what sense do you mean it 'worked'? Do you mean you tried
> different cut-offs in Rmerge (e.g. 0.25, 0.50, 0.75, 1.00 ...) and
> then used some metric to judge when there was no further significant
> change in the map and you noted that the optimal value of your chosen
> metric always occurs around Rmerge 0.5?; and if so how did you judge a
> 'significant change'? Personally I go along with Dale's suggestion to
> use the optical resolution of the map to judge when no further
> improvement occurs. This would need to be done with the completely
> refined structure because presumably optical resolution will be
> reduced by phase errors. Note that it wouldn't be necessary to
> actually quote the optical resolution in place of the X-ray resolution
> (that would confuse everyone!), you just need to know the value of the
> X-ray resolution cut-off where the optical resolution no longer
> changes (it should be clear from a plot of X-ray vs. optical resolution).
>> I is measured as a number of detector counts in the reflection minus background counts.
>> sigI is measured as sq. root of I plus standard deviation (SD) for the background plus various deviations from ideal experiment (like noise from satellite crystals).
> The most important contribution to the sigma(I)'s, except maybe for
> the weak reflections, actually comes from differences between the
> intensities of equivalent reflections, due to variations in absorption
> and illuminated volume, and other errors in image scale factors
> (though these are all highly correlated). These are of course exactly
> the same differences that contribute to Rmerge. E.g. in Scala the
> SDFAC & SDADD parameters are automatically adjusted to fit the
> observed QQ plot to the expected one, in order to account for such errors.
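(For illustration, here is a toy version of such an error model. The functional form sigma'(I) = SDFAC * sqrt(sigma_count^2 + (SDADD*I)^2) is a simplified assumption — Scala's full model also carries an SdB term, which is omitted here — and the parameter values are invented, not fitted to anything.)

```python
# Simplified SDFAC/SDADD-style error model (form assumed, values invented):
#   sigma'(I) = SDFAC * sqrt(sigma_count(I)**2 + (SDADD * I)**2)
# Inflating sigma this way leaves weak reflections almost untouched but
# caps I/sigma(I) for strong ones at roughly 1 / (SDFAC * SDADD).
import math

def corrected_sigma(i, sigma_count, sdfac=1.2, sdadd=0.03):
    return sdfac * math.sqrt(sigma_count**2 + (sdadd * i) ** 2)

# Counting statistics alone would give sigma ~ sqrt(I):
for i in (10.0, 1000.0, 100000.0):
    sc = math.sqrt(i)
    print(f"I={i:>9.0f}  counting I/sig={i/sc:7.1f}  "
          f"corrected I/sig={i/corrected_sigma(i, sc):7.1f}")
```

With these (invented) parameters the strong reflections approach the asymptotic limit 1/(1.2*0.03) ≈ 27.8, while the weak ones keep essentially their counting-statistics ratio.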
>> Obviously, sigI cannot be measured accurately. Moreover, the 'resolution' is related to errors in the structure factors, which are averaged from several measurements.
>> Errors in their scaling would affect the 'resolution', and <I/sigI> does not detect them, but Rmerge does!
> Sorry you've lost me here, I don't see why <I/sigI> should not detect
> scaling errors: as indicated above if there are errors in the scale
> factors this will inflate the sigma(I) values via increased SDFAC
> and/or SDADD, which will increase the sigma(I) values which will in
> turn reduce the <I/sigma(I)> values exactly as expected. I see no
> difference in the behaviour of Rmerge and <I/sigma(I)> (or indeed in
> CC(1/2)) in this respect, since they all depend on the differences
> between equivalents.
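(As an aside, the shared dependence on differences between equivalents can be made concrete with a toy calculation. The sketch below is hypothetical — `merge_stats`, the example intensities, and the singleton fallback to counting statistics are all invented for illustration, and it is not the algorithm of any real merging program.)

```python
# Toy merging statistics: both Rmerge and <I/sigma(I)> respond to
# disagreement between symmetry equivalents, because the merged sigma
# grows with the observed scatter within each group.
import math

def merge_stats(groups):
    """groups: list of lists of equivalent-intensity observations.
    Returns (Rmerge, mean I/sigma(I)) for the merged data."""
    num = den = 0.0
    i_over_sig = []
    for obs in groups:
        n = len(obs)
        mean_i = sum(obs) / n
        num += sum(abs(i - mean_i) for i in obs)   # Rmerge numerator
        den += sum(obs)                            # Rmerge denominator
        if n > 1:
            # sigma of the merged intensity from the sample scatter
            var = sum((i - mean_i) ** 2 for i in obs) / (n - 1)
            sig = math.sqrt(var / n)
        else:
            sig = math.sqrt(max(mean_i, 1.0))      # counting-stats fallback
        i_over_sig.append(mean_i / sig if sig > 0 else float("inf"))
    return num / den, sum(i_over_sig) / len(i_over_sig)

well_merged = [[100, 102, 98], [50, 51, 49]]
poorly_merged = [[100, 140, 60], [50, 75, 25]]
print(merge_stats(well_merged))    # low Rmerge, high <I/sigI>
print(merge_stats(poorly_merged))  # high Rmerge, low <I/sigI>
```

The same within-group differences drive both numbers in opposite directions, which is the point above: neither statistic is blind to poor merging.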
>> Rmerge, it means that the symmetry related reflections did not merge well. Under those conditions, Rmerge becomes a much better criterion for estimation of the 'resolution' than <I/sigI>.
> As indicated above, if the symmetry equivalents don't merge well it
> will increase the sigma(I)'s and reduce <I/sigma(I)>, so in this
> respect I don't see why Rmerge should be any better than <I/sigma(I)>.
> My biggest objection to Rmerge (and this applies also to CC(1/2)) is
> that it involves throwing away valuable information, namely the
> measured sigma(I) values from counting stats. This is not usually a
> good idea (in statistical parlance it reduces the 'power' of the test)
> - and it's not as though one can argue the sigma's are so small that
> they can be neglected (at least not for the weak reflections). Even
> though as you say the estimates of sigma(I) may not be very accurate,
> it seems to me that any estimate is better than no estimate. In any
> case the estimates of sigma(I) are probably quite accurate for the
> weak reflections, it's just for the strong ones that the assumptions
> tend to break down. However if we're estimating resolution from
> <I/sigma(I)> it's only the weak reflections in the outer shell that
> are relevant, so I don't think accuracy of sigma(I) is an issue.
>> If someone decides to use <I/sigI> instead of Rmerge, fine, let it be 2.0.
> As I indicated previously I think 2 is too high, it should be much
> closer to 1 (and again it would appear prudent to err on the side of
> the lower value), because in the outer shell the majority of
> I/sigma(I) values will be < 1 (just from the normal distribution of
> errors). This means that in order to get an average value of
> I/sigma(I) = 2 you need a lot of very significant intensities >> 3.
> The fallacy here lies in comparing the average I/sigma(I) with the
> standard '3 sigma' criterion which is actually appropriate only for a
> single intensity. Of course data anisotropy may well "throw a spanner
> in the works".
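(This point can be checked with a quick simulation. The noise model below is an invented simplification — an exponential, Wilson-like intensity distribution plus Gaussian measurement error with a known constant sigma — and is only meant to show that when the true signal is comparable to the noise, most individual I/sigma(I) values fall below 1, so a shell *average* of 2 already requires a substantial fraction of strong reflections.)

```python
# Simulated weak outer shell (assumed, simplified noise model):
# true intensities ~ exponential (Wilson-like), plus Gaussian noise.
import random

random.seed(42)
sigma = 1.0          # measurement error (arbitrary units)
true_mean_i = 1.0    # weak shell: true <I> comparable to sigma
n = 100_000

ratios = []
for _ in range(n):
    true_i = random.expovariate(1.0 / true_mean_i)  # Wilson-like intensity
    observed = random.gauss(true_i, sigma)          # add measurement noise
    ratios.append(observed / sigma)

mean_ratio = sum(ratios) / n
frac_below_1 = sum(r < 1 for r in ratios) / n
print(f"<I/sigma(I)> = {mean_ratio:.2f}")
print(f"fraction of reflections with I/sigma(I) < 1: {frac_below_1:.2f}")
```

Here the shell averages <I/sigma(I)> of about 1 even though well over half of the individual reflections sit below 1 — demanding an average of 2 would need a much stronger tail.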
>> Alternatively, the resolution could be estimated from the electron density maps.
> I agree, using the optical resolution in the manner indicated above,
> but still quoting the corresponding X-ray resolution for backwards compatibility.
>> I hope everyone agrees that the resolution should not be dead.
> I completely agree: I say "Long live the resolution!" (sorry I
> couldn't resist it).
> -- Ian