One minus the correlation coefficient is a quadratic function of the
differences between the data points. So a rather large 6% relative error
with 8-fold data multiplicity (redundancy) can still lead to CC1/2 values
of about 99.9%.
That is simply the nature of correlation coefficients.
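
As a rough illustration of the point above, here is a minimal Python sketch, not taken from any data-processing program. It assumes a toy model: true intensities drawn from an exponential (Wilson-like, acentric) distribution with mean 1, Gaussian noise proportional to each intensity (6%), and the 8 observations of each reflection split into two halves of 4. The function names (`pearson`, `simulate_cc_half`, `predicted_cc_half`) are hypothetical. Under these assumptions CC1/2 = var_true / (var_true + var_noise_half), where var_noise_half is already tiny because the error enters squared.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_cc_half(n_refl=20000, multiplicity=8, rel_error=0.06, seed=0):
    """Simulate CC1/2 for one shell under the toy model described above."""
    rng = random.Random(seed)
    half = multiplicity // 2
    half_a, half_b = [], []
    for _ in range(n_refl):
        true_i = rng.expovariate(1.0)   # assumption: exponential intensities, mean 1
        sigma = rel_error * true_i      # assumption: noise proportional to intensity
        obs = [rng.gauss(true_i, sigma) for _ in range(multiplicity)]
        half_a.append(sum(obs[:half]) / half)   # mean of first half-dataset
        half_b.append(sum(obs[half:]) / half)   # mean of second half-dataset
    return pearson(half_a, half_b)


def predicted_cc_half(multiplicity=8, rel_error=0.06):
    """Analytic CC1/2 = var_true / (var_true + var_noise_half) for the same model."""
    var_true = 1.0          # variance of an exponential distribution with mean 1
    mean_i_squared = 2.0    # E[I^2] for the same distribution
    var_noise_half = rel_error ** 2 * mean_i_squared / (multiplicity / 2)
    return var_true / (var_true + var_noise_half)


if __name__ == "__main__":
    print("simulated CC1/2:", round(simulate_cc_half(), 4))
    print("predicted CC1/2:", round(predicted_cc_half(), 4))
```

In this toy model 1 - CC1/2 comes out at roughly 0.002, so CC1/2 is well above 99% despite the 6% error per observation. The exact third decimal depends on the assumed intensity distribution and noise model, so it should be read as the same order of magnitude as the figure quoted above rather than as a reproduction of it.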

Zbyszek Otwinowski



> Related to this, I've always wondered what CC1/2 values mean for low
> resolution. Not being mathematically inclined, I'm sure this is a naive
> question, but I'll ask anyway - what does CC1/2=100 (or 99.9) mean?
> Does it mean the data is as good as it gets?
>
> Alan
>
>
>
> On 07/12/2012 17:15, Douglas Theobald wrote:
>> Hi Boaz,
>>
>> I read the K&K paper as primarily a justification for including
>> extremely weak data in refinement (and of course introducing a new
>> single statistic that can judge data *and* model quality comparably).
>> Using CC1/2 to gauge resolution seems like a good option, but I never
>> got from the paper exactly how to do that.  The resolution bin where
>> CC1/2=0.5 seems natural, but in my (limited) experience that gives
>> almost the same answer as I/sigI=2 (see also K&K fig 3).
>>
>>
>>
>> On Dec 7, 2012, at 6:21 AM, Boaz Shaanan <[log in to unmask]>
>> wrote:
>>
>>> Hi,
>>>
>>> I'm sure Kay will have something to say  about this but I think the
>>> idea of the K & K paper was to introduce new (more objective) standards
>>> for deciding on the resolution, so I don't see why another table is
>>> needed.
>>>
>>> Cheers,
>>>
>>>
>>>
>>>
>>>            Boaz
>>>
>>>
>>> Boaz Shaanan, Ph.D.
>>> Dept. of Life Sciences
>>> Ben-Gurion University of the Negev
>>> Beer-Sheva 84105
>>> Israel
>>>
>>> E-mail: [log in to unmask]
>>> Phone: 972-8-647-2220  Skype: boaz.shaanan
>>> Fax:   972-8-647-2992 or 972-8-646-1710
>>>
>>>
>>>
>>>
>>>
>>> ________________________________________
>>> From: CCP4 bulletin board [[log in to unmask]] on behalf of Douglas
>>> Theobald [[log in to unmask]]
>>> Sent: Friday, December 07, 2012 1:05 AM
>>> To: [log in to unmask]
>>> Subject: [ccp4bb] refining against weak data and Table I stats
>>>
>>> Hello all,
>>>
>>> I've followed with interest the discussions here about how we should be
>>> refining against weak data, e.g. data with I/sigI << 2 (perhaps using
>>> all bins that have a "significant" CC1/2 per Karplus and Diederichs
>>> 2012).  This all makes statistical sense to me, but now I am wondering
>>> how I should report data and model stats in Table I.
>>>
>>> Here's what I've come up with: report two Table I's.  For comparability
>>> to legacy structure stats, report a "classic" Table I, where I call the
>>> resolution whatever bin has I/sigI=2.  Use that as my "high res" bin, with
>>> high res bin stats reported in parentheses after global stats.   Then
>>> have another Table (maybe Table I* in supplementary material?) where I
>>> report stats for the whole dataset, including the weak data I used in
>>> refinement.  In both tables report CC1/2 and Rmeas.
>>>
>>> This way, I don't redefine the (mostly) conventional usage of
>>> "resolution", my Table I can be compared to precedent, I report stats
>>> for all the data and for the model against all data, and I take
>>> advantage of the information in the weak data during refinement.
>>>
>>> Thoughts?
>>>
>>> Douglas
>>>
>>>
>>> ^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`^`
>>> Douglas L. Theobald
>>> Assistant Professor
>>> Department of Biochemistry
>>> Brandeis University
>>> Waltham, MA  02454-9110
>>>
>>> [log in to unmask]
>>> http://theobald.brandeis.edu/
>>>
>>>             ^\
>>>   /`  /^.  / /\
>>> / / /`/  / . /`
>>> / /  '   '
>>> '
>>>
>>
>>
>
> --
> Alan Cheung
> Gene Center
> Ludwig-Maximilians-University
> Feodor-Lynen-Str. 25
> 81377 Munich
> Germany
> Phone:  +49-89-2180-76845
> Fax:  +49-89-2180-76999
> E-mail: [log in to unmask]
>