Here I will disagree. R-free rewards you for putting an atom in density
where an atom belongs. It doesn't necessarily reward you for putting
the *right* atom in that density, but it does become difficult to do
that under normal circumstances unless you have approximately the right
structure.
However, in the case of multi-copy refinement at low resolution, the
refinement is perfectly capable of shoving any old atom in density
corresponding to any other old atom if you give it enough leeway.
Remember that there's a big difference between R-free for a single copy
(45%) and a 16-fold multicopy (38%) in MsbA's P1 form, and almost the
same amount (41% vs 33%) with MsbA's P21 form. (These are E. coli and
V. cholerae, respectively.) Both single copy and multicopy refinements
were NCS-restrained, as far as I know.
So there's evidence, w/o simulation, that the 12-fold or 16-fold
multicopy refinements are worth 7-8% in R-free, and I'm doubtful that
NCS can generate that sort of gain in either crystal form. I've
certainly never seen that in my own experience at low resolution.
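
As a quick check of the arithmetic, the "7-8%" figure is just the difference between the single-copy and multicopy R-free values quoted above for each crystal form. The dictionary layout and function name below are only illustrative.

```python
# R-free values (percent) exactly as quoted in this message.
r_free = {
    "MsbA P1 (E. coli)": {"single_copy": 45, "multicopy": 38},
    "MsbA P21 (V. cholerae)": {"single_copy": 41, "multicopy": 33},
}

def r_free_gain(values):
    """Drop in R-free (percentage points) on going from single copy to multicopy."""
    return values["single_copy"] - values["multicopy"]

gains = {form: r_free_gain(values) for form, values in r_free.items()}
for form, gain in gains.items():
    print(f"{form}: {gain} percentage points")
```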
I've been meaning to put online the Powerpoint from the CCP4 talk with
all these numbers in it, but I regret it's sitting on my iBook at home
as of writing.
Phil Jeffrey
Dean Madden wrote:
> It is true that multicopy refinement was essential for the suppression
> of Rwork. However, the whole point of the Rfree is that it is supposed
> to be independent of the number of parameters you're refining. Simply
> throwing multiple copies of the model into the refinement shouldn't have
> affected Rfree, IF IT WERE TRULY "FREE".
>
> It was almost certainly NCS-mediated spillover that allowed the
> multicopy, parameter-driven reduction in Rwork to pull down the Rfree
> values as well. The experiment is probably not worth the time it would
> take to do, but I suspect that if MsbA and EmrE test sets had been
> chosen in thin shells, then Rfree wouldn't have shown nearly the
> "improvement" it did.
>
> Dean
>
>
> Phil Jeffrey wrote:
>> While NCS probably played a role in the first crystal form of MsbA
>> (P1, 8 monomers), this is also the one that showed the greatest
>> improvement in R-free once the structure was correctly redetermined
>> (7% or 14% depending on which refinement protocols you compare).
>>
>> The other crystal form of MsbA and the crystal forms of EmrE didn't
>> have particularly high-copy NCS (2 dimers, 4 monomers, dimer, 2
>> tetramers) and the R-frees were somewhat comparable in all cases
>> (31-36% for the redetermined structures).
>>
>> The *major* source of the R-free suppression in all these cases was
>> the inappropriate use of multi-copy refinement at low resolution.
>>
>> Phil Jeffrey
>> Princeton
>>
>>
>> Dean Madden wrote:
>>> Hi Dirk,
>>>
>>> I disagree with your final sentence. Even if you don't apply NCS
>>> restraints/constraints during refinement, there is a serious risk of
>>> NCS "contaminating" your Rfree. Consider the limiting case in which
>>> the "NCS" is produced simply by working in an artificially low
>>> symmetry space-group (e.g. P1, when the true symmetry is P2): in this
>>> case, putting one symmetry mate in the Rfree set, and one in the
>>> Rwork set will guarantee that Rfree tracks Rwork. The same effect
>>> applies to a large extent even if the NCS is not crystallographic.
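
The limiting case described above can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the "crystal" is a cube of Miller indices, the missed symmetry is taken to be a twofold along b, so that (h, k, l) and (-h, k, -l) are mates, and Friedel pairs and real intensities are ignored. All function names are hypothetical. It counts how often a randomly chosen free reflection has its mate sitting in the working set, and compares that with a selection that keeps mates together.

```python
import random

def mate(hkl):
    """Mate under the assumed twofold along b: (h, k, l) -> (-h, k, -l)."""
    h, k, l = hkl
    return (-h, k, -l)

def make_reflections(hmax=8):
    """Toy reflection list: every index triple in a cube, except the origin."""
    idx = range(-hmax, hmax + 1)
    return [(h, k, l) for h in idx for k in idx for l in idx
            if (h, k, l) != (0, 0, 0)]

def random_free_set(reflections, fraction, rng):
    """Flag each reflection independently, ignoring its mate."""
    return {hkl for hkl in reflections if rng.random() < fraction}

def paired_free_set(reflections, fraction, rng):
    """Flag one representative per pair, so mates always share a set."""
    reps = sorted({min(hkl, mate(hkl)) for hkl in reflections})
    chosen = {rep for rep in reps if rng.random() < fraction}
    return {hkl for hkl in reflections if min(hkl, mate(hkl)) in chosen}

def split_fraction(free):
    """Fraction of free reflections whose mate is in the working set."""
    paired = [hkl for hkl in free if mate(hkl) != hkl]  # skip self-mates
    split = [hkl for hkl in paired if mate(hkl) not in free]
    return len(split) / len(paired)

reflections = make_reflections()
rng = random.Random(0)
random_split = split_fraction(random_free_set(reflections, 0.05, rng))
paired_split = split_fraction(paired_free_set(reflections, 0.05, rng))
print(f"random selection: {random_split:.2f} of free reflections have a working-set mate")
print(f"paired selection: {paired_split:.2f}")
```

With a 5% random test set, nearly every free reflection has its mate in the working set, which is why Rfree would track Rwork in this limiting case.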
>>>
>>> Bottom line: thin shells are not a perfect solution, but if NCS is
>>> present, choosing the free set randomly is *never* a better choice,
>>> and almost always significantly worse. Together with multicopy
>>> refinement, randomly chosen test sets were almost certainly a major
>>> contributor to the spuriously good Rfree values associated with the
>>> retracted MsbA and EmrE structures.
>>>
>>> Best wishes,
>>> Dean
>>>
>>> Dirk Kostrewa wrote:
>>>> Dear CCP4ers,
>>>>
>>>> I'm not convinced that thin shells are sufficient: I think, in
>>>> principle, one should omit thick shells (greater than the diameter
>>>> of the G-function of the molecule/assembly that is used to describe
>>>> NCS-interactions in reciprocal space), and use the inner thin layer
>>>> of these thick shells, because only those should be completely
>>>> independent of any working set reflections. But this would be too
>>>> "expensive" given the low number of observed reflections that one
>>>> usually has ...
>>>> However, if you don't apply NCS restraints/constraints, there is no
>>>> need for any such precautions.
>>>>
>>>> Best regards,
>>>>
>>>> Dirk.
>>>>
>>>> On 07.02.2008 at 16:35, Doug Ohlendorf wrote:
>>>>
>>>>> It is important when using NCS that the Rfree reflections be
>>>>> selected in distributed thin resolution shells. That way application
>>>>> of NCS should not mix Rwork and Rfree sets. Normal random selection
>>>>> of Rfree + NCS (especially 4x or higher) will drive Rfree down
>>>>> unfairly.
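
A minimal sketch of what selecting the test set in distributed thin resolution shells could look like. It assumes shells of equal width in (1/d)^3, so that each holds roughly the same number of reflections, with every 20th shell flagged free. The function name, the shell count and the synthetic d-spacings are all illustrative and not taken from any particular program.

```python
import random

def thin_shell_free_flags(d_spacings, n_shells=200, free_every=20):
    """Return one boolean per reflection: True if it falls in a free shell.

    Shells have equal width in (1/d)^3; every `free_every`-th shell is
    flagged free, giving roughly 1/free_every of the data.
    """
    s3 = [(1.0 / d) ** 3 for d in d_spacings]
    lo, hi = min(s3), max(s3)
    if hi == lo:
        raise ValueError("need a range of resolutions to define shells")
    width = (hi - lo) / n_shells
    flags = []
    for x in s3:
        # Clamp so the highest-resolution reflection lands in the last shell.
        shell = min(int((x - lo) / width), n_shells - 1)
        flags.append(shell % free_every == free_every // 2)
    return flags

# Synthetic data: 20000 reflections between 40 A and 3 A, uniform in (1/d)^3.
rng = random.Random(0)
s3_min, s3_max = (1 / 40.0) ** 3, (1 / 3.0) ** 3
d_spacings = [rng.uniform(s3_min, s3_max) ** (-1 / 3) for _ in range(20000)]

flags = thin_shell_free_flags(d_spacings)
free_fraction = sum(flags) / len(flags)
print(f"free fraction: {free_fraction:.3f}")
```

How thin the shells need to be, and whether thin shells are enough at all, is exactly what the rest of this thread argues about. The sketch only shows the bookkeeping.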
>>>>>
>>>>> Doug Ohlendorf
>>>>>
>>>>> -----Original Message-----
>>>>> From: CCP4 bulletin board On Behalf Of
>>>>> Eleanor Dodson
>>>>> Sent: Tuesday, February 05, 2008 3:38 AM
>>>>> Subject: Re: [ccp4bb] an over refined structure
>>>>>
>>>>> I agree that the difference between Rwork and Rfree is quite
>>>>> acceptable at your resolution. You cannot/should not use R-factors
>>>>> as a criterion for structure correctness.
>>>>> As Ian points out, choosing a different Rfree set of reflections
>>>>> can change Rfree a good deal.
>>>>> Certain NCS operators can relate reflections exactly, making it hard
>>>>> to get a truly independent Free R set, and there are other reasons
>>>>> to make it a blunt edged tool.
>>>>>
>>>>> The map is the best validator - are there blobs still not fitted?
>>>>> (maybe side chains you have placed wrongly..) Are there many
>>>>> positive or negative peaks in the difference map? How well does the
>>>>> NCS match the 2 molecules?
>>>>>
>>>>> etc etc.
>>>>> Eleanor
>>>>>
>>>>> George M. Sheldrick wrote:
>>>>>> Dear Sun,
>>>>>>
>>>>>> If we take Ian's formula for the ratio of R(free) to R(work) from
>>>>>> his paper Acta D56 (2000) 442-450 and make some reasonable
>>>>>> approximations,
>>>>>> we can reformulate it as:
>>>>>>
>>>>>> R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025pd^3(1-s)
>>>>>>
>>>>>> where s is the fractional solvent content, d is the resolution, p is
>>>>>> the effective number of parameters refined per atom after allowing
>>>>>> for the restraints applied, d^3 means d cubed and sqrt means square
>>>>>> root.
>>>>>>
>>>>>> The difficult number to estimate is p. It would be 4 for an
>>>>>> isotropic refinement without any restraints. I guess that p=1.5
>>>>>> might be an appropriate value for a typical protein refinement
>>>>>> (giving an R-factor ratio of about 1.4 for s=0.6 and d=2.8). In
>>>>>> that case, your R-factor ratio of 0.277/0.215 = 1.29 is well within
>>>>>> the allowed range!
>>>>>>
>>>>>> However, it should be added that this formula is almost a
>>>>>> self-fulfilling prophecy. If we relax the geometric restraints we
>>>>>> increase p, which then leads to a larger 'allowed' R-factor ratio!
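
The approximate formula quoted above is easy to evaluate. The sketch below simply codes R(free)/R(work) = sqrt[(1+Q)/(1-Q)] with Q = 0.025pd^3(1-s) as written in the message. The function name is made up, and the check on Q is an added assumption, since the expression only makes sense for Q below 1.

```python
import math

def rfree_rwork_ratio(p, d, s):
    """Expected R(free)/R(work) from the approximate formula in the message.

    p: effective number of refined parameters per atom
    d: resolution in Angstrom
    s: fractional solvent content
    """
    q = 0.025 * p * d ** 3 * (1.0 - s)
    if not 0.0 <= q < 1.0:
        raise ValueError(f"Q = {q:.3f} is outside [0, 1); the formula breaks down")
    return math.sqrt((1.0 + q) / (1.0 - q))

# Values used in the message: p = 1.5, s = 0.6, d = 2.8.
expected = rfree_rwork_ratio(p=1.5, d=2.8, s=0.6)
observed = 0.277 / 0.215
print(f"expected ratio: {expected:.2f}")
print(f"observed ratio: {observed:.2f}")

# Relaxing restraints raises p, which raises the "allowed" ratio.
looser = rfree_rwork_ratio(p=2.0, d=2.8, s=0.6)
print(f"ratio with p = 2.0: {looser:.2f}")
```

The last line illustrates the caveat in the message: a larger p gives a larger allowed ratio, so the formula cannot by itself tell you whether the restraints were too loose.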
>>>>>>
>>>>>> Best wishes, George
>>>>>>
>>>>>>
>>>>>> Prof. George M. Sheldrick FRS
>>>>>> Dept. Structural Chemistry,
>>>>>> University of Goettingen,
>>>>>> Tammannstr. 4,
>>>>>> D37077 Goettingen, Germany
>>>>>> Tel. +49-551-39-3021 or -3068
>>>>>> Fax. +49-551-39-2582
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>> *******************************************************
>>>> Dirk Kostrewa
>>>> Gene Center, A 5.07
>>>> Ludwig-Maximilians-University
>>>> Feodor-Lynen-Str. 25
>>>> 81377 Munich
>>>> Germany
>>>> Phone: +49-89-2180-76845
>>>> Fax: +49-89-2180-76999
>>>> *******************************************************
>>>>
>>>>
>>>
>>
>