I hope these output files help to find the problem. I think the completeness is fine. After I noticed the big gap, I even cut the data to 1.98 Å, but the big gap is still there.

Bing
________________________________
From: Ian Tickle [[log in to unmask]]
Sent: Monday, September 07, 2015 9:28 AM
To: Wang, Bing
Subject: Re: [ccp4bb] Big gap between R factor and Rfree even at first run of Refmac


Hi Bing

We need to diagnose the problem before we can fix it!

As I said in my initial reply I suspect low data completeness, which could be the result of an incorrect data collection strategy, a block of poor images, ice rings etc etc.  It's impossible to say without more information from you.  Do you have the log file for the scaling/merging (e.g. AIMLESS)?  That will show the data completeness & other relevant statistics.

Cheers

-- Ian


On 7 September 2015 at 15:16, Wang, Bing <[log in to unmask]<mailto:[log in to unmask]>> wrote:
Any suggestions on how to fix it, based on your description? More details would be appreciated; I am not very experienced with this.

Thank you very much!

Bing

Sent from my Windows Phone
________________________________
From: Ian Tickle<mailto:[log in to unmask]>
Sent: 9/7/2015 8:58 AM
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Re: [ccp4bb] Big gap between R factor and Rfree even at first run of Refmac


Hi Kay,

In the situation you describe wouldn't the systematic errors that, when fitted by errors in positions and B factors (more likely the latter I suspect) cause Rwork to fall relative to the case where no such fitting occurs (though presumably not relative to the case where there are no such systematic errors in the data in the first place!), also cause Rfree to fall by a similar amount?  Systematic errors would surely be correlated between the working and test sets, assuming of course that the test set has been properly selected, particularly if NCS is present.  I'm not clear what you mean by "specific kinds of systematic errors" that would have such divergent effects on Rwork and Rfree.  Could you elucidate?

The most likely systematic error is of course absorption/illuminated volume which can to some extent be 'soaked-up' by the B factors.  Obviously the case of random error in the data is different: in that case there are 2 effects, one driving both Rwork and Rfree up by about the same amount due to poorer agreement between the model and data, and overfitting (i.e. fitting the parameters to the errors) which in the case of Rwork partially compensates and drives Rwork down (or rather doesn't allow Rwork to rise as much as it would have in the absence of overfitting).  Since the random errors in the test data are uncorrelated with those in the working set, overfitting also has the effect of driving Rfree up relative to the case of no overfitting, i.e. whereas in the case of Rwork the effects of poorer agreement and overfitting work in opposite directions, in the case of Rfree they work in the same direction.

Cheers

-- Ian
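
The two effects of random error described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a refinement: every "reflection" has the same true amplitude, the errors are Gaussian, a one-parameter model stands in for the case of no overfitting, and a model with one parameter per bin of five observations (fitted to the working-set points only) stands in for overfitting. The function names, the bin width, the 20% test fraction and the noise level are all hypothetical choices made for illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R = sum|Fo - Fc| / sum|Fo|."""
    num = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return num / sum(abs(o) for o in f_obs)


def simulate(n=20000, f_true=100.0, sigma=10.0, bin_width=5, seed=0):
    """Compare a one-parameter model with an over-parameterised one.

    Toy assumptions: constant true amplitude, Gaussian random error,
    and one test-set observation in every bin of `bin_width` points.
    """
    rng = random.Random(seed)
    f_obs = [f_true + rng.gauss(0.0, sigma) for _ in range(n)]
    is_test = [i % bin_width == 0 for i in range(n)]

    work_obs = [o for o, t in zip(f_obs, is_test) if not t]
    mean_work = sum(work_obs) / len(work_obs)

    # Model A: one global parameter fitted to the working set.
    plain = [mean_work] * n

    # Model B: one parameter per bin, fitted to that bin's working-set
    # points only, so it partly fits their random errors.
    overfit = []
    for start in range(0, n, bin_width):
        idx = range(start, min(start + bin_width, n))
        w = [f_obs[i] for i in idx if not is_test[i]]
        m = sum(w) / len(w) if w else mean_work
        overfit.extend([m] * len(idx))

    def split_r(f_calc):
        work = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if not t]
        test = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if t]
        return {
            "r_work": r_factor(*zip(*work)),
            "r_free": r_factor(*zip(*test)),
        }

    return {"plain": split_r(plain), "overfit": split_r(overfit)}


results = simulate()
for name, r in results.items():
    print(f"{name:8s} Rwork={r['r_work']:.4f}  Rfree={r['r_free']:.4f}")
```

In this toy setting the per-bin model should show a lower Rwork and a higher Rfree than the one-parameter model, because the fitted bin means follow the working-set errors, which are uncorrelated with the test-set errors.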


On 7 September 2015 at 13:07, Kay Diederichs <[log in to unmask]<mailto:[log in to unmask]>> wrote:
There is another main contributor to the Rfree-Rwork gap: systematic errors in your data (random errors raise both Rwork and Rfree). The refined model parameters can "absorb" certain systematic errors to some degree, by jiggling the atom positions and temperature factors a bit. This "model bias" drives Rwork down, and (for specific kinds of systematic errors) leaves Rfree high, thus widening the gap.
I know of no systematic study investigating this, but I've seen it several times.

You could try to process your data more carefully, and/or try a different data processing program.

HTH,

Kay

On Mon, 7 Sep 2015 10:13:12 +0100, Ian Tickle <[log in to unmask]<mailto:[log in to unmask]>> wrote:

>Hi Bing
>
>How complete are your data?  Only ~ 11000 reflexions @ 1.8 Ang. for ~ 2570
>atoms (i.e. 10280 params) doesn't sound very complete to me, which means
>you're relying heavily on the restraints to pull the obs/param ratio up to
>a comfortable value (see B. Rupp book)!  The main factors influencing
>Rfree-Rwork are the o/p ratio (anti-correlated) and Rwork itself
>(positively correlated).  Your Rwork is rather high (0.276) and that alone
>could be enough to explain the difference.  You say you have 'solved' the
>structure.  IMO you haven't solved it until Rwork is below 0.25,
>particularly if o/p ratio is low (see work by Jones & Kleywegt on
>deliberately reversing chain direction & still getting low Rwork!).
>
>Cheers
>
>-- Ian
>
>On 6 September 2015 at 23:07, Wang, Bing <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>
>> Hi guys,
>>
>> I recently got a structure. All other statistics are fine but the R factor
>> and R free have a big gap (around 8%) even at the first run of refmac5.
>> Eventually this gap went up to 10% after I solved the structure.
>>
>> First I don't think it is due to the space group, since this protein has
>> been studied very well. It always crystallizes in P21. We always use
>> molecular replacement to solve the structure with a small new ligand. The
>> only variable is that we soaked a ligand into it. I don't think the
>> soaking can change the space group. Maybe the compound damaged the crystals
>> somehow, but I can't find the problem. Actually the electron density
>> fits the model very well.
>>
>> Then I tried "local NCS", "twin refinement" and "weighting term" in
>> refmac5, following some suggestions posted on the CCP4 bulletin board before.
>> None of them solves this problem very well. Many values of the weighting
>> term have been tried. Lowering the weighting term narrows the gap,
>> but it also weakens the electron density around my ligand, which gives a
>> worse Fo-Fc omit map. I also found that some residues even moved out of the
>> electron density (in other words, they no longer fit the electron density
>> map) under a lower weighting term.
>>
>> Attached are the log files from Phaser-MR and the first run of Refmac5.
>>
>> Any suggestions?
>>
>> Bing Wang
>>
>>
>
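
The observation-to-parameter arithmetic in Ian's first reply (~11000 reflexions against ~2570 atoms) can be reproduced in a few lines. This is a minimal sketch, assuming four refined parameters per atom (x, y, z and one isotropic B factor), which is what the quoted figure of 10280 parameters implies. The function name is hypothetical, and restraints, which raise the effective ratio, are deliberately not counted.

```python
def obs_to_param_ratio(n_reflections, n_atoms, params_per_atom=4):
    """Return (number of parameters, observations per parameter).

    params_per_atom=4 assumes x, y, z and one isotropic B per atom;
    restraints are ignored, so this is the raw ratio only.
    """
    n_params = n_atoms * params_per_atom
    return n_params, n_reflections / n_params


n_params, ratio = obs_to_param_ratio(n_reflections=11000, n_atoms=2570)
print(n_params, round(ratio, 2))  # prints: 10280 1.07
```

A raw ratio this close to 1 is why the reply stresses that the refinement leans heavily on the restraints.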