If I've understood correctly what you're saying, you seem to be implying that if 'over-refining' (whatever that means) at some point produces an increase in Rfree along the trajectory then one should stop refining at that point.  However this is tantamount to using Rfree as a convergence criterion!  The whole point of Rfree surely is that it has absolutely no say in the decision when to stop refinement (otherwise it's no longer 'free').

The target of refinement is the maximum of the likelihood, not any of the R values (clue: it's called 'maximum likelihood refinement', not 'R-value refinement').  So it's actually completely irrelevant whether the R values go up or down during the refinement, since they are not the functions being optimised.  Since the functional dependence of the R values on the parameters is quite different from that of the likelihood (for one thing the conventional R values take no account of the weighting), there's absolutely no reason why the R values should slavishly follow the likelihood in its upward trend.
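To make the point about weighting concrete, here is a minimal sketch of the conventional R value, R = sum|Fobs - Fcalc| / sum(Fobs), evaluated separately on the working and free reflections. The function names and the boolean free_mask are illustrative assumptions, not taken from any refinement program, and the amplitudes are assumed to be already on a common scale. Note that no sigma or weight appears anywhere in the formula.

```python
import numpy as np

def r_factor(f_obs, f_calc):
    """Conventional unweighted R value: sum|Fobs - Fcalc| / sum(Fobs).

    Assumes both arrays hold amplitudes already on a common scale.
    """
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    return float(np.abs(f_obs - f_calc).sum() / f_obs.sum())

def r_work_and_free(f_obs, f_calc, free_mask):
    """Return (Rwork, Rfree); free_mask is True for the held-out reflections."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    free_mask = np.asarray(free_mask, dtype=bool)
    r_work = r_factor(f_obs[~free_mask], f_calc[~free_mask])
    r_free = r_factor(f_obs[free_mask], f_calc[free_mask])
    return r_work, r_free
```

Because the R value is a different function of the parameters from the likelihood target, nothing in this calculation forces it to move in step with the likelihood from one cycle to the next.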

So the only thing that matters is the R values at convergence.  Convergence as well as optimal agreement with the model is achieved when the likelihood is a maximum under the given starting conditions, and if you were to monitor the likelihood you would see it increasing monotonically: the optimiser does not permit a downward step!  So 'over-refinement' by the definition I have understood is impossible! (or maybe I have misunderstood the definition: perhaps someone could define what they mean by 'over-refinement'?).

This of course doesn't mean that if you had started with different conditions (e.g. a different model, parameterisation, weighting scheme, etc.) then you mightn't have obtained better agreement between the model & data at convergence.  If under different starting conditions you obtain a higher Rfree at the maximum of the likelihood then it's likely you have overfitted (not over-refined), i.e. you are fitting the parameters of the model to some degree to random errors in the data.  Overfitting is determined by the starting conditions (mainly the observation / effective parameter ratio), not by the conditions at convergence.  The Rfree value at maximum likelihood convergence is a measure of the overfitting that was already inherent in the starting conditions: if you stop refinement before convergence Rfree will most likely not give you a true measure of the degree of overfitting.
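As a toy illustration of this (not crystallographic code: a polynomial least-squares fit stands in for refinement, and all names and numbers are assumptions chosen for the example), one can fit the same noisy data to convergence under different parameterisations and compare the residual on held-out points. Each fit is fully converged; only the starting conditions, here the number of parameters, differ.

```python
import numpy as np

def fit_to_convergence(x, y, free_mask, degree):
    """Least-squares polynomial fit on the working points only.

    Returns (rms_work, rms_free), the RMS residuals on working and free points.
    """
    work = ~free_mask
    coeffs = np.polyfit(x[work], y[work], degree)
    resid = y - np.polyval(coeffs, x)
    rms_work = float(np.sqrt(np.mean(resid[work] ** 2)))
    rms_free = float(np.sqrt(np.mean(resid[free_mask] ** 2)))
    return rms_work, rms_free

def compare_parameterisations(degrees=(1, 3, 9), n_points=40, noise=0.2, seed=0):
    """Fit one synthetic data set with several polynomial degrees."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points)
    # The underlying "true" model is a straight line plus random error.
    y = 1.0 + 2.0 * x + rng.normal(0.0, noise, n_points)
    free_mask = np.zeros(n_points, dtype=bool)
    free_mask[::4] = True  # hold out every 4th point as the free set
    return {d: fit_to_convergence(x, y, free_mask, d) for d in degrees}

if __name__ == "__main__":
    for degree, (rms_work, rms_free) in compare_parameterisations().items():
        print(f"degree {degree}: work {rms_work:.3f}  free {rms_free:.3f}")
```

The working residual cannot increase as parameters are added, because the models are nested. Whether the free residual rises depends on how far the extra parameters are fitting the noise, which is fixed by the observation-to-parameter ratio before the fit starts, not by how many cycles are run.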

Cheers

-- Ian


On 27 April 2016 at 00:33, James Phillips <[log in to unmask]> wrote:
It is not where the R/Rfree "end up"; it is the trajectory through the refinement process. As you refine and then adjust your model, the two should drop. If R drops while Rfree stays the same or rises, you have over-refined. I suggest giving the journal the results of each stage of your refinement.


On Tuesday, April 26, 2016, Professor James Henderson Naismith <[log in to unmask]> wrote:
Dear Colleagues.
We are having difficulty persuading a reviewer that our structure is not over refined.

The structure of the complex was solved by molecular replacement using a published, relatively non-isomorphous native structure from another lab.

The same Rfree set was used as for the published data.

Our complex is at 2.6A and R/Rfree end up at 18/22

PDB_REDO gets the same result, and so does phenix.refine (to within a trivial %). All B-factors were reset and TLS was used.

The data are 2.61 A and the average B is 80 A^2; there are 4500 residues and 68 waters. Unfortunately MolProbity gives us the 100th centile and the Ramachandran is also good; bond rms is 0.012 A and we used local NCS restraints.

There is no rotational NCS but there is a weak translation symmetry (does not show up in data but when refined the monomers have a bead on a curved string appearance).

The referee has refused to accept the paper until we make the R and Rfree higher by some undefined target, since it is 'over refined'.

Does anyone have a useful program to make structures worse to some threshold that is considered normal at 2.6 A, or does anyone know a good paper that points out that Rfree is not susceptible to over-refinement since by definition it is not refined?

best
Jim




Jim Naismith
St Andrews

The University of St Andrews is a charity registered in Scotland : No SC013532