Hi Jim,
The typical problem with overfitting is that the gap or ratio between R and
R-free becomes ridiculously large. This is clearly not the case with your
numbers. The R values are low, good on ya. Nothing to worry about. PDB_REDO
gives boxplots with your R-free value against all PDB_REDO and PDB entries
of similar resolution. Is yours an outlier?
Some stats at your resolution (2.56-2.66A, 3610 entries, unfiltered):
PDB  R=21.0+-2.4 Free=26.0+-2.6
REDO R=20.6+-2.7 Free=24.7+-3.2
So your R-factor gap is completely normal (no overfitting) and your R-values
are at most about 1.5 sigma below the average (less than 1 sigma against
the PDB_REDO numbers).
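
If you want to put numbers on that for the rebuttal, here is a rough sketch
of the comparison. It assumes the second number in each pair above is a
standard deviation and takes R/Rfree as 18/22 from your message; the names
STATS, z_score and compare are made up for this illustration and are not
part of PDB_REDO or any other package.

```python
# Reference statistics quoted above for 2.56-2.66 A, in percent.
# Each entry is (mean, spread); the spread is ASSUMED to be a standard deviation.
STATS = {
    "PDB":  {"R": (21.0, 2.4), "Rfree": (26.0, 2.6)},
    "REDO": {"R": (20.6, 2.7), "Rfree": (24.7, 3.2)},
}


def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd


def compare(r_work, r_free):
    """Compare one structure's R and Rfree (in percent) with each reference set."""
    report = {}
    for name, ref in STATS.items():
        report[name] = {
            "z_R": z_score(r_work, *ref["R"]),
            "z_Rfree": z_score(r_free, *ref["Rfree"]),
            "gap": r_free - r_work,                      # this structure's Rfree - R
            "mean_gap": ref["Rfree"][0] - ref["R"][0],   # gap between the set's means
        }
    return report


if __name__ == "__main__":
    # 18/22 are the rounded values from the original message.
    for name, row in compare(18.0, 22.0).items():
        print(
            f"{name}: z(R)={row['z_R']:+.2f}  z(Rfree)={row['z_Rfree']:+.2f}  "
            f"gap={row['gap']:.1f}  (gap between means: {row['mean_gap']:.1f})"
        )
```

A negative z-score means better (lower) than the average at that
resolution. The thing to look at for overfitting is the gap, which should
not be much larger than the gap between the reference means.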
Previous discussions on the bulletin board explained that keeping your test
set with non-isomorphous crystals is not needed and doesn't help, but it
doesn't do any harm either. Other signs of overfitting could be a
suspiciously high average B-factor for your non-protein stuff (if you
added waters too enthusiastically), but that usually comes with a high
R-factor gap.
It's hard to take referees seriously if they are guided by some undefined
rule of thumb. But obviously, that doesn't solve your problem. I guess you
have to write a convincing rebuttal explaining why your stats are better
than average, but not suspicious. Good luck staying polite.
Cheers,
Robbie
> -----Original Message-----
> From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of
> Professor James Henderson Naismith
> Sent: Tuesday, April 26, 2016 22:33
> To: [log in to unmask]
> Subject: [ccp4bb] Over refinement
>
> Dear Colleagues.
> We are having difficulty persuading a reviewer that our structure is not
> over refined.
>
> The structure is a molecular replacement of a complex with a published
> relatively non-isomorphous native structure from another lab.
>
> The same Rfree set was used as the published data.
>
> Our complex is at 2.6A and R/Rfree end up at 18/22
>
> PDB redo gets the same result, so does phenix.refine (with a trivial %).
> All B-factors were reset and TLS used.
>
> The data are 2.61A and average B is 80A, there are 4500 residues, 68
> waters. Unfortunately Mol probity gives us 100th centile and the Rama is
> also good, bond rms is 0.012 and we used NCS local restraints.
>
> There is no rotational NCS but there is a weak translation symmetry (does
> not show up in data but when refined the monomers have a bead on a
> curved string appearance).
>
> The referee has refused to accept the paper until we make the R and Rfree
> higher by some undefined target, since it is 'over refined'
>
> Does anyone have a useful program to make structures worse to some
> threshold that is considered normal at 2.6 A or does anyone know a good
> paper that points out Rfree is not susceptible to over refinement since by
> definition it is not refined.
>
> best
> Jim
>
>
>
>
> Jim Naismith
> St Andrews
>
> The University of St Andrews is a charity registered in Scotland : No
> SC013532