On Fri, Oct 14, 2011 at 12:52 PM, Ed Pozharski <[log in to unmask]> wrote:
The second question is practical.  Let's say I want to deposit the
results of the refinement against the full dataset as my final model.
Should I not report the Rfree and instead insert a remark explaining the
situation?  If I report the Rfree prior to the test set removal, it is
certain that every validation tool will report a mismatch.  It does not
seem that the PDB has a mechanism to deal with this.

You should enter the statistics for the model and data that you actually deposit, not statistics for some other model that you might have had at one point but which the PDB will never see.  Refining against the R-free (test-set) reflections not only makes it impossible to verify and validate your structure; it also means that any time you or anyone else uses your structure as an MR search model for an isomorphous structure, or continues the refinement with higher-resolution data, the starting model has already been refined against all reflections.  Any future refinement of that model against isomorphous data is therefore pre-biased: the new test set no longer gives an independent R-free, which makes your model potentially useless as a starting point.

I'm amazed that anyone is still depositing structures refined against all data, but the PDB does still get a few.  In every paper I've seen that reports such a procedure, the benefit of including that extra 5% of the data is minimal, and it is far outweighed by having a reliable and relatively unbiased validation statistic preserved in the final deposition.  (The situation may be different for very low resolution data, but those structures are a tiny fraction of the PDB.)

-Nat