I would just add that another important factor is the number of reflections.
Most of the structures we work on have small cell dimensions and high
symmetry space groups. Even at 2.0 Angstroms, there might be only 3000
or so reflections in a dataset. Rfree doesn't always mean much in these
cases, and the R/Rfree differences can be quite large. We started doing
k-fold cross validation, and have more recently started trying out Tim's
implementation of Rcomplete.
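
For anyone who has not tried it, the k-fold idea can be sketched roughly as below. This is only an illustration, not our actual scripts and not Tim's Rcomplete code: the function names (kfold_test_sets, r_factor) are made up for this example, the refinement step itself is left out, and Fcalc is assumed to be already scaled to Fobs. The point is that every reflection lands in exactly one test set, so the free R is eventually estimated from all 3000 reflections rather than from a single set of a few hundred.

```python
import random


def kfold_test_sets(n_reflections, k, seed=0):
    """Split reflection indices 0..n-1 into k disjoint test sets."""
    indices = list(range(n_reflections))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives folds whose sizes
    # differ by at most one reflection.
    return [sorted(indices[i::k]) for i in range(k)]


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), with Fcalc pre-scaled."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# A small dataset of about 3000 reflections, split ten ways.
folds = kfold_test_sets(3000, 10)
for number, test_set in enumerate(folds, start=1):
    # In a real run: refine against every reflection NOT in test_set,
    # then evaluate r_factor on the reflections in test_set only.
    print(f"fold {number}: {len(test_set)} test reflections")
```

Each of the k refinements has to start from a model that has not seen that fold's test reflections, otherwise the folds are not independent.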
--paul
On 10/22/2015 06:53 PM, William G. Scott wrote:
> This will work better if your question (below) has its own thread title.
>
> Briefly, Rfree was invented to detect over-fitting/over-refinement of the data. The absolute value of the Rfree is less significant, and can be dependent upon such things as resolution, the presence of noncrystallographic symmetry, and whether the test set has somehow been contaminated with model bias.
>
> In other words, you should worry about over-fitting if your Rwork goes down and Rfree does not. Having said this, a gap of less than roughly 5% would make me suspicious.
>
>
>
>> On Oct 22, 2015, at 2:31 PM, NISHANT SINGH wrote:
>>
>> Hi Everyone,
>>
>> So I am working with this complex of a TCR bound to pMHC that was solved at 2.5 Å. Upon refinement, using Phenix, my stats are
>> R work: 0.1697;
>> R free: 0.2361.
>> Bond RMSD: 0.006,
>> Angles RMSD: 0.963.
>> My clashscore is 3.75,
>> Rama outlier is 0.4
>> and Roto Outlier is 0.4.
>>
>> I am a little worried about overfitting, since the difference between Rfree and Rwork is more than 5%. Should I be worried and if so, is there a way to correct it?
>>
>> Sincerely,
>>
>> Nishant