>> Yes, but what I think Frank is trying to point out is that the difference
>> between Fobs and Fcalc in any given PDB entry is generally about 4-5 times
>> larger than sigma(Fobs). In such situations, pretty much any standard
>> statistical test will tell you that the model is "highly unlikely to be
>> correct".
> But that's not the question we are normally asking.
> It is highly unlikely that any model in biology is correct, if by "correct"
> you mean "cannot be improved". Normally we ask the more modest question
> "have I improved my model today over what it was yesterday?".
>
>> I am not saying that everything in the PDB is "wrong", just that the
>> dominant source of error is a shortcoming of the models we use. Whatever
>> this "source of error" is, it vastly overpowers the measurement error. That
>> is, errors do not add linearly, but rather as squares, and
>> (20%)^2 + (5%)^2 ~ (20%)^2.
>>
>> So, since the experimental error is only a minor contribution to the total
>> error, it is arguably inappropriate to use it as a weight for each hkl.
> I think your logic has run off the track. The experimental error is an
> appropriate weight for the Fobs(hkl) because that is indeed the error
> for that observation. This is true independent of errors in the model.
> If you improve the model, that does not magically change the accuracy
> of the data.
Sorry, still missing something:
In the weighted R-factor, we're weighting by 1/sig**2 (right?). And
the reason for that is, presumably, that when we add a term (Fo-Fc)**2
but the Fo is crap (huge sigma), we need to ensure we don't add very
much of it -- so we divide the term by the huge sigma squared.
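
To make that weighting concrete, here is a minimal sketch. It assumes
one common normalized form, Rw = sqrt(sum w (Fo-Fc)^2 / sum w Fo^2)
with w = 1/sig(Fo)^2; nobody in this thread has spelled out the exact
formula, so the form and the function names are my assumptions.

```python
import math

def weighted_r(f_obs, f_calc, sig_f_obs):
    """Weighted R with w = 1/sigma^2 (assumed normalized form, see above)."""
    num = 0.0
    den = 0.0
    for fo, fc, sig in zip(f_obs, f_calc, sig_f_obs):
        w = 1.0 / sig**2          # large sigma -> small weight
        num += w * (fo - fc) ** 2
        den += w * fo**2
    return math.sqrt(num / den)

def unweighted_r(f_obs, f_calc):
    """Conventional R = sum|Fo - Fc| / sum|Fo|, for comparison."""
    num = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den
```

With two made-up reflections, f_obs=[100, 100], f_calc=[100, 50] and
sig=[1, 100], the badly fitting second term carries a weight 10^4 times
smaller than the first, so it barely registers in the weighted value
while it dominates the unweighted one.
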
But what if Fc also is crap? Which it patently is: it's not even
within 20% of Fo, never mind vaguely within sig(Fo). Why should we not
be down-weighting those terms as well?
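
For scale, the quadrature arithmetic quoted further up (20% model
error, 5% measurement error) can be checked in a few lines. The two
percentages are the round figures from the quoted message, not
measured values, and the helper name is just for illustration.

```python
import math

def combine_in_quadrature(*errors):
    """Combine independent error terms as the root of the sum of squares."""
    return math.sqrt(sum(e**2 for e in errors))

model_err = 0.20   # "20%" from the quoted message
meas_err = 0.05    # "5%" from the quoted message

total = combine_in_quadrature(model_err, meas_err)
print(f"{total:.4f}")  # prints 0.2062
```

So the combined figure is about 20.6%, barely different from the 20%
term on its own.
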
Or can we ignore that because, since all terms are crap, we'd simply be
down-weighting the entire Rw by a lot, and we'd be doing it for the Rw
of both models we're comparing, so they'd cancel out when we take the
ratio Rw1/Rw2?
But if we're so happy to fudge away the huge gorilla in the room, why
would we need to be religious about the little gnats on the floor (the
sig(Fo))? Is there then really a difference between R1/R2 and Rw1/Rw2,
for all practical purposes?
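
Here is a toy sketch of how one could check this with numbers. It
assumes the normalized form Rw = sqrt(sum w (Fo-Fc)^2 / sum w Fo^2);
all values below are made up and stand in for no real data set. Under
that assumed form, multiplying every weight by the same constant leaves
each Rw unchanged, so a uniform down-weighting cancels even before
taking the ratio. A non-uniform extra term in the weights would not
cancel this way.

```python
import math

def r_factor(f_obs, f_calc):
    """Unweighted R = sum|Fo - Fc| / sum|Fo|."""
    num = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return num / sum(abs(fo) for fo in f_obs)

def weighted_r(f_obs, f_calc, weights):
    """Weighted R in the assumed normalized form."""
    num = sum(w * (fo - fc) ** 2 for fo, fc, w in zip(f_obs, f_calc, weights))
    den = sum(w * fo**2 for fo, w in zip(f_obs, weights))
    return math.sqrt(num / den)

# Made-up toy reflections and two hypothetical models.
f_obs = [100.0, 50.0, 20.0, 10.0]
sig = [2.0, 2.0, 3.0, 5.0]
f_calc_1 = [80.0, 60.0, 25.0, 4.0]
f_calc_2 = [85.0, 58.0, 24.0, 3.0]

w = [1.0 / s**2 for s in sig]          # w = 1/sig(Fo)^2
w_scaled = [0.01 * wi for wi in w]     # same weights, uniformly down-weighted

rw1 = weighted_r(f_obs, f_calc_1, w)
rw2 = weighted_r(f_obs, f_calc_2, w)
rw1_scaled = weighted_r(f_obs, f_calc_1, w_scaled)
rw2_scaled = weighted_r(f_obs, f_calc_2, w_scaled)

print("R1/R2   =", r_factor(f_obs, f_calc_1) / r_factor(f_obs, f_calc_2))
print("Rw1/Rw2 =", rw1 / rw2)
print("Rw1/Rw2 (weights x 0.01) =", rw1_scaled / rw2_scaled)
```
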
(Of course, this is all for the ongoing case where we don't know how to
model the R-factor gap. And no, I haven't played with actual numbers...)
phx.