Another issue with these statistics is that the PDB insists on a single value of "resolution" no matter how anisotropic the data are. Especially in the outermost bins, Rmerge can be ridiculously high simply because real signal exists in only one of the three directions.
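To make that concrete, here is a toy sketch in Python. It assumes the usual definition, Rmerge = sum over hkl and i of |I_i - <I>| divided by the sum over hkl and i of I_i. The intensities are invented for illustration, and the names r_merge, strong and weak are just placeholders, not anything from a real data-processing package.

```python
def r_merge(observations):
    """Rmerge over a dict mapping hkl -> list of repeated intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for measurements in observations.values():
        mean_intensity = sum(measurements) / len(measurements)
        # Sum of absolute deviations from the mean for this reflection
        numerator += sum(abs(i - mean_intensity) for i in measurements)
        # Sum of all measured intensities for this reflection
        denominator += sum(measurements)
    return numerator / denominator


# Invented outer-shell data: one direction still diffracts, two are pure noise.
strong = {(0, 0, 20): [100.0, 110.0, 90.0]}
weak = {(20, 0, 0): [5.0, -5.0, 3.0], (0, 20, 0): [-4.0, 6.0, 1.0]}

print("strong direction only:", round(r_merge(strong), 3))
print("weak directions only: ", round(r_merge(weak), 3))
print("whole shell:          ", round(r_merge({**strong, **weak}), 3))
```

With these made-up numbers the single well-diffracting direction looks fine on its own, but the two noise-only directions add a lot to the numerator and almost nothing to the denominator, so the value quoted for the whole shell roughly doubles.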
Phoebe
=====================================
Phoebe A. Rice
Dept. of Biochemistry & Molecular Biology
The University of Chicago
phone 773 834 1723
http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
http://www.rsc.org/shop/books/2008/9780854042722.asp
---- Original message ----
>Date: Tue, 26 Oct 2010 09:46:46 -0700
>From: CCP4 bulletin board <[log in to unmask]> (on behalf of "Bernhard Rupp (Hofkristallrat a.D.)" <[log in to unmask]>)
>Subject: [ccp4bb] Against Method (R)
>To: [log in to unmask]
>
>Hi Folks,
>
>Please allow me a few biased reflections/opinions on the numeRology of the
>R-value (not R-factor, because it is neither a factor itself nor does it
>factor in anything but ill-posed reviewer's critique. Historically the term
>originated from small molecule crystallography, but it is only a
>'Residual-value')
>
>a) The R-value itself - based on the linear residuals and of apparent
>intuitive meaning - is statistically peculiar to say the least. I could not
>find it in any common statistics text. So doing proper statistics with R
>becomes difficult.
>
>b) rules of thumb (as much as they conveniently obviate the need for
>detailed explanations, satisfy student's desire for quick answers, and
>allow superficial review of manuscripts) become less valuable if they have a
>case-dependent large variance, topped with an unknown parent distribution.
>Combined with an odd statistic, that has great potential for misguidance and
>unnecessarily lost sleep.
>
>c) Ian has (once again) explained that for example the Rf-R depends on the
>exact knowledge of the restraints and their individual weighting, which we
>generally do not have. Caution is advised.
>
>d) The answer to which model is better - which is actually what you want to
>know - becomes a question of model selection or hypothesis testing, which,
>given the obscurity of R, cannot be derived with some nice plug-in method. As
>Ian said, the models to be compared must also be based on the same,
>identical data.
>
>e) One measure available that is statistically at least defensible is the
>log-likelihood. So what you can do is form a log-likelihood ratio (or Bayes
>factor (there is the darn factor again, it’s a ratio)) and see where this
>falls - and the answers are pretty soft and, probably because of that,
>correspondingly realistic. This also makes - based on statistics alone -
>deciding between different overall parameterizations difficult.
>
>http://en.wikipedia.org/wiki/Bayes_factor
>
>f) so having said that, what really remains is that the model that fits the
>primary evidence (minimally biased electron density) best and is at the same
>time physically meaningful, is the best model, i.e., all plausibly
>accountable electron density (and not more) is modeled. You can convince
>yourself of this by taking the most interesting part of the model out (say a
>ligand or a binding pocket) and looking at the R-values or doing a model
>selection test - the result will be indecisive. Poof goes the global rule of thumb.
>
>g) in other words: global measures in general are entirely inadequate to
>judge local model quality (noted many times over already by Jones, Kleywegt,
>others, in the dark ages of crystallography when poorly restrained
>crystallographers used to passionately whack each other over the head with
>unfree R-values).
>
>Best, BR
>-----------------------------------------------------------------
>Bernhard Rupp, Hofkristallrat a.D.
>001 (925) 209-7429
>+43 (676) 571-0536
>[log in to unmask]
>[log in to unmask]
>http://www.ruppweb.org/
>------------------------------------------------------------------
>Und wieder ein chillout-mix aus der Hofkristall-lounge
>------------------------------------------------------------------