CCP4BB@JISCMAIL.AC.UK


Subject: Re: AW: [ccp4bb] Rmergicide Through Programming
From: Peter Keller <[log in to unmask]>
Reply-To: Peter Keller <[log in to unmask]>
Date: Mon, 10 Jul 2017 16:54:50 +0100
Content-Type: text/plain


On 10/07/17 16:42, Phil Evans wrote:
> What is the difference between Rmerge and Rsym - I thought they were the same?
> Rrim == Rmeas I think

The descriptions and formulae for most of these R values as used by the 
PDB can be found in the pdbx exchange dictionary, in the REFLNS_SHELL 
category.

(1) Rrim has:

>  The redundancy-independent merging R factor value Rrim, also denoted Rmeas, for merging all intensities in a given shell. 

(see 
<http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_reflns_shell.pdbx_Rrim_I_all.html>), 
so Phil is right: Rrim and Rmeas are the same.

(2) Rsym has just:

> R sym value in percent.

see 
<http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Items/_reflns_shell.pdbx_Rsym_value.html>. 
To be honest, if that is the official definition it does make me wonder 
what it is doing in the dictionary at all.....

Regards,
Peter.

> 
> Phil
> 
> 
> 
>> On 10 Jul 2017, at 15:18, John Berrisford <[log in to unmask]> wrote:
>>
>> Dear Herman
>>
>> The new PDB deposition system (OneDep) allows you to enter values for Rmerge, Rsym, Rpim, Rrim and / or CC half. If, during deposition, you do not provide a value for any of these metrics then we will ask you for a value for one of them.
>>
>> Also, PDB format is a legacy format for the PDB. In 2014 mmCIF became the archive format for the PDB and some large entries are no longer distributed in PDB format. mmCIF is not limited by the constraints of punch cards.
>>
>> Please see https://www.wwpdb.org/documentation/file-formats-and-the-pdb
>>
>> Regards
>>
>> John
>>
>> PDBe
>>
>>
>>
>> On 10/07/2017 09:26, [log in to unmask] wrote:
>>> Dear All,
>>>
>>> For me this whole discussion is an example of a large number of people barking up the wrong tree. The real issue is not whether data processing programs print an Rmerge amongst many quality indicators, but the fact that the PDB and many journals still insist on using Rmerge as the primary quality indicator. As long as this is true, novice scientists might be led to believe that Rmerge is the most important quality indicator. As soon as the PDB and the journals request some other indicator, this will be over. So that is where we should direct our efforts.
>>>
>>> I don't understand at all why the PDB still insists on an obsolete quality indicator. However, the PDB format for the coordinates also dates back to the 1960s, for use with punch cards.
>>>
>>> My 2 cents.
>>> Herman
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of Edward A. Berry
>>> Sent: Saturday, 8 July 2017 22:31
>>> To: [log in to unmask]
>>> Subject: Re: [ccp4bb] Rmergicide Through Programming
>>>
>>> But R-merge is not really narrower as a fraction of the mean value; it just gets smaller proportionately as all the numbers get smaller:
>>> An RMSD of 0.0043 for R-meas multiplied by a factor of 0.022/0.027 gives 0.0035, which is the RMSD for R-merge. The same was true in the previous example. You could multiply R-meas by 0.5 or 0.2 and get a sharper distribution yet! And that factor would be constant, whereas this only applies for super-low redundancy.
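
The scaling argument above can be checked with a few lines of Python. This is only a sketch: the list of R-meas values is hypothetical, and the scale factor is the 0.022/0.027 ratio quoted in the message. Multiplying every value by a constant shrinks the standard deviation, but leaves the spread relative to the mean unchanged.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

# Hypothetical R-meas values from repeated trials (illustrative numbers only).
rmeas_trials = [0.022, 0.027, 0.031, 0.025, 0.030]
scale = 0.022 / 0.027  # the ratio quoted in the message

# Rescale every value by the same constant.
rescaled = [scale * r for r in rmeas_trials]

# The absolute spread shrinks by the scale factor...
print(statistics.pstdev(rmeas_trials), statistics.pstdev(rescaled))
# ...but the spread as a fraction of the mean is unchanged.
print(coefficient_of_variation(rmeas_trials), coefficient_of_variation(rescaled))
```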
>>>
>>> On 07/08/2017 03:23 PM, James Holton wrote:
>>>> The expected distribution of Rmeas values is still wider than that of Rmerge for data with I/sigma=30 and average multiplicity=2.0. Graph attached.
>>>>
>>>> I expect that anytime you incorporate more than one source of information you run the risk of a noisier statistic because every source of information can contain noise.  That is, Rmeas combines information about multiplicity with the absolute deviates in the data to form a statistic that is more accurate than Rmerge, but also (potentially) less precise.
>>>>
>>>> Perhaps that is what we are debating here?  Which is better? accuracy or precision?  Personally, I prefer to know both.
>>>>
>>>> -James Holton
>>>> MAD Scientist
>>>>
>>>> On 7/8/2017 11:02 AM, Frank von Delft wrote:
>>>>> It is quite easy to end up with low multiplicities in the low resolution shell, especially for low symmetry and fast-decaying crystals.
>>>>>
>>>>> It is this scenario where Rmerge (lowres) is more misleading than Rmeas.
>>>>>
>>>>> phx
>>>>>
>>>>>
>>>>> On 08/07/2017 17:31, James Holton wrote:
>>>>>> What does Rmeas tell us that Rmerge doesn't?  Given that we know the multiplicity?
>>>>>>
>>>>>> -James Holton
>>>>>> MAD Scientist
>>>>>>
>>>>>> On 7/8/2017 9:15 AM, Frank von Delft wrote:
>>>>>>> Anyway, back to reality:  does anybody still use R statistics to evaluate anything other than /strong/ data?  Certainly I never look at it except for the low-resolution bin (or strongest reflections). Specifically, a "2%-dataset" in that bin is probably healthy, while a "9%-dataset" probably Has Issues.
>>>>>>>
>>>>>>> In which case, back to Jacob's question:  what does Rmerge tell us that Rmeas doesn't?
>>>>>>>
>>>>>>> phx
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 08/07/2017 17:02, James Holton wrote:
>>>>>>>> Sorry for the confusion.  I was going for brevity!  And failed.
>>>>>>>>
>>>>>>>> I know that the multiplicity correction is applied on a per-hkl basis in the calculation of Rmeas.  However, the average multiplicity over the whole calculation is most likely not an integer. Some hkls may be observed twice while others only once, or perhaps 3-4 times in the same scaling run.
>>>>>>>>
>>>>>>>> Allow me to do the error propagation properly.  Consider the scenario:
>>>>>>>>
>>>>>>>> Your outer resolution bin has a true I/sigma = 1.00 and average multiplicity of 2.0. Let's say there are 100 hkl indices in this bin.  I choose the "true" intensities of each hkl from an exponential (aka Wilson) distribution. Further assume the background is high, so the error in each observation after background subtraction may be taken from a Gaussian distribution. Let's further choose the per-hkl multiplicity from a Poisson distribution with expectation value 2.0, so 0 is possible, but the long-term average multiplicity is 2.0. For R calculation, when multiplicity of any given hkl is less than 2 it is skipped. What I end up with after 120,000 trials is a distribution of values for each R factor.  See attached graph.
>>>>>>>>
>>>>>>>> What I hope is readily apparent is that the distribution of Rmerge
>>>>>>>> values is taller and sharper than that of the Rmeas values.  The most likely Rmeas is 80% and that of Rmerge is 64.6%.  This is expected, of course.  But what I hope to impress upon you is that the most likely value is not generally the one that you will get! The distribution has a width.  Specifically, Rmeas could be as low as 40%, or as high as 209%, depending on the trial.  Half of the trial results fall between 71.4% and 90.3%, a range of 19 percentage points.  Rmerge has a middle-half range from 57.6% to 72.9% (15.3 percentage points).  This range of possible values of Rmerge or Rmeas from data with the same intrinsic quality is what I mean when I say "numerical instability".  Each and every trial had the same true I/sigma and multiplicity, and yet the R factors I get vary depending on the trial.  Unfortunately for most of us with real data, you only ever get one trial, and you can't predict which Rmeas or Rmerge you'll get.
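
A minimal sketch of a simulation along the lines described above may help readers try this themselves. It is not the script that produced the attached graph. It assumes a mean true intensity of 1 and a Gaussian sigma of 1 (so the bin-wide I/sigma is 1), draws per-hkl multiplicity from a Poisson distribution with mean 2.0, and skips reflections observed fewer than twice. The function names and the default of 2000 trials are illustrative choices.

```python
import math
import random
import statistics

def poisson(rng, lam):
    """Draw one Poisson variate (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def one_trial(rng, n_hkl=100, mean_mult=2.0, sigma=1.0):
    """Return (Rmerge, Rmeas) for one simulated resolution bin."""
    num_merge = num_meas = denom = 0.0
    for _ in range(n_hkl):
        n = poisson(rng, mean_mult)
        if n < 2:
            continue  # a single observation has nothing to be compared against
        true_i = rng.expovariate(1.0)  # exponential (Wilson-like), mean 1
        obs = [rng.gauss(true_i, sigma) for _ in range(n)]
        mean_i = sum(obs) / n
        dev = sum(abs(i - mean_i) for i in obs)
        num_merge += dev
        num_meas += math.sqrt(n / (n - 1)) * dev  # per-hkl multiplicity factor
        denom += sum(obs)
    return num_merge / denom, num_meas / denom

def run(trials=2000, seed=0):
    """Repeat the trial and return two lists: Rmerge values and Rmeas values."""
    rng = random.Random(seed)
    results = [one_trial(rng) for _ in range(trials)]
    return [r[0] for r in results], [r[1] for r in results]

if __name__ == "__main__":
    rmerge, rmeas = run()
    for name, values in (("Rmerge", rmerge), ("Rmeas", rmeas)):
        q1, q2, q3 = statistics.quantiles(values, n=4)
        print(f"{name}: median {q2:.3f}, middle half {q1:.3f} to {q3:.3f}")
```

The quartiles printed at the end correspond to the "middle-half range" quoted in the message. With far fewer trials than 120,000, the extremes of the distribution will not be sampled as well.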
>>>>>>>>
>>>>>>>> My point here is that R statistics in general are not comparable from experiment to experiment when you are looking at data with low average intensity and low multiplicity, and it appears that Rmeas is less stable than Rmerge.  Not by much, mind you, but still jumps around more.
>>>>>>>>
>>>>>>>> Hope that is clearer?
>>>>>>>>
>>>>>>>> Note that in no way am I suggesting that low-multiplicity is the right way to collect data.  Far from it.  Especially with modern detectors that have negligible read-out noise. But when micro crystals only give off a handful of photons each before they die, low multiplicity might be all you have.
>>>>>>>>
>>>>>>>> -James Holton
>>>>>>>> MAD Scientist
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/7/2017 2:33 PM, Edward A. Berry wrote:
>>>>>>>>> I think the confusion here is that the "multiplicity correction"
>>>>>>>>> is applied on each reflection, where it will be an integer 2 or
>>>>>>>>> greater (can't estimate variance with only one measurement). You
>>>>>>>>> can only correct in an approximate way using the average
>>>>>>>>> multiplicity of the dataset, since it would depend on the distribution of multiplicity over the reflections.
>>>>>>>>>
>>>>>>>>> And the correction is for r-merge. You don't need to apply a
>>>>>>>>> correction to R-meas.
>>>>>>>>> R-meas is a redundancy-independent best estimate of the variance.
>>>>>>>>> Whatever you would have used R-merge for (hopefully taking
>>>>>>>>> allowance for the multiplicity) you can use R-meas and not worry about multiplicity.
>>>>>>>>> Again, what information does R-merge provide that R-meas does not
>>>>>>>>> provide in a more accurate way?
>>>>>>>>>
>>>>>>>>> According to the denzo manual, one way to artificially reduce
>>>>>>>>> R-merge is to include reflections with only one measurement
>>>>>>>>> (averaging in a lot of zeros always helps bring an average
>>>>>>>>> down), and they say there were actually some programs that did
>>>>>>>>> that. However I'm quite sure none of the ones we rely on today do that.
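
The effect of including singly-measured reflections can be illustrated with a small sketch. The intensities below are hypothetical, and the `include_singletons` flag is an illustrative name, not an option of any real program. A reflection measured once contributes zero to the numerator but its full intensity to the denominator, so it can only pull R-merge down.

```python
def rmerge(groups, include_singletons=False):
    """R-merge over a dict mapping hkl -> list of observed intensities."""
    numerator = denominator = 0.0
    for observations in groups.values():
        n = len(observations)
        if n < 2 and not include_singletons:
            continue  # normal behaviour: skip reflections measured only once
        mean_i = sum(observations) / n
        numerator += sum(abs(i - mean_i) for i in observations)
        denominator += sum(observations)
    return numerator / denominator

# Hypothetical intensities, for illustration only.
groups = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 44.0],
    (0, 0, 1): [80.0],   # measured once
    (1, 1, 0): [120.0],  # measured once
}

print(rmerge(groups))                           # singletons skipped
print(rmerge(groups, include_singletons=True))  # singletons dilute the ratio
```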
>>>>>>>>>
>>>>>>>>> On 07/07/2017 03:12 PM, Kay Diederichs wrote:
>>>>>>>>>> James,
>>>>>>>>>>
>>>>>>>>>> I cannot follow you. "n approaches 1" can only mean n = 2 because n is integer. And for n=2 the sqrt(n/(n-1)) factor is well-defined. For n=1, neither contributions to Rmeas nor Rmerge nor to any other precision indicator can be calculated anyway, because there's nothing this measurement can be compared against.
>>>>>>>>>>
>>>>>>>>>> just my 2 cents,
>>>>>>>>>>
>>>>>>>>>> Kay
>>>>>>>>>>
>>>>>>>>>> On Fri, 7 Jul 2017 10:57:17 -0700, James Holton <[log in to unmask]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I happen to be one of those people who think Rmerge is a very
>>>>>>>>>>> useful statistic.  Not as a method of evaluating the resolution
>>>>>>>>>>> limit, which is mathematically ridiculous, but for a host of
>>>>>>>>>>> other important things, like evaluating the performance of data
>>>>>>>>>>> collection equipment, and evaluating the isomorphism of different crystals, to name a few.
>>>>>>>>>>>
>>>>>>>>>>> I like Rmerge because it is a simple statistic that has a
>>>>>>>>>>> simple formula and has not undergone any "corrections".
>>>>>>>>>>> Corrections increase complexity, and complexity opens the door
>>>>>>>>>>> to manipulation by the desperate and/or misguided.  For
>>>>>>>>>>> example, overzealous outlier rejection is a common way to abuse
>>>>>>>>>>> R factors, and it is far too often swept under the rug,
>>>>>>>>>>> sometimes without the user even knowing about it. This is
>>>>>>>>>>> especially problematic when working in a regime where the statistic of interest is unstable, and for R factors this is low intensity data.
>>>>>>>>>>> Rejecting just the right "outliers" can make any R factor look
>>>>>>>>>>> a lot better.  Why would Rmeas be any more unstable than
>>>>>>>>>>> Rmerge? Look at the formula. There is an "n-1" in the
>>>>>>>>>>> denominator, where n is the multiplicity.  So, what happens
>>>>>>>>>>> when n approaches 1 ? What happens when n=1? This is not to say
>>>>>>>>>>> Rmerge is better than Rmeas. In fact, I believe the latter is
>>>>>>>>>>> generally superior to the first, unless you are working near n
>>>>>>>>>>> = 1. The sqrt(n/(n-1)) is trying to correct for bias in the R
>>>>>>>>>>> statistic, but fighting one infinity with another infinity is a dangerous game.
>>>>>>>>>>>
>>>>>>>>>>> My point is that neither Rmerge nor Rmeas are easily
>>>>>>>>>>> interpreted without knowing the multiplicity.  If you see Rmeas
>>>>>>>>>>> = 10% and the multiplicity is 10, then you know what that
>>>>>>>>>>> means.  Same for Rmerge, since at n=10 both stats have nearly
>>>>>>>>>>> the same value.  But if you have Rmeas = 45% and multiplicity =
>>>>>>>>>>> 1.05, what does that mean?  Rmeas will be only 33% if the
>>>>>>>>>>> multiplicity is rounded up to 1.1. This is what I mean by
>>>>>>>>>>> "numerical instability", the value of the R statistic itself
>>>>>>>>>>> becomes sensitive to small amounts of noise, and behaves more
>>>>>>>>>>> and more like a random number generator. And if you have Rmeas
>>>>>>>>>>> = 33% and no indication of multiplicity, it is hard to know
>>>>>>>>>>> what is going on.  I personally am a lot more comfortable
>>>>>>>>>>> seeing qualitative agreement between Rmerge and Rmeas, because that means the numerical instability of the multiplicity correction didn't mess anything up.
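
The 45% versus 33% figures above follow from applying the sqrt(n/(n-1)) factor to an average multiplicity, which is the back-of-envelope approximation used in this message. As Kay's reply points out, the factor is really applied per reflection with integer n of 2 or more. A short sketch of the arithmetic, with an illustrative function name:

```python
import math

def multiplicity_factor(n):
    """sqrt(n / (n - 1)), the factor relating Rmeas to Rmerge at multiplicity n."""
    if n <= 1:
        raise ValueError("the factor is undefined for n <= 1")
    return math.sqrt(n / (n - 1))

# Back-calculate the Rmerge implied by Rmeas = 45% at multiplicity 1.05,
# then re-apply the factor at multiplicity 1.1.
implied_rmerge = 0.45 / multiplicity_factor(1.05)
rmeas_at_1p1 = implied_rmerge * multiplicity_factor(1.1)

for n in (1.05, 1.1, 2, 10, 100):
    print(f"n = {n:>6}: factor = {multiplicity_factor(n):.3f}")
print(f"Rmeas at n = 1.1: {rmeas_at_1p1:.3f}")
```

The factor changes steeply just above n = 1 and is already close to 1 at n = 10, which is why the two statistics nearly coincide at high multiplicity.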
>>>>>>>>>>>
>>>>>>>>>>> Of course, when the intensity is weak R statistics in general
>>>>>>>>>>> are not useful.  Both Rmeas and Rmerge have the sum of all
>>>>>>>>>>> intensities in the denominator, so when the bin-wide sum
>>>>>>>>>>> approaches zero you have another infinity to contend with.
>>>>>>>>>>> This one starts to rear its ugly head once I/sigma drops below
>>>>>>>>>>> about 3, and this is why our ancestors always applied a sigma
>>>>>>>>>>> cutoff before computing an R factor. Our small-molecule
>>>>>>>>>>> colleagues still do this!  They call it "R1".  And it is an
>>>>>>>>>>> excellent indicator of the overall relative error.  The
>>>>>>>>>>> relative error in the outermost bin is not meaningful, and strangely enough nobody ever reported the outer-resolution Rmerge before 1995.
>>>>>>>>>>>
>>>>>>>>>>> For weak signals, Correlation Coefficients are better, but for
>>>>>>>>>>> strong signals CC pegs out at >95%, making it harder to see relative errors.
>>>>>>>>>>> I/sigma is what we'd like to know, but the value of "sigma" is
>>>>>>>>>>> still prone to manipulation by not just outlier rejection, but
>>>>>>>>>>> massaging the so-called "error model".  Suffice it to say,
>>>>>>>>>>> crystallographic data contain more than one type of error.
>>>>>>>>>>> Some sources are important for weak spots, others are important
>>>>>>>>>>> for strong spots, and still others are only apparent in the
>>>>>>>>>>> mid-range.  Some sources of error are only important at low
>>>>>>>>>>> multiplicity, and others only manifest at high multiplicity.
>>>>>>>>>>> There is no single number that can be used to evaluate all aspects of data quality.
>>>>>>>>>>>
>>>>>>>>>>> So, I remain a champion of reporting Rmerge.  Not in the
>>>>>>>>>>> high-angle bin, because that is essentially a random number,
>>>>>>>>>>> but overall Rmerge and low-angle-bin Rmerge next to
>>>>>>>>>>> multiplicity, Rmeas, CC1/2 and other statistics is the only way
>>>>>>>>>>> you can glean enough information about where the errors are
>>>>>>>>>>> coming from in the data.  Rmeas is a useful addition because it
>>>>>>>>>>> helps us correct for multiplicity without having to do math in
>>>>>>>>>>> our head.  Users generally thank you for that. Rmerge, however,
>>>>>>>>>>> has served us well for more than half a century, and I believe
>>>>>>>>>>> Uli Arndt knew what he was doing.  I hope we all know enough
>>>>>>>>>>> about history to realize that future generations seldom thank their ancestors for "protecting" them from information.
>>>>>>>>>>>
>>>>>>>>>>> -James Holton
>>>>>>>>>>> MAD Scientist
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/5/2017 10:36 AM, Graeme Winter wrote:
>>>>>>>>>>>> Frank,
>>>>>>>>>>>>
>>>>>>>>>>>> you are asking me to remove features that I like, so I would feel that the challenge is for you to prove that this is harmful however:
>>>>>>>>>>>>
>>>>>>>>>>>>     - at the minimum, I find it a useful check sum that the stats are internally consistent (though I interpret it for lots of other reasons too)
>>>>>>>>>>>>     - it is faulty I agree, but (with caveats) still useful
>>>>>>>>>>>> IMHO
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for being terse, but I remain to be convinced that
>>>>>>>>>>>> removing it increases the amount of information
>>>>>>>>>>>>
>>>>>>>>>>>> CC’ing BB as requested
>>>>>>>>>>>>
>>>>>>>>>>>> Best wishes Graeme
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> On 5 Jul 2017, at 17:17, Frank von Delft <[log in to unmask]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> You keep not answering the challenge.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's really simple:  what information does Rmerge provide that Rmeas doesn't.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (If you answer, email to the BB.)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 05/07/2017 16:04, [log in to unmask] wrote:
>>>>>>>>>>>>>> Dear Frank,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You are forcefully arguing essentially that others are wrong if we feel an existing statistic continues to be useful, and instead insist that it be outlawed so that we may not make use of it, just in case someone misinterprets it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Very well
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do however express disquiet that we as software developers feel browbeaten to remove the output we find useful because “the community” feel that it is obsolete.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I feel that Jacob’s short story on this thread illustrates that educating the next generation of crystallographers to understand what all of the numbers mean is critical, and that a numerological approach of trying to optimise any one statistic is essentially doomed. Precisely the same argument could be made for people cutting the “resolution” at the wrong place in order to improve the average I/sig(I) of the data set.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Denying access to information is not a solution to misinterpretation, from where I am sat, however I acknowledge that other points of view exist.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best wishes Graeme
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 5 Jul 2017, at 12:11, Frank von Delft <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Graeme, Andrew
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jacob is not arguing against an R-based statistic;  he's pointing out that leaving out the multiplicity-weighting is prehistoric (Diederichs & Karplus published it 20 years ago!).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So indeed:   Rmerge, Rpim and I/sigI give different information.  As you say.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But no:   Rmerge and Rmeas and Rcryst do NOT give different information.  Except:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      * Rmerge is a (potentially) misleading version of Rmeas.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      * Rcryst and Rmerge and Rsym are terms that no longer have significance in the single cryo-dataset world.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> phx.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 05/07/2017 09:43, Andrew Leslie wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to support Graeme in his wish to retain Rmerge in Table 1, essentially for exactly the same reasons.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also strongly support Francis Reyes's comment about the usefulness of Rmerge at low resolution, and I would add to his list that it can also, in some circumstances, be more indicative of the wrong choice of symmetry (too high) than the statistics that come from POINTLESS (excellent though that program is!).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>>> On 5 Jul 2017, at 05:44, Graeme Winter <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jacob
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, I got this - and I appreciate the benefit of Rmeas for dealing with measuring agreement for small-multiplicity observations. Having this *as well* is very useful and I agree Rmeas / Rpim / CC-half should be the primary “quality” statistics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, you asked if there is any reason to *keep* rather
>>>>>>>>>>>>>> than *eliminate* Rmerge, and I offered one :o)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do not see what harm there is reporting Rmerge, even if it is just used in the inner shell or just used to capture a flavour of the data set overall. I also appreciate that Rmeas converges to the same value for large multiplicity i.e.:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                          Overall  InnerShell  OuterShell
>>>>>>>>>>>>>> Low resolution limit                       39.02       39.02        1.39
>>>>>>>>>>>>>> High resolution limit                       1.35        6.04        1.35
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rmerge (within I+/I-)                      0.080       0.057       2.871
>>>>>>>>>>>>>> Rmerge (all I+ and I-)                     0.081       0.059       2.922
>>>>>>>>>>>>>> Rmeas (within I+/I-)                       0.081       0.058       2.940
>>>>>>>>>>>>>> Rmeas (all I+ & I-)                        0.082       0.059       2.958
>>>>>>>>>>>>>> Rpim (within I+/I-)                        0.013       0.009       0.628
>>>>>>>>>>>>>> Rpim (all I+ & I-)                         0.009       0.007       0.453
>>>>>>>>>>>>>> Rmerge in top intensity bin                0.050           -           -
>>>>>>>>>>>>>> Total number of observations             1265512       16212       53490
>>>>>>>>>>>>>> Total number unique                        17515         224        1280
>>>>>>>>>>>>>> Mean((I)/sd(I))                             29.7       104.3         1.5
>>>>>>>>>>>>>> Mn(I) half-set correlation CC(1/2)         1.000       1.000       0.778
>>>>>>>>>>>>>> Completeness                               100.0        99.7       100.0
>>>>>>>>>>>>>> Multiplicity                                72.3        72.4        41.8
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Anomalous completeness                     100.0       100.0       100.0
>>>>>>>>>>>>>> Anomalous multiplicity                      37.2        42.7        21.0
>>>>>>>>>>>>>> DelAnom correlation between half-sets      0.497       0.766      -0.026
>>>>>>>>>>>>>> Mid-Slope of Anom Normal Probability       1.039           -           -
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (this is a good case for Rpim & CC-half as resolution limit
>>>>>>>>>>>>>> criteria)
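
The convergence of Rmeas towards Rmerge at high multiplicity can be checked against the numbers in the table. This is a rough sketch: it uses the "all I+ and I-" rows, and it applies the sqrt(n/(n-1)) factor to each shell's average multiplicity, which only approximates the per-reflection calculation a merging program performs. The function name is illustrative.

```python
import math

# (Rmerge, Rmeas, multiplicity) for "all I+ and I-", copied from the table.
shells = {
    "Overall":    (0.081, 0.082, 72.3),
    "InnerShell": (0.059, 0.059, 72.4),
    "OuterShell": (2.922, 2.958, 41.8),
}

def approx_rmeas(rmerge, multiplicity):
    """Estimate Rmeas from Rmerge using the shell's average multiplicity."""
    return rmerge * math.sqrt(multiplicity / (multiplicity - 1))

for name, (rmerge, rmeas, mult) in shells.items():
    estimate = approx_rmeas(rmerge, mult)
    print(f"{name}: reported {rmeas:.3f}, estimated {estimate:.3f}")
```

Agreement to within rounding is the kind of internal consistency check that a reader can make only when both statistics and the multiplicity are reported.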
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If the statistics you want to use are there & some others
>>>>>>>>>>>>>> also, what is the pressure to remove them? Surely we want to
>>>>>>>>>>>>>> educate on how best to interpret the entire table above to
>>>>>>>>>>>>>> get a fuller picture of the overall quality of the data? My
>>>>>>>>>>>>>> 0th-order request would be to publish the three shells as
>>>>>>>>>>>>>> above ;o)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers Graeme
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4 Jul 2017, at 22:09, Keller, Jacob <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I suggested replacing Rmerge/sym/cryst with Rmeas, not Rpim. Rmeas is simply (Rmerge * sqrt(n/(n-1))) where n is the number of measurements of that reflection. It's merely a way of correcting for the multiplicity-related artifact of Rmerge, which is becoming even more of a problem with data sets of increasing variability in multiplicity. Consider the case of comparing a data set with a multiplicity of 2 versus one of 100: equivalent data quality would yield Rmerges diverging by a factor of ~1.4. But this has all been covered before in several papers. It can be and is reported in resolution bins, so it can be used exactly as you say. So, why not "disappear" Rmerge from the software?
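
For readers who want the formulas spelled out, here is a small sketch that computes Rmerge, Rmeas and Rpim from grouped observations, with the sqrt(n/(n-1)) and sqrt(1/(n-1)) factors applied per reflection. The intensities are hypothetical and the function name is illustrative. The last line evaluates the "factor of ~1.4" quoted above for multiplicity 2 versus 100.

```python
import math

def merging_r_factors(groups):
    """Return (Rmerge, Rmeas, Rpim) for a dict mapping hkl -> intensities."""
    merge = meas = pim = denominator = 0.0
    for observations in groups.values():
        n = len(observations)
        if n < 2:
            continue  # a single measurement has nothing to agree with
        mean_i = sum(observations) / n
        deviation = sum(abs(i - mean_i) for i in observations)
        merge += deviation
        meas += math.sqrt(n / (n - 1)) * deviation
        pim += math.sqrt(1 / (n - 1)) * deviation
        denominator += sum(observations)
    return merge / denominator, meas / denominator, pim / denominator

# Hypothetical observations, for illustration only.
groups = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 44.0, 47.0]}
print(merging_r_factors(groups))

# Ratio of the multiplicity factors at n = 2 and n = 100: for equal data
# quality (equal Rmeas), the two Rmerge values differ by about this much.
print(math.sqrt(2 / 1) / math.sqrt(100 / 99))
```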
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The only reason I could come up with for keeping it is historical reasons or comparisons to previous datasets, but anyway those comparisons would be confounded by variabilities in multiplicity and a hundred other things, so come on, developers, just comment it out!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JPK
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: [log in to unmask] [mailto:[log in to unmask]]
>>>>>>>>>>>>>> Sent: Tuesday, July 04, 2017 4:37 PM
>>>>>>>>>>>>>> To: Keller, Jacob <[log in to unmask]>
>>>>>>>>>>>>>> Cc: [log in to unmask]
>>>>>>>>>>>>>> Subject: Re: [ccp4bb] Rmergicide Through Programming
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Jacob
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unbiased estimate of the true unmerged I/sig(I) of your data
>>>>>>>>>>>>>> (I find this particularly useful at low resolution) i.e. if
>>>>>>>>>>>>>> your inner shell Rmerge is 10% your data agree very poorly;
>>>>>>>>>>>>>> if 2% says your data agree very well provided you have
>>>>>>>>>>>>>> sensible multiplicity… obviously depends on sensible
>>>>>>>>>>>>>> interpretation. Rpim hides this (though tells you more about
>>>>>>>>>>>>>> the quality of average measurement)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Essentially, for I/sig(I) you can (by and large) adjust your sig(I) values however you like if you were so inclined. You can only adjust Rmerge by excluding measurements.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would therefore defend that - amongst the other stats you
>>>>>>>>>>>>>> enumerate below - it still has a place
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers Graeme
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4 Jul 2017, at 14:10, Keller, Jacob <[log in to unmask]<mailto:[log in to unmask]>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rmerge does contain information which complements the others.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What information? I was trying to think of a counterargument to what I proposed, but could not think of a reason in the world to keep reporting it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> JPK
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 4 Jul 2017, at 12:00, Keller, Jacob <[log in to unmask]<mailto:[log in to unmask]><mailto:[log in to unmask]>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear Crystallographers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Having been repeatedly chagrined about the continued use and reporting of Rmerge rather than Rmeas or similar, I thought of a potential way to promote the change: what if merging programs would completely omit Rmerge/cryst/sym? Is there some reason to continue to report these stats, or are they just grandfathered into the software? I doubt that any journal or crystallographer would insist on reporting Rmerge per se. So, I wonder what developers would think about commenting out a few lines of their code, seeing what happens? Maybe a comment to the effect of "Rmerge is now deprecated; use Rmeas" would be useful as well. Would something catastrophic happen?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> All the best,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jacob Keller
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> *******************************************
>>>>>>>>>>>>>> Jacob Pearson Keller, PhD
>>>>>>>>>>>>>> Research Scientist
>>>>>>>>>>>>>> HHMI Janelia Research Campus / Looger lab
>>>>>>>>>>>>>> Phone: (571)209-4000 x3159
>>>>>>>>>>>>>> Email: [log in to unmask]
>>>>>>>>>>>>>> *******************************************
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>
>> -- 
>> John Berrisford
>> PDBe
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>> Wellcome Trust Genome Campus
>> Hinxton
>> Cambridge CB10 1SD UK
>> Tel: +44 1223 492529
>>
>> http://www.pdbe.org
>> http://www.facebook.com/proteindatabank
>> http://twitter.com/PDBeurope

-- 
Peter Keller                             Tel.: +44 (0)1223 353033
Global Phasing Ltd.,                     Fax.: +44 (0)1223 366889
Sheraton House,
Castle Park,
Cambridge CB3 0AX
United Kingdom
