How does the total number of unique reflections compare between the two diffraction experiments (2.6 and 1.8 A)? I would guess that the added number of unique observations (N) in the later high-resolution experiment contributed to the better electron density maps. Needless to say, both the structure factor and the electron density expressions involve a summation over all reflections in the diffraction experiment.

Ashok
CSIR-CDRI Lucknow, India

On Sat, Nov 28, 2015 at 10:12 AM, James Phillips <[log in to unmask]> wrote:

> You cannot go wrong by adding more data, especially if each observation is weighted by its sigma in the analysis, as many other commenters have said.
>
> For crystallography, the higher-resolution reflections are higher-frequency components of the Fourier transform and therefore sharpen the picture (the electron density) even if they are down-weighted.
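To put rough numbers on both of the points above: the count of unique reflections grows roughly with the volume of the resolution sphere, i.e. as (1/d_min)^3, and it is the extra high-frequency terms that sharpen the synthesis. A minimal Python/numpy sketch with a toy 1-D "density" (illustrative numbers only, not real diffraction data; the truncation cuts of 20 and 60 terms are arbitrary):

import numpy as np

# Unique reflections scale roughly with the volume of the resolution
# sphere, i.e. with (1/d_min)^3, so going from 2.6 A to 1.8 A should
# give roughly (2.6/1.8)^3 ~ 3x as many unique reflections.
print("expected reflection ratio ~ %.1f" % (2.6 / 1.8) ** 3)

# Toy 1-D "density": two nearby Gaussian peaks.
x = np.linspace(0.0, 1.0, 1024, endpoint=False)
rho = np.exp(-((x - 0.48) / 0.01) ** 2) + np.exp(-((x - 0.52) / 0.01) ** 2)
F = np.fft.rfft(rho)  # complex "structure factors" of the toy density

def synthesis(n_terms):
    """Fourier synthesis keeping only the first n_terms frequencies."""
    F_cut = F.copy()
    F_cut[n_terms:] = 0.0  # truncation = missing terms entered as zero
    return np.fft.irfft(F_cut, n=x.size)

low = synthesis(20)   # low-resolution-like cut
high = synthesis(60)  # keeps more high-frequency terms

# With few terms the two peaks merge into one blob; with more terms a
# clear dip appears between them.
mid = x.size // 2
print("density between peaks, low cut : %.2f of maximum" % (low[mid] / low.max()))
print("density between peaks, high cut: %.2f of maximum" % (high[mid] / high.max()))

In the toy synthesis the two peaks only separate once the higher-frequency terms are included, however they are weighted.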
> On Sat, Nov 28, 2015 at 10:55 AM, Gerard Bricogne <[log in to unmask]> wrote:

>> Dear Phil,
>>
>> I think you are getting close to the central question, but I am not sure that I agree entirely with your way of formulating it. That formulation is in line with the "new paradigm" whereby you can claim as high a resolution as you wish provided (1) some numbers associated with the implied range of (h,k,l) have been produced by processing a set of images and show a minimal degree of internal consistency, and (2) feeding those numbers into a refinement program doesn't worsen your refinement statistics compared with a run of the same refinement program against a more restricted set of numbers defined by a lower resolution limit.
>>
>> It has worried me from the beginning that this could lead to some sort of "quantitative easing" of our fundamental common currency of "resolution as a guarantor of structure quality". However shaky that currency may have been before, this new definition seems to leave perhaps even more room for "creative accounting" than the previous one.
>>
>> Why should we worry about the risks of inflation in the new minting of that currency? An obvious reason is that the various "quality percentiles" indicated for PDB entries at deposition time are based on "other structures at similar resolution", so that any redefinition of that criterion will cause inflation and loss of discriminating power as a quality indicator. Justified or not, that indicator is used widely in various forms of data mining, and it is no minor matter to let it float.
>>
>> There are more subtle and less "bean-counting" arguments involved, though. If I recall Keith Wilson's famous jibe at people claiming to collect data to a resolution at which they were only "collecting indices", it applies directly here in the form of asking whether you are really feeding more data into your refinement, or only more indices. At first sight these extra indices should be fairly innocuous (Ian made the point that refinement methods have become relatively robust to the associated "data" when they are bogus), but there can be side-effects that don't immediately come to mind. For example, as the range of these indices extends further, the Fourier calculations will be done on finer grids in real space. The usual maps will look nicer, but that wouldn't affect the refinement statistics.
>>
>> What could affect the latter, however, is that a more finely sampled log-likelihood gradient map would lead to a more accurate calculation of partial derivatives by the Agarwal-Lifchitz method for applying the chain rule in real space, and would therefore provide the optimiser with good gradient information for longer along the refinement path than a coarser sampling would. What effect that would have depends on many factors (which optimiser is used, for how many cycles it runs, what the convergence/stopping criteria are, ...). Such numerical side-effects of providing more indices rather than more data have not, to my knowledge, been systematically investigated so as to produce a "baseline" of refinement improvement that should be subtracted from whatever other effects one wants to attribute to the actual purported data associated with those extra indices. Until this is done, we run the risk of thinking that we are producing a higher-resolution structure when all we have done is remediate the ill effects of an insufficient sampling rate in the Agarwal-Lifchitz method at a lower effective data resolution.
>>
>> I will try to conclude this long message with as short a sentence as the one you proposed, Phil. Perhaps the most relevant question about the true operational definition of resolution is: what is the resolution such that cutting the data back further starts to degrade your model? Put another way, it is the resolution of the *necessary* data to bring the model sufficiently near its asymptote of quality. Of course, as an asymptote is never reached, there will always be room for negotiation and bartering.
>>
>> Perhaps the more substantive questions are those I have alluded to, about subtracting a baseline of e.g. Fourier-related side-effects, so that we do not mistake an increase in the numerical performance of refinement algorithms against data to a given resolution for an extra ability to exploit data to a notionally higher resolution. I would be delighted to hear that this has been, or is being, investigated.
>>
>> Finally, anticipated apologies to Kay and Andy for bringing up "quantitative easing" in the context of possible abuses of CC1/2 for choosing a claimed resolution limit: it isn't a criticism but a genuine concern. An obvious benefit is that it forces us once more to question whether we really know what resolution means, or are just following old habits that have become enshrined by the compilation of statistics based on them.
>>
>> With best wishes,
>>
>> Gerard.
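The grid-size side-effect described above can be sketched in a few lines. The d_min/3 sampling rule and the 100 A cell edge are assumptions for illustration, and the gradient demonstration is a generic finite-difference toy, NOT the Agarwal-Lifchitz chain rule itself; it only shows that a finer sampling of the same function gives more accurate grid-based derivatives:

import numpy as np

# Common rule of thumb for map calculation: grid spacing <= d_min / 3.
# With an assumed (hypothetical) 100 A cell edge, the nominal
# resolution alone dictates the real-space sampling rate, whatever
# the extra "data" at the new indices actually contain.
cell = 100.0  # A, hypothetical
for d_min in (2.6, 1.8):
    n = int(np.ceil(3.0 * cell / d_min))
    print("d_min = %.1f A -> at least %d grid points per cell edge" % (d_min, n))

# Finer sampling by itself yields more accurate grid-based gradients
# of the same smooth function (generic finite differences, standing
# in for any grid-based derivative scheme):
def max_gradient_error(n):
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    f = np.exp(-((x - 0.5) / 0.05) ** 2)          # the same smooth "map"
    analytic = -2.0 * (x - 0.5) / 0.05 ** 2 * f   # exact derivative
    numeric = np.gradient(f, x)                   # grid-based derivative
    return np.abs(numeric - analytic).max()

print("coarse grid (116 points) max error: %.3f" % max_gradient_error(116))
print("fine grid   (167 points) max error: %.3f" % max_gradient_error(167))

The gradient error roughly halves on the finer grid, for the same underlying function and no new information at all, which is the sense in which a "baseline" of purely numerical improvement might need subtracting.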
>> --
>> On Sat, Nov 28, 2015 at 03:38:48PM +0000, Phil Evans wrote:
>> > The basic question for reviewers (and yourself) is "do you think that cutting back the resolution will improve your model?"
>> >
>> > > On 28 Nov 2015, at 15:23, Greenstone talis <[log in to unmask]> wrote:
>> > >
>> > > Thank you for your replies and the discussion around this!
>> > >
>> > > Ian, yes, the quality of the maps clearly shows that I can definitely use more data from the higher-resolution bins. But I have the feeling that the numbers at 1.8 A (or even 2.2 A) would cause many rejections from reviewers, thinking of a potential publication.
>> > >
>> > > Eleanor, as suggested, I performed a new round of refinement, omitting some random residues here and there. Attached is a sample of the result. But I have to ask: if these maps were biased, why would there be so much good difference density for waters that are absent from the model?
>> > >
>> > > Jonny, same as above: I can trust my reflections in the higher-resolution bins, but I will have to convince others... Also, I would have thought that if I define the boundaries of my data during indexing and integration to certain resolutions, data beyond those limits would just be considered absent, rather than being considered waves with amplitudes = 0?
>> > >
>> > > Thank you again
>> > >
>> > > On Sat, Nov 28, 2015 at 2:39 PM, Jonathan Brooks-Bartlett <[log in to unmask]> wrote:
>> > > Hi Talis,
>> > >
>> > > I am far from a refinement expert, but I'll chip in with my thoughts on why this is. They may be wrong, but the worst that can happen is that someone corrects me and I learn something new.
>> > >
>> > > A very simplistic and naive interpretation is that by including the data up to 1.8 A you are putting more information in, and so you are getting better information out.
>> > >
>> > > But why is this the case?
>> > >
>> > > The electron density equation tells us that to get the electron density at each point in space we have to sum over all the amplitudes and phases (it's a Fourier transform), so we have to make sure we obtain the correct values for these quantities to obtain the correct electron density. If you cut your data at 2.6 A then you completely leave out any extra information contained in the reflections out to 1.8 A. The real problem arises in the electron density equation itself: any "missing" information is encoded as an amplitude of 0, which is very likely to be WRONG! So we don't treat the data as missing; we effectively assert that the amplitude is 0.
>> > >
>> > > So the reason why I think the 1.8 A map is a bit better, despite the worse data-quality statistics, is that the contribution to the electron density equation is non-zero for the reflection amplitudes out to 1.8 A. Although the contributions may not be perfect (the data quality isn't great), they are a better estimate than amplitudes simply set to zero.
>> > >
>> > > This leads on to the question "what is resolution?" My interpretation is that resolution is a semi-quantitative measure of the number of terms used in the electron density equation.
>> > >
>> > > So the more terms you use in the electron density equation (the higher the resolution), the better the electron density representation of your protein. As long as you trust the measurements of your reflections, you should use them in the processing (this is why error estimates are important), because otherwise you'll set their contribution in the electron density equation to 0 (which is likely to be wrong anyway).
>> > >
>> > > But I would wait for a more experienced crystallographer than me to confirm whether anything I've stated actually makes sense.
>> > >
>> > > This is my 2p ;)
>> > >
>> > > Jonny Brooks-Bartlett
>> > > Garman Group
>> > > DPhil candidate, Systems Biology Doctoral Training Centre
>> > > Department of Biochemistry
>> > > University of Oxford
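The "zero is very likely to be WRONG" argument can be made quantitative with a toy mean-squared-error comparison. The amplitude distribution, the noise level, and the oracle weight below are all hypothetical, chosen only to mimic a weak outer shell:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" outer-shell amplitudes and noisy measurements
# of them (toy distribution and noise level, not the actual data).
true_F = rng.gamma(shape=2.0, scale=10.0, size=100000)   # mean ~ 20
sigma = 15.0                                             # weak shell
measured = true_F + rng.normal(0.0, sigma, size=true_F.size)

# (a) Cut the resolution: the synthesis sees amplitude 0.
mse_zero = np.mean((0.0 - true_F) ** 2)
# (b) Keep the raw noisy measurement.
mse_noisy = np.mean((measured - true_F) ** 2)
# (c) Shrink the measurement with an inverse-variance style weight
#     (an oracle weight for illustration; real programs estimate it).
w = np.var(true_F) / (np.var(true_F) + sigma ** 2)
mse_weighted = np.mean((w * measured - true_F) ** 2)

print("MSE, amplitude set to 0 : %.0f" % mse_zero)
print("MSE, raw noisy value    : %.0f" % mse_noisy)
print("MSE, down-weighted value: %.0f" % mse_weighted)

Even the raw noisy measurement is a far better estimate, in mean-squared error, than a zero, and down-weighting it does slightly better still.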
>> > > From: CCP4 bulletin board [[log in to unmask]] on behalf of Eleanor Dodson [[log in to unmask]]
>> > > Sent: 28 November 2015 13:12
>> > > To: [log in to unmask]
>> > > Subject: Re: [ccp4bb] Puzzled: worst statistics but better maps?
>> > >
>> > > I am not surprised - your CC1/2 is very high at 2.6 A, so there must be lots of information past that resolution. Maybe the 1.8 A cut-off is unrealistic, but some of that extra data will certainly have helped.
>> > >
>> > > But the map appearance over modelled residues can be misleadingly good. Remember, all the PHASES are calculated from the given model, so a reflection with any old rubbish amplitude will still show some signal. A better test is to omit a few residues from the phasing and see where you get the best density for the omitted segment of the structure.
>> > >
>> > > Eleanor
>> > >
>> > > On 28 November 2015 at 11:53, Ian Tickle <[log in to unmask]> wrote:
>> > >
>> > > Hi, IMO preconceived notions of where to apply a resolution cut-off to the data are without theoretical foundation and most likely wrong. You may decide empirically, based on a sample of data, what the optimal cut-off criteria are, but that doesn't mean the same criteria are generally applicable to other data. Modern refinement software is now sufficiently advanced that the data are automatically weighted to enhance the effect of 'good' data on the results relative to that of 'bad' data. Such a continuous weighting function is likely to be much more realistic from a probabilistic standpoint than the 'Heaviside' step function that is conventionally applied. The fall-off in data quality with resolution is clearly gradual, so why on earth should the weight be a step function?
>> > >
>> > > Just my 2p.
>> > >
>> > > Cheers
>> > >
>> > > -- Ian
>> > >
>> > > On 28 November 2015 at 11:21, Greenstone talis <[log in to unmask]> wrote:
>> > > Dear All,
>> > >
>> > > I initially got a 3.0 A dataset that I used for MR and refinement. Some months later I got better-diffracting crystals and refined the structure against a new dataset at 2.6 A (for this, I preserved the original Rfree set).
>> > >
>> > > Even though I knew I was already at a reasonable resolution limit, I was curious, so I processed the data to 1.8 A and used it for refinement (again preserving the original Rfree set). I was surprised to see that despite the worse numbers, the maps look better (pictures and some numbers attached; values in parentheses are for the outer resolution shell).
>> > >
>> > > 2.6 A dataset:
>> > > Rmeas: 0.167 (0.736)
>> > > I/sigma: 9.2 (2.2)
>> > > CC(1/2): 0.991 (0.718)
>> > > Completeness (%): 99.6 (99.7)
>> > >
>> > > 1.8 A dataset:
>> > > Rmeas: 0.247 (2.707)
>> > > I/sigma: 5.6 (0.3)
>> > > CC(1/2): 0.987 (-0.015)
>> > > Completeness (%): 66.7 (9.5)
>> > >
>> > > I was expecting worse maps with the 1.8 A dataset... any explanations would be very appreciated.
>> > >
>> > > Thank you,
>> > >
>> > > Talis
>> > >
>> > > <Ile_Omitted.jpg>

>> --
>> ===============================================================
>> *                                                             *
>> * Gerard Bricogne                [log in to unmask]           *
>> *                                                             *
>> * Global Phasing Ltd.                                         *
>> * Sheraton House, Castle Park    Tel: +44-(0)1223-353033      *
>> * Cambridge CB3 0AX, UK          Fax: +44-(0)1223-366889      *
>> *                                                             *
>> ===============================================================

--
Ashok
Senior Research Fellow - Dr JV Pratap, Lab No-LSN 008
Molecular and Structural Biology Division
Central Drug Research Institute, Janakipuram Extension
Lucknow-226031
India
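Ian's Heaviside remark, as one last sketch: a resolution cutoff is a step-function weight in d, whereas a signal-tracking weight falls off smoothly. The functional form below is hypothetical and not any particular program's scheme; the I/sigma value of 1.0 at 2.2 A is interpolated for illustration, while the others come from the tables above:

import numpy as np

def heaviside_weight(d, d_cut=2.6):
    """Conventional cutoff: keep everything to d_cut, discard the rest."""
    return np.where(d >= d_cut, 1.0, 0.0)

def smooth_weight(i_over_sigma):
    """Illustrative continuous weight that tracks signal-to-noise;
    hypothetical form, roughly inverse-variance in spirit."""
    return i_over_sigma ** 2 / (1.0 + i_over_sigma ** 2)

# Signal fading gradually with resolution, as in the tables above
# (I/sigma falling from 9.2 overall to 0.3 in the outer 1.8 A shell):
d = np.array([3.5, 2.6, 2.2, 1.8])
i_sig = np.array([9.2, 2.2, 1.0, 0.3])

for dd, ii in zip(d, i_sig):
    print("d = %.1f A  step weight = %.0f  smooth weight = %.2f"
          % (dd, heaviside_weight(dd), smooth_weight(ii)))

The smooth weight glides from ~0.99 down to ~0.08 across the same range over which the step function jumps from 1 to 0, which is why reflections in the weak 1.8 A shells can contribute a little without being trusted as much as the strong ones.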