CCP4BB Archives


CCP4BB  January 2015


Subject: Re: Free Reflections as Percent and not a Number
From: "Edward A. Berry" <[log in to unmask]>
Reply-To: Edward A. Berry
Date: Sun, 4 Jan 2015 12:50:00 -0500


On 11/25/2014 01:41 PM, Tim Gruene wrote:
> Hi Ed,
>
> it is an easy exercise to show that theory (according to "by
> definition") and reality greatly diverge - refinement is too complex to
> get back to exactly the same structure. Maybe because one often does not
> reach convergence, no matter how many cycles of refinement you run.

Yes, I was able to convince myself of this.

I took a structure which I considered to be refined to convergence.
I didn't see any way to refine with no free set in phenix, even with a
least-squares target function, so I refined against a newly chosen free set,
which should test the same principle. I turned off HQN flips and real-space
refinement. After 2 rounds of 3 macrocycles each of individual ADP and XYZ
refinement, comparing the structure with the original gave an all-atom RMSD of
0.0540 A and a maximum displacement of 1.0161 A. Should be within radius of
convergence, right?

So then I returned to the original free set in order to let it refine back
to the original position. R-free started out equal to R but soon increased to
approximately the original value: 0.2011 0.2256 vs. the original 0.2036 0.2278
(this is a 1.8 A structure). So far so good.

But looking at the RMSD compared to the original, the numbers never decreased;
they continued to increase with every cycle. The structure is not returning to
the original but finding equally good solutions in the neighborhood. I guess it
is meandering about an essentially flat plateau (or better, a flat-bottomed
valley). It does not reach convergence, no matter how many cycles of refinement
you run.
eab
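
The comparison described above (all-atom RMSD and maximum displacement against the original model, tracked cycle by cycle) can be sketched roughly as follows. This is only an illustration: the function names are made up here, the coordinates are assumed to be plain (x, y, z) tuples in Angstrom in the same atom order, and no superposition is done, on the assumption that both models sit in the same crystal frame.

```python
import math

def rmsd_and_max_dev(coords_a, coords_b):
    """Return (all-atom RMSD, maximum displacement) between two models.

    Assumes both lists hold (x, y, z) tuples in Angstrom, in the same atom
    order and the same frame; no superposition is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    dists = [math.dist(a, b) for a, b in zip(coords_a, coords_b)]
    rmsd = math.sqrt(sum(d * d for d in dists) / len(dists))
    return rmsd, max(dists)

def drifting_away(rmsd_per_cycle):
    """True if the RMSD to the reference never decreases between cycles."""
    return all(b >= a for a, b in zip(rmsd_per_cycle, rmsd_per_cycle[1:]))
```

Feeding `drifting_away` the list of RMSDs after each macrocycle would flag the behaviour reported here, where the model keeps moving away from the original instead of returning to it.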


>
> Best,
> Tim
>
> On 11/25/2014 07:29 PM, Edward A. Berry wrote:
>>> provided the jiggling keeps the structure inside the convergence
>>> radius of refinement, then by definition the refinement will produce
>>> the same result irrespective of the starting point (i.e. jiggled or
>>> not).  If the jiggling takes the structure outside the radius of
>>> convergence then the original structure will not be retrievable
>>> without manual rebuilding: I'm assuming that's not the goal here.
>>
>>
>> I actually agree with this, but an R-free purist might argue that you
>> have to get outside of radius of convergence to eliminate R-free bias.
>> Otherwise, by definition, "you will just refine back to the same old
>> biased structure!".
>>    (but you have shown that the conventional .2A rms is within radius of
>> convergence)
>>
>> In fact Dale's concern about low-res reflections could be put in terms
>> of radius of convergence and false minima.
>> Moving a lot of atoms by .2 A will have a significant effect on the
>> phase of a 2A reflection, but almost no effect on a 20A reflection. Say
>> you have refined against all the low resolution reflections, and got a
>> structure that fits better than it should because it is fitting the
>> noise in the free reflections. Now take away the free reflections and
>> continue to refine. It will drop into the nearest local minimum, which
>> since it is near the solution with all reflections, will still give
>> artificially low R-free.  Jiggling by 0.2 A will have no effect because
>> the local minima are extremely broad and shallow, as far as the
>> low-res reflections go.
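
The resolution dependence argued above can be put in numbers with a small sketch. It assumes the simplest possible case, a single atom displaced along the scattering vector, where its contribution to a reflection of spacing d changes phase by 360 * shift / d degrees. The function name is illustrative only.

```python
def phase_shift_degrees(shift_A, d_spacing_A):
    """Phase change (degrees) of one atom's contribution to a reflection.

    Assumes the atom moves by shift_A Angstrom parallel to the scattering
    vector of a reflection with spacing d_spacing_A; this is the largest
    effect a shift of that length can have on that atom's term.
    """
    if d_spacing_A <= 0:
        raise ValueError("d-spacing must be positive")
    return 360.0 * shift_A / d_spacing_A
```

Under this assumption a 0.2 A shift changes the phase of the atom's term by about 36 degrees at 2 A but only about 3.6 degrees at 20 A, which is the tenfold difference the argument relies on.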
>>
>> But then you could say that since any local minima are so broad, all
>> structures that are even slightly reasonable, (including the correct
>> one) will be within radius of convergence of the same minimum as far as
>> the low-res reflections are concerned. The nearest false minimum
>> involves moving atoms by 5-10 A, so within reason the convergence point
>> will be completely independent of the starting structure. Presumably
>> this is why Phenix rigid body refinement starts out at ultra-low
>> resolution: to increase the radius of convergence. From that
>> perspective, rather than being the worrisome part, the low-resolution is
>> the region where we can assume Ian's assumption is correct.
>>
>> What about another experiment, which I think we've discussed before.
>> Take a structure refined to convergence with a pristine free set. Now
>> refine to convergence against all the data. The purist will say that the
>> free set is hopelessly corrupted. And sure enough when we take that
>> structure and calculate free-R with the original set, R-free is same as
>> R-work within statistical significance.  But- I guess adding the extra
>> 5% reflections will not change any atomic position by more than 0.2 A
>> (maybe 0.02A), and so we are still well within radius of convergence of
>> the original unbiased structure. Refining against the original working
>> set will give back that unbiased structure, and Rfree will return to its
>> original value.
>>
>> This suggests, if the only purpose of Rfree is to get a number to deposit
>> with the pdb (which it is not), you should first solve your structure
>> using all the data, fitting the noise; then exclude a free set and back
>> off on fitting the noise of it to get the R-free.  The only problem
>> would be that during the refinement without guidance of R-free, you may
>> have engaged in some practice that hurt the structure so much that it
>> ends up out of RoC of the well-refined structure. Not because you were
>> fitting the noise (anyway you are fitting the noise in your 95% working
>> set) but because you would not have been warned that some procedure was
>> not helping.
>>
>> Very provocative discussion!
>> eab
>>
>>
>> On 11/25/2014 11:03 AM, Ian Tickle wrote:
>>> Dear All
>>>
>>> I'd like to raise the question again of whether any of this 'jiggling'
>>> (i.e. addition of random noise to the co-ordinates) is really
>>> necessary anyway, notwithstanding Dale's valid point that even if it
>>> were necessary, jiggling in its present incarnation is unlikely to
>>> work because it's unlikely to erase the influence of low res. reflexions.
>>>
>>> My claim is that jiggling is completely unnecessary, because I
>>> maintain that refinement to convergence is all that is required to
>>> remove the bias when an alternate test set is selected.  In fact I
>>> claim that it's the refinement, not the jiggling, that's wholly
>>> responsible for removing the bias.  I know we thrashed this out a
>>> while back and I recall that the discussion ended with a challenge to
>>> me to prove my claim that the refine-only Rfrees are indeed unbiased.
>>> I couldn't see an easy way of doing this which didn't involve
>>> rebuilding and re-refining the same structure 20 times over, without
>>> introducing any observer bias.
>>>
>>> The present discussion prompted me to think again about this and I
>>> believe I can prove part of my claim quite easily, that jiggling has
>>> no effect on the results.  Proving that the resulting Rfrees are
>>> unbiased is much harder, since as we've seen there's no proof that
>>> jiggling actually removes the bias as claimed by its proponents.
>>> However given that said proponents of jiggling+refinement have been
>>> happy to accept for many years that their results are unbiased, then
>>> they must be equally happy now to accept that the refinement-only
>>> results are also unbiased, provided I can demonstrate that the
>>> difference between the results is insignificant.
>>>
>>> The experimental proof rests on comparison between the Rfrees and
>>> RMSDs of the jiggled+refined and the refined-only structures for the
>>> 19 possible alternate test sets (assuming 5% test-set size).  If
>>> jiggling makes no difference as I claim then there should be no
>>> significant difference between the Rfrees and insignificant RMSDs for
>>> all pairs of alternate test sets.
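
The "19 possible alternate test sets" follow from splitting the reflections into 20 groups of about 5% each, one of which is the original test set. A minimal sketch of such a partition is given below. It is not the flagging algorithm of any particular program; the function names, the seed and the round-robin assignment are all assumptions made for illustration.

```python
import random

def assign_free_flags(n_reflections, n_sets=20, seed=0):
    """Give every reflection a flag 0..n_sets-1, each set about equal in size."""
    rng = random.Random(seed)
    order = list(range(n_reflections))
    rng.shuffle(order)                      # random order of reflection indices
    flags = [0] * n_reflections
    for rank, idx in enumerate(order):
        flags[idx] = rank % n_sets          # deal flags out round-robin
    return flags

def split(flags, test_flag):
    """Return (working indices, test indices) for the chosen test flag."""
    work = [i for i, f in enumerate(flags) if f != test_flag]
    test = [i for i, f in enumerate(flags) if f == test_flag]
    return work, test

# 11563 working + 611 test reflections, as in the crystal info quoted below
flags = assign_free_flags(11563 + 611)
alternates = [f for f in range(20) if f != 0]   # flags 1..19
```

Each of the 19 alternate flags then defines one refinement, with that group held out and everything else (including the old test set 0) in the working set.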
>>>
>>> However, first we must be careful to establish what is a suitable
>>> value for the noise magnitude to add to the co-ordinates.  If it's too
>>> small it won't remove the bias (again notwithstanding Dale's point
>>> that it's unlikely to have any effect anyway on the low res. data);
>>> too large and you push it beyond the convergence radius of the
>>> refinement and end up damaging the structure irretrievably (at least
>>> unless you're prepared to do significant rebuilding of the model).
>>>
>>> For the record here's the crystal info for the test data I selected:
>>>
>>> Nres: 96   SG: P41212   Vm: 1.99   Solvent: 0.377
>>> Resol: 40-1.58 A.
>>> Working set size: 11563   Test set size: 611 (5%)   Test set: 0
>>> Refinement program:     BUSTER.
>>> Noise addition program: PDBSET.
>>>
>>> It's wise to choose a small protein since you need to run lots of
>>> refinements!  However feel free to try the same thing with your own data.
>>>
>>> First I took care that the starting model was refined to convergence
>>> using the original test set 0, and I performed 2 sequential runs of
>>> refinement with BUSTER (the deviations are relative to the input
>>> co-ordinates in each case):
>>>
>>> Ncyc   Rwork   Rfree   RMSD    MaxDev
>>>   82   0.181   0.230   0.005   0.072
>>>   51   0.181   0.231   0.002   0.015
>>>
>>> The advantage of using BUSTER is that it has its own convergence test;
>>> with REFMAC you have to guess.
>>>
>>> Then I tried a range of input noise values (0.20, 0.25, 0.30, 0.35,
>>> 0.40, 0.50 A) on the refined starting model.  Note that these are
>>> RMSDs, not maximum shifts as claimed by the PDBSET documentation.  In
>>> each case I did 4 sequential runs of BUSTER on the jiggled
>>> co-ordinates and by looking at the RMSDs and max. shifts I decided
>>> that 0.25 A RMSD was all the structure could stand without risking
>>> permanent damage (note that the default noise value in PDBSET is 0.2):
>>>
>>> Initial RMSD: 0.248  MaxDev: 0.407
>>>
>>> Ncyc   Rwork   Rfree   RMSD    MaxDev
>>>  358   0.183   0.230   0.052   0.454
>>>  126   0.181   0.232   0.041   0.383
>>>   65   0.181   0.232   0.040   0.368
>>>   50   0.181   0.232   0.040   0.360
>>>
>>> The only purpose of the above refinements is to establish the most
>>> suitable noise value; the resulting refined PDB files were not used.
>>>
>>> So then I took the co-ordinates with 0.25 A noise added and for each
>>> test set 1-19 did 2 sequential runs of BUSTER.
>>>
>>> Finally I took the original refined starting model (i.e. without noise
>>> addition) and again refined to convergence using all 19 alternate test
>>> sets.
>>>
>>> The results are attached.  The correlation coefficient between the 2
>>> sets of Rfrees is 0.992 and the mean RMSD between the sets is 0.04 A,
>>> so the difference between the 2 sets is indeed insignificant.
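
For anyone repeating this with their own data, the two summary numbers quoted above (a correlation coefficient between the two sets of Rfrees and a mean RMSD between paired models) could be computed along the following lines. This is a sketch only; it assumes the coefficient meant is the ordinary Pearson correlation, and the function names are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def mean(values):
    """Arithmetic mean, e.g. of the 19 pairwise RMSDs."""
    return sum(values) / len(values)
```

The inputs would be the 19 Rfrees from the jiggled+refined runs and the 19 from the refined-only runs, in the same test-set order, plus the 19 RMSDs between each pair of resulting models.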
>>>
>>> I don't find this result surprising at all: provided the jiggling
>>> keeps the structure inside the convergence radius of refinement, then
>>> by definition the refinement will produce the same result irrespective
>>> of the starting point (i.e. jiggled or not).  If the jiggling takes
>>> the structure outside the radius of convergence then the original
>>> structure will not be retrievable without manual rebuilding: I'm
>>> assuming that's not the goal here.
>>>
>>> I suspect that the idea of jiggling may have come about because
>>> refinements have not always been carried through to convergence:
>>> clearly if you don't do a proper job of refinement then you must
>>> expect some of the original bias to remain.  Also to head off the
>>> suggestion that simulated annealing refinement would fix this I would
>>> suggest that any kind of SA refinement is only of value for initial MR
>>> models when there may be significant systematic error in the model;
>>> it's not generally advisable to perform it on final refined models
>>> (jiggled or not) when there is no such systematic error present.
>>>
>>> Cheers
>>>
>>> -- Ian
>>>
>>>
>>> On 21 November 2014 18:56, Dale Tronrud <[log in to unmask]
>>> <mailto:[log in to unmask]>> wrote:
>>>
>>
>>
>> On 11/21/2014 12:35 AM, "F.Xavier Gomis-Rüth" wrote:
>>   > <snip...>
>>
>>> As to the convenience of carrying over a test set to another
>>> dataset, Eleanor made a suggestion to circumvent this necessity
>>> some time ago: pass your coordinates through pdbset and add some
>>> noise before refinement:
>>
>>> pdbset xyzin xx.pdb xyzout yy.pdb <<eof
>>> noise 0.4
>>> eof
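
To make concrete what this kind of "jiggling" does to a model, here is a rough sketch that adds isotropic Gaussian noise to a coordinate list and rescales it so the RMSD to the input equals a chosen value. It is not PDBSET's algorithm and makes no claim about how PDBSET interprets its noise value; the function name, seed and rescaling step are assumptions for illustration.

```python
import math
import random

def jiggle(coords, target_rmsd, seed=0):
    """Return coords with random shifts whose overall RMSD is target_rmsd.

    coords is a list of (x, y, z) tuples in Angstrom. Shifts are drawn from
    an isotropic Gaussian and then rescaled, so the RMSD is exact, not
    just expected.
    """
    if not coords:
        raise ValueError("no coordinates given")
    rng = random.Random(seed)
    shifts = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in coords]
    norm = math.sqrt(sum(dx * dx + dy * dy + dz * dz
                         for dx, dy, dz in shifts) / len(shifts))
    scale = target_rmsd / norm
    return [(x + scale * dx, y + scale * dy, z + scale * dz)
            for (x, y, z), (dx, dy, dz) in zip(coords, shifts)]
```

Because the shifts are independent from atom to atom, this kind of noise is uncorrelated in space.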
>>
>>
>>      I've heard this "debiasing" procedure proposed before, but I've
>> never seen a proper test showing that it works.  I'm concerned that
>> this will not erase the influence of low resolution reflections that
>> were in the old working set but are now in the new test set.  While
>> adding 0.4 A gaussian noise to a model would cause large changes to
>> the 2 A structure factors I doubt it would do much to those at 10 A.
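
The size of this effect can be estimated with a short sketch. It assumes independent Gaussian noise with the same standard deviation sigma on each coordinate axis, in which case the expected contribution of each atom to a reflection of spacing d is scaled by exp(-2 pi^2 sigma^2 / d^2), the same form as a B-factor with B = 8 pi^2 sigma^2. Whether "0.4" should be read as a per-axis sigma or as a 3D RMSD is left open here; the helper converts from the latter.

```python
import math

def noise_attenuation(sigma_axis_A, d_spacing_A):
    """Expected scale factor on an atom's structure-factor term.

    sigma_axis_A is the standard deviation of the random shift along each
    axis (Angstrom); d_spacing_A is the resolution of the reflection.
    """
    if d_spacing_A <= 0:
        raise ValueError("d-spacing must be positive")
    return math.exp(-2.0 * math.pi ** 2 * sigma_axis_A ** 2 / d_spacing_A ** 2)

def sigma_from_rmsd(rmsd_3d_A):
    """Per-axis sigma for isotropic noise with the given 3D RMSD."""
    return rmsd_3d_A / math.sqrt(3.0)
```

Under these assumptions the factor stays close to 1 at 10 A for noise of this size while dropping well below 1 at 2 A, in line with the concern that uncorrelated noise barely touches the low-resolution terms.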
>>
>>      It seems to me that one would have to have random, but correlated,
>> shifts in atomic parameters to affect the low resolution data - waves
>> of displacements, sometimes to the left and other times to the right.
>> You would need, of course, a superposition of such waves that span
>> all the scales of resolution in the data set.
>>
>>      Has anyone looked at the pdbset jiggling results and shown that the
>> low resolution data are scrambled?
>>
>> Dale Tronrud
>>
>>> Xavier
>>
>>> On 20/11/14 11:43 PM, Keller, Jacob wrote:
>>>> Dear Crystallographers,
>>
>>>> I thought that for reliable values for Rfree, one needs only to
>>>> satisfy counting statistics, and therefore using at most a couple
>>>> thousand reflections should always be sufficient. Almost always,
>>>> however, some seemingly-arbitrary percentage of reflections is
>>>> used, say 5%. Is there any rationale for using a percentage
>>>> rather than some absolute number like 1000?
>>
>>>> All the best,
>>
>>>> Jacob
>>
>>>> ******************************************* Jacob Pearson Keller,
>>>> PhD Looger Lab/HHMI Janelia Research Campus 19700 Helix Dr,
>>>> Ashburn, VA 20147 email:[log in to unmask]
>>>> <mailto:[log in to unmask]>
>>>> ******************************************* .
>>
>>
>>> --
>>>
>>>
>>
>
