
CCP4BB Archives


CCP4BB@JISCMAIL.AC.UK


CCP4BB  September 2008


Subject: Re: truncate ignorance
From: Ian Tickle <[log in to unmask]>
Reply-To: Ian Tickle <[log in to unmask]>
Date: Tue, 9 Sep 2008 13:04:20 +0100
Content-Type: multipart/mixed
Parts/Attachments: text/plain (378 lines), tbint.out (378 lines)


> -----Original Message-----
> From: Bart Hazes [mailto:[log in to unmask]] 
> Sent: 08 September 2008 23:44
> To: Ian Tickle
> Cc: [log in to unmask]
> Subject: Re: [ccp4bb] truncate ignorance
> 
> 
> How a seemingly innocent question can explode ...

Well there seems to be a widespread misunderstanding of the French &
Wilson (Truncate) procedure!

> 
> I actually thought I understood this but little of what has been 
> discussed matches my "mental picture" of the truncate process.
> 
> Truncate can do multiple things, but the truncate part I believe really
> just deals with converting I to F and the inherent problems due to
> experimental error and mathematical problems in deriving SigF from SigI
> when I is near zero. This only depends on how close I is to zero
> (relative to SigI), and not on the Wilson distribution itself.

Not sure what you mean here: in the French & Wilson procedure as
currently implemented by Truncate, the corrected F & sigF depend on the
values of I, sigI and the Wilson distribution parameter (WDP), Sigma.
Sigma is not assumed to be a constant for the whole data set, as a
linear Wilson plot would imply; instead it is obtained by spline fitting
to the log(average I) in resolution shells.  This is fully described in
the F&W paper itself of course, but I find the following a more succinct
summary of the procedure:
http://xtal.sourceforge.net/man/bayest-desc.html .  It also has the
advantage that it covers the case of obtaining the corrected I & sigI,
which neither the F&W paper nor Truncate itself does.
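
To make the resolution-dependent Sigma concrete, here is a minimal
Python sketch.  It is not Truncate's code: it assumes equal-count shells
in s2 = 1/d^2, takes Sigma in a shell to be simply the mean intensity
there (no epsilon factors), and swaps the spline for plain linear
interpolation of log(mean I).  The names shell_means and wilson_sigma
are made up for the illustration.

```python
import math

def shell_means(reflections, n_shells):
    """Mean s2 and mean I in equal-count resolution shells.

    reflections: list of (s2, I) pairs with s2 = 1/d^2 (an assumed input layout).
    """
    data = sorted(reflections)  # sort by s2, i.e. low to high resolution
    n = len(data)
    shells = []
    for k in range(n_shells):
        chunk = data[k * n // n_shells:(k + 1) * n // n_shells]
        if not chunk:
            continue
        mean_s2 = sum(s for s, _ in chunk) / len(chunk)
        mean_i = sum(i for _, i in chunk) / len(chunk)
        shells.append((mean_s2, mean_i))
    return shells

def wilson_sigma(s2, shells):
    """Sigma at s2 by linear interpolation of log(mean I).

    A stand-in for the spline fit; values outside the shell range are clamped.
    Shells with non-positive mean I are skipped because their log is undefined.
    """
    pts = [(x, math.log(m)) for x, m in shells if m > 0.0]
    if s2 <= pts[0][0]:
        return math.exp(pts[0][1])
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= s2 <= x1:
            if x1 == x0:
                return math.exp(y0)
            t = (s2 - x0) / (x1 - x0)
            return math.exp(y0 + t * (y1 - y0))
    return math.exp(pts[-1][1])

if __name__ == "__main__":
    # Synthetic, noise-free intensities falling off with resolution.
    refl = [(0.01 * k, 1000.0 * math.exp(-20.0 * 0.01 * k)) for k in range(1, 41)]
    shells = shell_means(refl, 4)
    for s2 in (0.05, 0.15, 0.25, 0.35):
        print(f"s2 = {s2:.2f}  Sigma = {wilson_sigma(s2, shells):.3f}")
```

Each reflection then gets its own Sigma from its resolution, and that
value is what enters the prior for that reflection.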

I did some illustrative calculations (see attachment) using the
equations at the BAYEST site so you can see the effect of varying the
WDP and the measured intensity.  I used the numerical integration
routine DQAGI (http://www.netlib.org/slatec/src/dqagi.f) to do the
integrals (which cannot unfortunately be expressed in closed form), and
for simplicity and to keep the number of variables to a minimum I
assumed that sigI = 1 throughout (this does not imply any loss of
generality since you can simply replace I by I/sigI in the table).
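
For anyone who wants to repeat this kind of calculation without the
attachment, a minimal Python sketch follows.  It is not the program I
used: it replaces DQAGI by a composite Simpson rule on a truncated
range, treats acentric reflections only, and the function names are
invented for the example.  The posterior weight for the true intensity
J >= 0 is taken as exp(-J/S) * exp(-(I-J)^2 / (2 sigI^2)), and the
substitution J = t^2 keeps the sqrt(J) integrand smooth at the origin.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def fw_acentric(i_obs, wdp, sig_i=1.0):
    """Posterior means (E[I], E[F]) for an acentric reflection, wdp > 0."""
    # Completing the square: the posterior is a Gaussian centred at mu,
    # truncated at J = 0.
    mu = i_obs - sig_i ** 2 / wdp
    if mu >= 0.0:
        upper, shift = mu + 10.0 * sig_i, 0.0
    else:
        # Rescale so the weight is 1 at J = 0 (avoids underflow) and cut
        # the range where the weight has decayed to a negligible level.
        upper = min(10.0 * sig_i, 40.0 * sig_i ** 2 / -mu)
        shift = 0.5 * (mu / sig_i) ** 2

    def weight(t):
        # Integration variable t = sqrt(J); the factor 2t is dJ/dt.
        j = t * t
        return 2.0 * t * math.exp(shift - 0.5 * ((j - mu) / sig_i) ** 2)

    t_max = math.sqrt(upper)
    norm = simpson(weight, 0.0, t_max)
    mean_i = simpson(lambda t: t * t * weight(t), 0.0, t_max) / norm
    mean_f = simpson(lambda t: t * weight(t), 0.0, t_max) / norm
    return mean_i, mean_f

if __name__ == "__main__":
    print("  WDP     I   E[I]   E[F]")
    for wdp in (0.1, 1.0, 10.0):
        for i_obs in (-2.0, 0.0, 2.0, 5.0):
            mean_i, mean_f = fw_acentric(i_obs, wdp)
            print(f"{wdp:5.1f} {i_obs:5.1f} {mean_i:6.3f} {mean_f:6.3f}")
```

With sig_i left at 1 the I column is really I/sigI, as in the table.
The sketch needs wdp > 0; letting wdp become very small shows the
corrected intensity heading towards zero.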

> 
> My mental picture is as follows:
> 
> Visualize a gaussian distribution representing I and its standard
> deviation, with I being close to zero (either positive or negative).
> Part of the gaussian will stretch into negative-I territory, which is
> fine for the experimental I (because of experimental error) but not the
> true I. Given this prior knowledge you can re-estimate I by TRUNCATEing
> the negative tail of the gaussian and integrating just the positive part
> to find the new mean and standard deviation. As a result any reflection
> will become positive (including those starting out with negative I). The
> extent to which the method affects the intensity depends on how much of
> a negative tail it has, so nearly no effect on I/SigI>=2 reflections and
> not really that much on even I/SigI=2 reflections.
> 
> I actually think this is a very elegant solution. The only thing better
> is to use I directly and avoid the entire issue. I personally think you
> want to use the experimental I without correcting it as explained above
> since it will introduce bias and the refinement procedure should take
> proper care of random experimental error, unless you mess around with
> it. However, when you need amplitudes, truncate is the way to go.

The procedure you describe (which is the same as the one Peter Zwart
described for phenix.reflection_file_converter and the one used in the
Sivia & David paper) will indeed introduce bias in both I and F because
it uses an improper prior: it implicitly assumes that an infinite
intensity is as likely as any other, whereas in reality of course an
infinite intensity is physically impossible.  The Wilson distribution
takes care of this; for acentrics it is P(I) = (1/S) exp(-I/S), where S
is the WDP.  In practice the improper prior means that the correction
added to I is always positive (all I's are shifted towards the average
prior I at +infinity).  So in fact one can do a lot better and use the
F&W/Truncate solution instead (which I think is indeed elegant since it
produces exactly the result we intuitively expect).  As can be seen from
the table, no bias is introduced for the case where the WDP and hence
Itrue is exactly zero: the corrected intensity is also exactly zero.
The correction added to I is always positive for I < 0, as it must be,
but it is negative for large I's, so the net effect is that the average
I is unbiased.

-- Ian
> 
> Bart
> 
> Ian Tickle wrote:
> > But there's a fundamental difference in approach, the authors here
> > assume the apparently simpler prior distribution P(I) = 0 for I < 0 &
> > P(I) = const for I >= 0.  As users of Bayesian priors well know this is
> > an improper prior since it integrates to infinity instead of unity.
> > This means that, unlike the case I described for the French & Wilson
> > formula based on the Wilson distribution which gives unbiased estimates
> > of the true I's and their average, the effect on the corrected
> > intensities of using this prior really will be to increase all
> > intensities (since the mean I for this prior PDF is also infinite!),
> > hence the intensities and their average must be biased (& I'm sure the
> > same goes for the corresponding F's).  But as you say in practice the
> > errors introduced may well not be significant compared with those
> > introduced by (for example) deconvoluting the overlapping peaks in the
> > powder pattern.  Also I'm not sure the F vs I argument can be carried
> > over from the powder to the single crystal case because the kinds of
> > errors encountered in each case are quite different.
> > 
> > -- Ian
> > 
> > 
> >>-----Original Message-----
> >>From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of [log in to unmask]
> >>Sent: 08 September 2008 22:20
> >>To: Jacob Keller
> >>Cc: [log in to unmask]
> >>Subject: Re: [ccp4bb] truncate ignorance
> >>
> >>I would also recommend reading the following paper:
> >>
> >>D.S. Sivia & W.I.F. David (1994), Acta Cryst. A50, 703-714. A Bayesian
> >>Approach to Extracting Structure-Factor Amplitudes from Powder
> >>Diffraction Data.
> >>
> >>Despite the title, most of the analysis presented in this paper
> >>applies equally well to single-crystal data (see especially sections 3
> >>and 5). If you are not interested in the specific powder-diffraction
> >>problems (i.e. overlapping peaks), you can simply skip sections 4 and 6.
> >>
> >>A few interesting points from this paper:
> >>
> >>(1) The conversion from I's to F's can be done (in a Bayesian way) by
> >>applying two simple formulae (equations 11 and 12 in the paper), which,
> >>for all practical purposes, are as valid as the more complicated
> >>French & Wilson procedure (see discussion in section 5).
> >>
> >>(2) Re. the use of I's rather than F's: this is discussed on page 710
> >>(final part of section 5). The authors seem to be more in favor of
> >>using F's.
> >>
> >>
> >>
> >>Marc Schiltz
> >>
> >>
> >>
> >>
> >>
> >>Quoting Jacob Keller <[log in to unmask]>:
> >>
> >>
> >>>Does somebody have a .pdf of that French and Wilson paper?
> >>>
> >>>Thanks in advance,
> >>>
> >>>Jacob
> >>>
> >>>*******************************************
> >>>Jacob Pearson Keller
> >>>Northwestern University
> >>>Medical Scientist Training Program
> >>>Dallos Laboratory
> >>>F. Searle 1-240
> >>>2240 Campus Drive
> >>>Evanston IL 60208
> >>>lab: 847.491.2438
> >>>cel: 773.608.9185
> >>>email: [log in to unmask]
> >>>*******************************************
> >>>
> >>>----- Original Message -----
> >>>From: "Ethan Merritt" <[log in to unmask]>
> >>>To: <[log in to unmask]>
> >>>Sent: Monday, September 08, 2008 3:03 PM
> >>>Subject: Re: [ccp4bb] truncate ignorance
> >>>
> >>>
> >>>
> >>>>On Monday 08 September 2008 12:30:29 Phoebe Rice wrote:
> >>>>
> >>>>>Dear Experts,
> >>>>>
> >>>>>At the risk of exposing excess ignorance, truncate makes me
> >>>>>very nervous because I don't quite get exactly what it is
> >>>>>doing with my data and what its assumptions are.
> >>>>>
> >>>>>From the documentation:
> >>>>>========================================================
> >>>>>... the "truncate" procedure (keyword TRUNCATE YES, the
> >>>>>default) calculates a best estimate of F from I, sd(I), and
> >>>>>the distribution of intensities in resolution shells (see
> >>>>>below). This has the effect of forcing all negative
> >>>>>observations to be positive, and inflating the weakest
> >>>>>reflections (less than about 3 sd), because an observation
> >>>>>significantly smaller than the average intensity is likely
> >>>>>to be underestimated.
> >>>>>=========================================================
> >>>>>
> >>>>>But is it really true, with data from nice modern detectors,
> >>>>>that the weaklings are underestimated?
> >>>>
> >>>>It isn't really an issue of the detector per se, although in
> >>>>principle you could worry about non-linear response to the
> >>>>input rate of arriving photons.
> >>>>
> >>>>In practice the issue, now as it was in 1978 (French & Wilson),
> >>>>arises from the background estimation, profile fitting, and
> >>>>rescaling that are applied to the individual pixel contents
> >>>>before they are bundled up into a nice "Iobs".
> >>>>
> >>>>I will try to restate the original French & Wilson argument,
> >>>>avoiding the terminology of maximum likelihood and
> >>>>Bayesian statistics.
> >>>>
> >>>>1) We know the true intensity cannot be negative.
> >>>>2) The existence of Iobs<0 reflections in the data set means
> >>>>  that whatever we are doing is producing some values of
> >>>>  Iobs that are too low.
> >>>>3) Assuming that all weak-ish reflections are being processed
> >>>>  equivalently, then whatever we are doing wrong for reflections
> >>>>  with Iobs near zero on the negative side surely is also going
> >>>>  wrong for their neighbors that happen to be near Iobs=0 on the
> >>>>  positive side.
> >>>>4) So if we "correct" the values of Iobs that went negative, for
> >>>>  consistency we should also correct the values that are nearly
> >>>>  the same but didn't quite tip over into the negative range.
> >>>>
> >>>>
> >>>>>Do I really want to inflate them?
> >>>>
> >>>>Yes.
> >>>>
> >>>>
> >>>>>Exactly what assumptions is it making about the expected
> >>>>>distributions?
> >>>>
> >>>>Primarily that
> >>>>1) The histogram of true Iobs is smooth
> >>>>2) No true Iobs are negative
> >>>>
> >>>>
> >>>>>How compatible are those assumptions with serious anisotropy
> >>>>>and the weird Wilson plots that nucleic acids give?
> >>>>
> >>>>Not relevant
> >>>>
> >>>>
> >>>>>Note the original 1978 French and Wilson paper says:
> >>>>>"It is nevertheless important to validate this agreement for
> >>>>>each set of data independently, as the presence of atoms in
> >>>>>special positions or the existence of noncrystallographic
> >>>>>elements of symmetry (or pseudosymmetry) may abrogate the
> >>>>>application of these prior beliefs for some crystal
> >>>>>structures."
> >>>>
> >>>>It is true that such things matter when you get down to the
> >>>>nitty-gritty details of what to use as the "expected distribution".
> >>>>But *all* plausible expected distributions will be non-negative
> >>>>and smooth.
> >>>>
> >>>>
> >>>>
> >>>>>Please help truncate my ignorance ...
> >>>>>
> >>>>>    Phoebe
> >>>>>
> >>>>>==========================================================
> >>>>>Phoebe A. Rice
> >>>>>Assoc. Prof., Dept. of Biochemistry & Molecular Biology
> >>>>>The University of Chicago
> >>>>>phone 773 834 1723
> >>>>>
> >>>>>http://bmb.bsd.uchicago.edu/Faculty_and_Research/01_Faculty/01_Faculty_Alphabetically.php?faculty_id=123
> >>>>>
> >>>>>RNA is really nifty
> >>>>>DNA is over fifty
> >>>>>We have put them
> >>>>>  both in one book
> >>>>>Please do take a
> >>>>>  really good look
> >>>>>http://www.rsc.org/shop/books/2008/9780854042722.asp
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>>--
> >>>>Ethan A Merritt
> >>>>Biomolecular Structure Center
> >>>>University of Washington, Seattle 98195-7742
> >>>>
> >>>
> >>
> > 
> > 
> > Disclaimer
> > This communication is confidential and may contain privileged
> > information intended solely for the named addressee(s). It may not be
> > used or disclosed except for the purpose for which it has been sent.
> > If you are not the intended recipient you must not review, use,
> > disclose, copy, distribute or take any action in reliance upon it. If
> > you have received this communication in error, please notify Astex
> > Therapeutics Ltd by emailing [log in to unmask] and destroy all copies
> > of the message and any attached documents.
> > Astex Therapeutics Ltd monitors, controls and protects all its
> > messaging traffic in compliance with its corporate email policy. The
> > Company accepts no liability or responsibility for any onward
> > transmission or use of emails and attachments having left the Astex
> > Therapeutics domain.  Unless expressly stated, opinions in this
> > message are those of the individual sender and not of Astex
> > Therapeutics Ltd. The recipient should check this email and any
> > attachments for the presence of computer viruses. Astex Therapeutics
> > Ltd accepts no liability for damage caused by any virus transmitted
> > by this email. E-mail is susceptible to data corruption, interception,
> > unauthorized amendment, and tampering, Astex Therapeutics Ltd only
> > send and receive e-mails on the basis that the Company is not liable
> > for any such alteration or any consequences thereof.
> > Astex Therapeutics Ltd., Registered in England at 436 Cambridge
> > Science Park, Cambridge CB4 0QA under number 3751674
> > 
> > 
> 
> 
> -- 
> 
> Bart Hazes (Associate Professor)
> Dept. of Medical Microbiology & Immunology
> University of Alberta
> 1-15 Medical Sciences Building
> Edmonton, Alberta
> Canada, T6G 2H7
> phone:  1-780-492-0042
> fax:    1-780-492-7521
> 
> 



