CCP4BB Archives

CCP4BB@JISCMAIL.AC.UK


CCP4BB  November 2014

Subject: Re: To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS
From: Kay Diederichs <[log in to unmask]>
Reply-To: Kay Diederichs <[log in to unmask]>
Date: Wed, 12 Nov 2014 22:02:30 +0000

Hi Wolfram,

it took me a while to realize that you meant "overfitting" when you said "o-word".

You can abuse XDS in a number of ways, and I would call them "overfitting the data", although that uses the word in a somewhat strained way. Reducing WFAC1 below 1 and decreasing REFLECTIONS/CORRECTION_FACTOR below 50 come to mind, but in an extended sense there are other ways: rejecting frames for no other reason than that they have low I/sigma or high Rmeas, ...
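As a rough illustration of the two settings named above, here is a minimal Python sketch that scans the text of an XDS.INP file and flags WFAC1 below 1 or REFLECTIONS/CORRECTION_FACTOR below 50. The function name, the parsing approach and the example values are assumptions made for this sketch; it is not part of XDS.

```python
import re

# Limits taken from the message above: WFAC1 below 1 and
# REFLECTIONS/CORRECTION_FACTOR below 50 are named as ways to abuse XDS.
LOWER_LIMITS = {"WFAC1": 1.0, "REFLECTIONS/CORRECTION_FACTOR": 50.0}


def suspicious_settings(xds_inp_text):
    """Return {keyword: value} for settings below the limits above.

    Hypothetical helper: a simple keyword=value scan, not a full parser.
    """
    flagged = {}
    for raw_line in xds_inp_text.splitlines():
        line = raw_line.split("!", 1)[0]  # '!' starts a comment in XDS.INP
        for keyword, limit in LOWER_LIMITS.items():
            pattern = r"(?<![\w/])" + re.escape(keyword) + r"\s*=\s*([0-9.]+)"
            match = re.search(pattern, line)
            if match and float(match.group(1)) < limit:
                flagged[keyword] = float(match.group(1))
    return flagged


# Hypothetical input, made up for this example.
example = """
WFAC1=0.8   ! lowered so that more observations are rejected
REFLECTIONS/CORRECTION_FACTOR=50
"""
print(suspicious_settings(example))
```

A check like this only catches the two keywords mentioned here; rejecting frames by hand leaves no such trace in the input file.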

People always seem to find ways to beautify their precision indicators, but they are just fooling themselves, because rejecting data for purely cosmetic reasons creates bias. In other words, they trade random error for systematic error. Guess which is worse. A deeper reason for the problem is that crystallographers have been fixated on data R-factors for decades, and have become really spoilt by this. Our science has been completely misled when it comes to data statistics, and is recovering only slowly.
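The trade of random error for systematic error can be seen in a toy calculation. The numbers below are made up for illustration (they are not real intensities) and the rejection rule is an assumption: dropping the weakest observations makes the spread look better while pulling the mean away from the true value.

```python
from statistics import mean

# Hypothetical repeated measurements of one reflection whose true
# intensity is 10.0; the errors are symmetric, so the full set is unbiased.
TRUE_INTENSITY = 10.0
observations = [4.0, 7.0, 9.0, 10.0, 11.0, 13.0, 16.0]


def spread(values):
    """Mean absolute deviation from the mean: a crude precision indicator."""
    centre = mean(values)
    return mean(abs(v - centre) for v in values)


# "Cosmetic" rejection: discard the weakest observations because they look bad.
kept = [v for v in observations if v >= 9.0]

print("all data :", mean(observations), round(spread(observations), 2))
print("after cut:", mean(kept), round(spread(kept), 2))
```

The precision indicator improves after the cut, but the averaged value no longer agrees with the true intensity, which is the bias described above.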

Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I know of no systematic studies in this respect. But I know one thing: it is better to be critical with respect to recipes than to follow them blindly. So I suggest the following project: compare SAD structure solution with the following routes:
a) INTEGRATE -> CORRECT scaling -> SHELXD
b) INTEGRATE -> AIMLESS scaling -> SHELXD
c) INTEGRATE -> CORRECT+AIMLESS scaling -> SHELXD
d) INTEGRATE -> CORRECT but scaling switched off -> AIMLESS scaling -> SHELXD
e) INTEGRATE -> CORRECT scaling -> AIMLESS but scaling switched off -> SHELXD
and report here.
You can add XSCALE into the mix but that won't change the picture, since it does the exact same calculations for multiple datasets as CORRECT does for single datasets.
Personally, I don't understand why people would _want_ to do c), d) or e), because that's just added complexity and additional sources of error.
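For anyone setting up such a comparison, the five routes can be written down as data so that each one is run and labelled consistently. This is only a bookkeeping sketch: the names ROUTES, scaling_passes and describe are hypothetical, and nothing here calls XDS, AIMLESS or SHELXD.

```python
# Each route lists the programs run between INTEGRATE and SHELXD, with a
# flag saying whether that program's scaling is active (True) or off (False).
ROUTES = {
    "a": [("CORRECT", True)],
    "b": [("AIMLESS", True)],
    "c": [("CORRECT", True), ("AIMLESS", True)],
    "d": [("CORRECT", False), ("AIMLESS", True)],
    "e": [("CORRECT", True), ("AIMLESS", False)],
}


def scaling_passes(steps):
    """Count how many programs in a route actually scale the data."""
    return sum(1 for _, scales in steps if scales)


def describe(label):
    """Render one route as a single line for a results table."""
    middle = " -> ".join(
        f"{program} ({'scaling' if scales else 'no scaling'})"
        for program, scales in ROUTES[label]
    )
    return f"{label}) INTEGRATE -> {middle} -> SHELXD"


for label in ROUTES:
    print(describe(label), "| scaling passes:", scaling_passes(ROUTES[label]))
```

Written this way it is easy to see that only route c) scales the data twice, while d) and e) run both programs but scale only once.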

I'm looking forward to the results of such studies!

Kay


On Wed, 12 Nov 2014 12:41:28 -0500, wtempel <[log in to unmask]> wrote:

>Hello Kay,
>you said the o-word, and you are familiar with the inner workings of XDS.
>Has the data-to-parameter ratio in even complex scaling models become so
>small that a doubling (worst case) of model parameters would be a serious
>concern? Could one detect such overfitting by, say, comparing (molecular)
>model R-factors between refinement against the once (CORRECT) scaled or
>twice (CORRECT+AIMLESS) scaled data?
>Thank you,
>Wolfram
>
>On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs <
>[log in to unmask]> wrote:
>
>> Hi Tim,
>>
>> this is incorrect.
>>
>> XSCALE determines the relative scale and B in a first step (this is what
>> you describe).
>>
>> It then, in a second step, re-determines all scale factors (exactly as
>> CORRECT does for the individual data sets), at the exact same supporting
>> points that CORRECT used.  (This avoids over-fitting which would result
>> from a scaling model with different basis functions; a worry that I have
>> when people use SCALA/AIMLESS after CORRECT without taking precautions.)
>> The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
>> ABSORP*.cbf for inspection.
>>
>> Thirdly, it produces statistics and writes output files.
>>
>> best,
>>
>> Kay
>>
>>
>> On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene <[log in to unmask]>
>> wrote:
>>
>> >-----BEGIN PGP SIGNED MESSAGE-----
>> >Hash: SHA1
>> >
>> >Dear Wolfram Tempel,
>> >
>> >there might be some confusion about terms.
>> >
>> >It is correct that xscale scales several data sets together. However,
>> >in crystallography, 'merging' might be the better term for this process.
>> >
>> >Crystallographic 'Scaling' is far more complicated than 'merging'. It
>> >applies correction factors which try to make up for experimental
>> >errors in your data set. These corrections include the sigma-values,
>> >which is particularly important for experimental phasing. In that
>> >respect it can actually hamper the data quality if you
>> >(crystallographically) scale your data twice, although the effect is
>> >rather subtle.
>> >
>> >CORRECT carries out these corrections, hence CORRECT scales your data
>> >set, while XSCALE does not repeat this step - it "only" merges your
>> >data in the sense that it puts your data on a common scale. This is
>> >the application of a not too difficult mathematical formula (which is
>> >listed in the xds wiki, but I don't remember the URL).
>> >
>> >Regards,
>> >Tim
>> >
>> >On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
>> >>
>> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
>> >>
>> >> XSCALE
>> >> <http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html>
>> >> is the scaling program of the XDS suite. It scales reflection files
>> >> (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
>> >> step of XDS already scales an individual dataset, XSCALE is only
>> >> /needed/ if several datasets should be scaled relative to another.
>> >> However, it does not deteriorate a dataset if it is "scaled again"
>> >> in XSCALE, since the supporting points of the scalefactors are at
>> >> the same positions in detector and batch space. The advantage of
>> >> using XSCALE for a single dataset is that the user can specify the
>> >> limits of the resolution shells.
>> >>
>> >> _Scaling with scala/aimless_
>> >>
>> >>
>> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29
>> >>
>> >>
>> >>
>> >> -Sudhir
>> >>
>> >>
>> >> *************************** Sudhir Babu Pothineni GM/CA @ APS 436D
>> >> Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439
>> >>
>> >> Ph : 630 252 0672
>> >>
>> >>
>> >>
>> >>
>> >> On 11/11/14 14:42, wtempel wrote:
>> >>> Thank you Boaz. So if CORRECT can do a fully corrected scaling,
>> >>> are there no corrections that XSCALE might apply to XDS_ASCII.HKL
>> >>> data that are beyond CORRECT's capabilities? Wolfram
>> >>>
>> >>>
>> >>> On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
>> >>> <[log in to unmask]> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I actually choose the option 'constant' further down in the
>> >>> aimless gui but I guess the effect is similar to 'onlymerge'.
>> >>>
>> >>> Boaz
>> >>>
>> >>> Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
>> >>> of the Negev Beer-Sheva 84105 Israel
>> >>>
>> >>> E-mail: [log in to unmask] Phone:
>> >>> 972-8-647-2220  Skype: boaz.shaanan Fax:   972-8-647-2992 or
>> >>> 972-8-646-1710
>> >>>
>> >>>
>> >>> ------------------------------------------------------------------------
>> >>> *From:* CCP4 bulletin board [[log in to unmask]] on behalf of wtempel
>> >>> [[log in to unmask]] *Sent:* Tuesday, November 11, 2014 9:50 PM
>> >>> *To:* [log in to unmask] *Subject:* [ccp4bb] To scale or
>> >>> not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS
>> >>>
>> >>> Hello all, in a discussion
>> >>> <https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307&L=CCP4BB&H=1&P=186901>
>> >>> on this board, Kay Diederichs questioned the effect of scaling
>> >>> data in AIMLESS after prior scaling in XDS (CORRECT). I
>> >>> understand that the available alternatives in this work flow are
>> >>> to specify the AIMLESS ‘onlymerge’ command, or not. Are there any
>> >>> arguments for the preference of one alternative over the other?
>> >>> Thank you for your insights, Wolfram Tempel
>> >>>
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >- --
>> >- --
>> >Dr Tim Gruene
>> >Institut fuer anorganische Chemie
>> >Tammannstr. 4
>> >D-37077 Goettingen
>> >
>> >GPG Key ID = A46BEE1A
>> >
>> >-----BEGIN PGP SIGNATURE-----
>> >Version: GnuPG v1.4.12 (GNU/Linux)
>> >
>> >iD8DBQFUYzT7UxlJ7aRr7hoRAuO2AJ9P3kJAjP+8wWjXRvkZwgDs9UOo3ACfb1En
>> >67VgyyqCTX6j5vOz3xMVwqE=
>> >=ooTC
>> >-----END PGP SIGNATURE-----
>>
>
