Dear Graeme,
good that you set this straight.
I consider getting the statistics output from AIMLESS a perfectly valid reason for going with e), and as long as this is well-tested (which I'd bet it is in the case of xia2) it's ok. There is one issue I can see: 99% (obviously my guess could be wrong; it is just an estimate based on reading the Methods sections of papers) of xia2 -3d users are not aware that their data are then _not_ scaled by AIMLESS. They see the AIMLESS tables and think "so it must have been AIMLESS that scaled the data". And they publish and PDB-deposit their misconception. This is how the misunderstanding spreads, which is why I get asked "can CORRECT scale a data set?" and other questions along these lines ...
best,
Kay
On Thu, 13 Nov 2014 08:15:12 +0000, Graeme Winter <[log in to unmask]> wrote:
>Dear Kay
>
>Just to comment on (e) since you say you don't know why anyone would want
>to do this, yet this is exactly what xia2 -3d does :o)
>
>I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
>way to get a report on the merging statistics which includes all of the
>AIMLESS analysis, and to generate harvesting files for deposition.
>
>Like you, I look forward to studies of (a) - (e) & think of all of these
>(c) is by far the worst idea, from gut instinct. There are also some
>subtleties to making (b) work properly...
>
>For anyone who has time on their hands & would like to do this study, be
>sure to consider a range of crystal symmetries as it is possible that some
>strategies which are "safe" in PG 422 (say) are not in PG 2.
>
>Best wishes Graeme
>
>
>
>On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs <
>[log in to unmask]> wrote:
>
>> Hi Wolfram,
>>
>> it took me a while until I realized that you mean "overfitting" when you
>> said "o-word".
>>
>> You can abuse XDS in a number of ways, and I would call them "overfitting
>> the data" although that would be using the word in a somewhat strained way:
>> reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50
>> come to mind, but in an extended sense there are other ways: rejecting
>> frames for no other reason than that they have low I/sigma or high Rmeas,
>> ...
>>
>> People always seem to find ways to beautify their precision indicators,
>> but they are just fooling themselves, because rejecting data just for
>> cosmetic reasons creates bias. In other words, they trade random error
>> against systematic error. Guess what is worse. A deeper reason for the
>> problem is that crystallographers have been fixated on data R-factors for
>> decades, and have become really spoilt by this. Our science has been
>> completely misled when it comes to data statistics, and is recovering
>> only slowly.
>>
>> Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
>> know of no systematic studies in this respect. But I know one thing: it is
>> better to be critical with respect to recipes, than to follow them blindly.
>> So I suggest the following project: compare SAD structure solution with the
>> following routes
>> a) INTEGRATE -> CORRECT scaling -> SHELXD
>> b) INTEGRATE -> AIMLESS scaling -> SHELXD
>> c) INTEGRATE -> CORRECT+AIMLESS scaling -> SHELXD
>> d) INTEGRATE -> CORRECT but scaling switched off -> AIMLESS scaling ->
>> SHELXD
>> e) INTEGRATE -> CORRECT scaling -> AIMLESS but scaling switched off ->
>> SHELXD
>> and report here.
>> You can add XSCALE into the mix but that won't change the picture, since
>> it does the exact same calculations for multiple datasets as CORRECT does
>> for single datasets.
>> Personally, I don't understand why people would _want_ to do c),d) or e)
>> because that's just added complexity, and additional sources of error.
>>
>> I'm looking forward to the results of such studies!
>>
>> Kay
>>
>>
>> On Wed, 12 Nov 2014 12:41:28 -0500, wtempel <[log in to unmask]> wrote:
>>
>> >Hello Kay,
>> >you said the o-word, and you are familiar with the inner workings of XDS.
>> >Has the data-to-parameter ratio in even complex scaling models become so
>> >small that a doubling (worst case) of model parameters would be a serious
>> >concern? Could one detect such overfitting by, say, comparing (molecular)
>> >model R-factors between refinement against the once (CORRECT) scaled or
>> >twice (CORRECT+AIMLESS) scaled data?
>> >Thank you,
>> >Wolfram
>> >
>> >On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs <
>> >[log in to unmask]> wrote:
>> >
>> >> Hi Tim,
>> >>
>> >> this is incorrect.
>> >>
>> >> XSCALE determines the relative scale and B in a first step (this is what
>> >> you describe).
>> >>
>> >> It then, in a second step, re-determines all scale factors (exactly as
>> >> CORRECT does for the individual data sets), at the exact same supporting
>> >> points that CORRECT used. (This avoids over-fitting which would result
>> >> from a scaling model with different basis functions; a worry that I have
>> >> when people use SCALA/AIMLESS after CORRECT without taking precautions.)
>> >> The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
>> >> ABSORP*.cbf for inspection.
>> >>
>> >> Thirdly, it produces statistics and writes output files.
>> >>
>> >> best,
>> >>
>> >> Kay
>> >>
>> >>
>> >> On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene <[log in to unmask]> wrote:
>> >>
>> >> >-----BEGIN PGP SIGNED MESSAGE-----
>> >> >Hash: SHA1
>> >> >
>> >> >Dear Wolfram Tempel,
>> >> >
>> >> >there might be some confusion about terms.
>> >> >
>> >> >It is correct that xscale scales several data sets together. However,
>> >> >in crystallography, 'merging' might be the better term for this process.
>> >> >
>> >> >Crystallographic 'Scaling' is far more complicated than 'merging'. It
>> >> >applies correction factors which try to make up for experimental
>> >> >errors in your data set. These corrections include the sigma-values,
>> >> >which is particularly important for experimental phasing. In that
>> >> >respect it can actually hamper the data quality if you
>> >> >(crystallographically) scale your data twice, although the effect is
>> >> >rather subtle.
>> >> >
>> >> >CORRECT carries out these corrections, hence CORRECT scales your data
>> >> >set, while XSCALE does not repeat this step - it "only" merges your
>> >> >data in the sense that it puts your data on a common scale. This is
>> >> >the application of a not too difficult mathematical formula (which is
>> >> >listed in the xds wiki, but I don't remember the URL).
>> >> >
>> >> >Regards,
>> >> >Tim
>> >> >
>> >> >On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
>> >> >>
>> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
>> >> >>
>> >> >> XSCALE
>> >> >> <http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html>
>> >> >> is the scaling program of the XDS suite. It scales reflection files
>> >> >> (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
>> >> >> step of XDS already scales an individual dataset, XSCALE is only
>> >> >> /needed/ if several datasets should be scaled relative to another.
>> >> >> However, it does not deteriorate a dataset if it is "scaled again"
>> >> >> in XSCALE, since the supporting points of the scalefactors are at
>> >> >> the same positions in detector and batch space. The advantage of
>> >> >> using XSCALE for a single dataset is that the user can specify the
>> >> >> limits of the resolution shells.
>> >> >>
>> >> >> _Scaling with scala/aimless_
>> >> >>
>> >> >>
>> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29
>> >> >>
>> >> >>
>> >> >>
>> >> >> -Sudhir
>> >> >>
>> >> >>
>> >> >> *************************** Sudhir Babu Pothineni GM/CA @ APS 436D
>> >> >> Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439
>> >> >>
>> >> >> Ph : 630 252 0672
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> On 11/11/14 14:42, wtempel wrote:
>> >> >>> Thank you Boaz. So if CORRECT can do a fully corrected scaling,
>> >> >>> are there no corrections that XSCALE might apply to XDS_ASCII.HKL
>> >> >>> data that are beyond CORRECT's capabilities? Wolfram
>> >> >>>
>> >> >>>
>> >> >>> On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
>> >> >>> <[log in to unmask] <mailto:[log in to unmask]>> wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> I actually choose the option 'constant' further down in the
>> >> >>> aimless gui but I guess the effect is similar to 'onlymerge'.
>> >> >>>
>> >> >>> Boaz
>> >> >>>
>> >> >>> Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
>> >> >>> of the Negev Beer-Sheva 84105 Israel
>> >> >>>
>> >> >>> E-mail: [log in to unmask] <mailto:[log in to unmask]> Phone:
>> >> >>> 972-8-647-2220 Skype: boaz.shaanan Fax: 972-8-647-2992 or
>> >> >>> 972-8-646-1710
>> >> >>>
>> >> >>>
>> >> >>> ------------------------------------------------------------------------
>> >> >>>
>> >> >>> *From:* CCP4 bulletin board [[log in to unmask]
>> >> >>> <mailto:[log in to unmask]>] on behalf of wtempel
>> >> >>> [[log in to unmask] <mailto:[log in to unmask]>] *Sent:* Tuesday,
>> >> >>> November 11, 2014 9:50 PM *To:* [log in to unmask]
>> >> >>> <mailto:[log in to unmask]> *Subject:* [ccp4bb] To scale or
>> >> >>> not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS
>> >> >>>
>> >> >>> Hello all, in a discussion
>> >> >>>
>> >> >>> <https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307&L=CCP4BB&H=1&P=186901>
>> >> >>>
>> >> >>> on this board, Kay Diederichs questioned the effect of scaling
>> >> >>> data in AIMLESS after prior scaling in XDS (CORRECT). I
>> >> >>> understand that the available alternatives in this work flow are
>> >> >>> to specify the AIMLESS ‘onlymerge’ command, or not. Are there any
>> >> >>> arguments for the preference of one alternative over the other?
>> >> >>> Thank you for your insights, Wolfram Tempel
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >> >- --
>> >> >- --
>> >> >Dr Tim Gruene
>> >> >Institut fuer anorganische Chemie
>> >> >Tammannstr. 4
>> >> >D-37077 Goettingen
>> >> >
>> >> >GPG Key ID = A46BEE1A
>> >> >
>> >> >-----BEGIN PGP SIGNATURE-----
>> >> >Version: GnuPG v1.4.12 (GNU/Linux)
>> >> >
>> >> >iD8DBQFUYzT7UxlJ7aRr7hoRAuO2AJ9P3kJAjP+8wWjXRvkZwgDs9UOo3ACfb1En
>> >> >67VgyyqCTX6j5vOz3xMVwqE=
>> >> >=ooTC
>> >> >-----END PGP SIGNATURE-----
>> >>
>> >
>>
>
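
As a rough summary of routes a) to e) discussed in this thread, the following Python sketch records which program applies a scaling model in each route and which one only merges and reports statistics. It is an illustration only: the class `Route`, its field names and the helper functions are hypothetical and are not part of XDS, AIMLESS or xia2. The simplified flags do not distinguish b) from d), which differ only in whether CORRECT is run with its scaling switched off.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Route:
    """One processing route from the thread (field names are illustrative)."""
    label: str
    correct_scales: bool   # XDS CORRECT applies its scaling model
    aimless_scales: bool   # AIMLESS refines its own scaling model
    aimless_reports: bool  # AIMLESS merges and writes the statistics tables


# Routes a) to e) as listed in Kay Diederichs' message of 12 Nov 2014.
ROUTES = {
    "a": Route("a", correct_scales=True, aimless_scales=False, aimless_reports=False),
    "b": Route("b", correct_scales=False, aimless_scales=True, aimless_reports=True),
    "c": Route("c", correct_scales=True, aimless_scales=True, aimless_reports=True),
    "d": Route("d", correct_scales=False, aimless_scales=True, aimless_reports=True),
    "e": Route("e", correct_scales=True, aimless_scales=False, aimless_reports=True),
}


def scaled_by(route: Route) -> List[str]:
    """Return the programs that actually applied a scaling model."""
    programs = []
    if route.correct_scales:
        programs.append("CORRECT")
    if route.aimless_scales:
        programs.append("AIMLESS")
    return programs


def methods_sentence(route: Route) -> str:
    """Draft a Methods-style sentence that names the scaling program(s)."""
    sentence = "Data were scaled by " + " and ".join(scaled_by(route))
    if route.aimless_reports and not route.aimless_scales:
        sentence += "; AIMLESS was used only for merging and statistics"
    return sentence + "."


if __name__ == "__main__":
    for route in ROUTES.values():
        print(route.label, methods_sentence(route))
```

For route e), the route Graeme describes for xia2 -3d, `scaled_by` returns only CORRECT even though the statistics tables come from AIMLESS, which is the distinction Kay asks users to keep in mind when writing up and depositing data.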