JISCMail - CCP4BB Archives

On Mon Nov 17 2014 at 10:22:56 AM Phil Evans <[log in to unmask]> wrote:

Actually Pointless knows that the INTEGRATE file is corrected for an unpolarised beam and re-corrects for a synchrotron unless the wavelength is one of the home-source ones. See the docs. You can specify this explicitly, I think.
Phil

On 17 Nov 2014, at 09:44, Graeme Winter <[log in to unmask]> wrote:

Dear Nukri,

The following is my opinion, which I think is worth discussing, and is based on my understanding of what XDS does in the CORRECT step.

Firstly, I tend to find the global refinement in the CORRECT step useful for getting a good unit cell & recycling the orientation matrix etc. for reintegration. This is not related to scaling, but is useful, e.g.:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Optimisation#Re-INTEGRATEing_with_the_correct_spacegroup.2C_refined_geometry_and_fine-slicing_of_profiles

More relevant to the intensities: during integration the LP correction is calculated assuming an unpolarized beam. If the data are from a synchrotron they need to be corrected again for the actual polarization, something which the CORRECT step does (obviously given this on the command-line). Pointless will also do this but, unless given the correct value, assumes that the beam is quite polarized. In short: care needs to be taken, particularly if using a wavelength which may be confused with a lab source...
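
To make the re-correction concrete, here is a minimal Python sketch of the polarization factor for a partially polarized beam. It is derived from the dipole scattering term and is not taken from the XDS or Pointless source; the function names, the choice of axes (beam along z, dominant electric field along x) and the meaning of `fraction` are assumptions made for illustration.

```python
import math


def polarization_factor(two_theta_deg, azimuth_deg, fraction):
    """Polarization factor for a beam travelling along z.

    fraction is the share of the incident intensity whose electric field
    lies along x (0.5 means unpolarized); the rest has its field along y.
    azimuth_deg is the angle of the diffracted beam around z, measured from x.
    These conventions are assumptions of this sketch.
    """
    sin2_tt = math.sin(math.radians(two_theta_deg)) ** 2
    phi = math.radians(azimuth_deg)
    along_x = 1.0 - sin2_tt * math.cos(phi) ** 2  # field along x
    along_y = 1.0 - sin2_tt * math.sin(phi) ** 2  # field along y
    return fraction * along_x + (1.0 - fraction) * along_y


def recorrection(two_theta_deg, azimuth_deg, fraction):
    """Multiplier that takes an intensity already corrected for an
    unpolarized beam to one corrected for the given polarization."""
    unpolarized = polarization_factor(two_theta_deg, azimuth_deg, 0.5)
    actual = polarization_factor(two_theta_deg, azimuth_deg, fraction)
    return unpolarized / actual
```

With `fraction=0.5` the factor reduces to the familiar (1 + cos²2θ)/2 and the multiplier is 1 everywhere; for a strongly polarized beam the multiplier depends on the azimuth of the reflection, which is why applying the wrong assumption biases the intensities anisotropically.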

I also understand that the XDS CORRECT step applies a DQE correction for Pilatus data, taking into account the geometry of the experiment, the sensor thickness & photon energy. If you have a two theta offset and are using relatively high energy (say 14 keV or so?) then this may have odd effects on your data. At detector two theta = 0 this is less of a problem. This can be a "gotcha" with processing small molecule data recorded with a little Pilatus.
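
The dependence on geometry can be illustrated with a simplified absorption model; this is a sketch under stated assumptions, not the correction XDS actually implements. It assumes the detected fraction is simply the fraction of photons absorbed along the oblique path through the sensor, and the attenuation length is an input the user must supply for the photon energy in question (the function and parameter names are hypothetical).

```python
import math


def absorbed_fraction(thickness_um, attenuation_length_um, incidence_deg):
    """Fraction of photons absorbed in a flat sensor of given thickness.

    incidence_deg is the angle between the diffracted ray and the sensor
    normal; oblique rays see a longer path, thickness / cos(angle).
    attenuation_length_um must be looked up for the sensor material and
    photon energy; no tabulated value is assumed here.
    """
    path_um = thickness_um / math.cos(math.radians(incidence_deg))
    return 1.0 - math.exp(-path_um / attenuation_length_um)


def relative_efficiency(thickness_um, attenuation_length_um, incidence_deg):
    """Efficiency at the given incidence angle relative to normal incidence."""
    normal = absorbed_fraction(thickness_um, attenuation_length_um, 0.0)
    oblique = absorbed_fraction(thickness_um, attenuation_length_um, incidence_deg)
    return oblique / normal
```

When the attenuation length is short compared with the sensor thickness, the relative efficiency stays close to 1 at all angles. At higher energy the attenuation length grows and the angle dependence becomes noticeable, and a two theta offset increases the range of incidence angles across the detector.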

Best wishes Graeme

On Fri Nov 14 2014 at 6:15:31 PM Sanishvili, Ruslan <[log in to unmask]> wrote:

Dear Graeme,

Could you elaborate some more on "There are also some subtleties to making (b) work properly..."? I have a feeling, from observing the beamline users, that many choose to use this option. It would be very helpful for them to know what those subtleties are and how best to make it work properly.
Many thanks,
Nukri

Ruslan Sanishvili (Nukri)
Macromolecular Crystallographer
GM/CA@APS
X-ray Science Division, ANL
9700 S. Cass Ave.
Lemont, IL 60439

From: CCP4 bulletin board [[log in to unmask]] on behalf of Graeme Winter [[log in to unmask]]
Sent: Thursday, November 13, 2014 2:15 AM
To: [log in to unmask]
Subject: Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want to do this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a way to get a report on the merging statistics which includes all of the AIMLESS analysis, and to generate harvesting files for deposition.

Like you, I look forward to studies of (a) - (e) & think of all of these (c) is by far the worst idea, from gut instinct. There are also some subtleties to making (b) work properly...

For anyone who has time on their hands & would like to do this study, be sure to consider a range of crystal symmetries as it is possible that some strategies which are "safe" in PG 422 (say) are not in PG 2.

Best wishes Graeme

On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs <[log in to unmask]> wrote:

Hi Wolfram,

it took me a while until I realized that you mean "overfitting" when you said "o-word".

You can abuse XDS in a number of ways, and I would call them "overfitting the data" although that would be using the word in a somewhat strained way: reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50 come to mind, but in an extended sense there are other ways: rejecting frames for no other reason than that they have low I/sigma or high Rmeas, ...
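
As an illustration of the two parameter settings mentioned above, the following Python sketch scans the text of an XDS.INP file and flags WFAC1 below 1 or REFLECTIONS/CORRECTION_FACTOR below 50. The thresholds are the ones named in this message; the function name and the deliberately simple parsing (`KEY= value`, with `!` starting a comment) are assumptions of the sketch, and it does not check anything else.

```python
import re

# Thresholds quoted in the message above; going below them is the concern.
THRESHOLDS = {"WFAC1": 1.0, "REFLECTIONS/CORRECTION_FACTOR": 50.0}


def flag_aggressive_settings(xds_inp_text):
    """Return a list of warnings for settings below the thresholds."""
    warnings = []
    for line in xds_inp_text.splitlines():
        active = line.split("!", 1)[0]  # drop everything after a comment mark
        for key, limit in THRESHOLDS.items():
            match = re.search(re.escape(key) + r"=\s*([0-9]*\.?[0-9]+)", active)
            if match and float(match.group(1)) < limit:
                warnings.append(f"{key}={match.group(1)} is below {limit:g}")
    return warnings
```

A possible use is `flag_aggressive_settings(open("XDS.INP").read())` before archiving a processing run, so that such settings are at least recorded next to the resulting statistics.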

People always seem to find ways to beautify their precision indicators, but they are just fooling themselves, because rejecting data just for cosmetic reasons creates bias. In other words, they trade random error against systematic error. Guess what is worse. A deeper reason for the problem is that crystallographers have been fixated on data R-factors for decades, and have become really spoilt by this. Our science has been completely misled when it comes to data statistics, and is recovering only slowly.

Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I know of no systematic studies in this respect. But I know one thing: it is better to be critical with respect to recipes, than to follow them blindly. So I suggest the following project: compare SAD structure solution with the following routes
a) INTEGRATE -> CORRECT scaling -> SHELXD
b) INTEGRATE -> AIMLESS scaling -> SHELXD
c) INTEGRATE -> CORRECT+AIMLESS scaling -> SHELXD
d) INTEGRATE -> CORRECT but scaling switched off -> AIMLESS scaling -> SHELXD
e) INTEGRATE -> CORRECT scaling -> AIMLESS but scaling switched off -> SHELXD
and report here.
You can add XSCALE into the mix but that won't change the picture, since it does the exact same calculations for multiple datasets as CORRECT does for single datasets.
Personally, I don't understand why people would _want_ to do c), d) or e), because that just adds complexity and additional sources of error.
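
For anyone setting up this comparison, routes a) to e) can be written down as data, which makes it easy to see which of them scale the intensities more than once. This is only a bookkeeping sketch in Python; the dictionary layout and names are my own and do not correspond to any program's input.

```python
# For each route: which programs run after INTEGRATE, and which of them
# actually fit a scaling model (as opposed to only merging or converting).
ROUTES = {
    "a": {"programs": ("CORRECT",), "scaling_in": ("CORRECT",)},
    "b": {"programs": ("AIMLESS",), "scaling_in": ("AIMLESS",)},
    "c": {"programs": ("CORRECT", "AIMLESS"), "scaling_in": ("CORRECT", "AIMLESS")},
    "d": {"programs": ("CORRECT", "AIMLESS"), "scaling_in": ("AIMLESS",)},
    "e": {"programs": ("CORRECT", "AIMLESS"), "scaling_in": ("CORRECT",)},
}


def scaling_passes(route):
    """Number of programs in the route that fit a scaling model."""
    return len(ROUTES[route]["scaling_in"])


def doubly_scaled_routes():
    """Routes in which the data are scaled more than once."""
    return [name for name in ROUTES if scaling_passes(name) > 1]
```

Only route c) applies two scaling models in sequence; c), d) and e) all run two programs, which is the added complexity referred to above.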

I'm looking forward to the results of such studies!

Kay

On Wed, 12 Nov 2014 12:41:28 -0500, wtempel <[log in to unmask]> wrote:

>Hello Kay,
>you said the o-word, and you are familiar with the inner workings of XDS.
>Has the data-to-parameter ratio in even complex scaling models become so
>small that a doubling (worst case) of model parameters would be a serious
>concern? Could one detect such overfitting by, say, comparing (molecular)
>model R-factors between refinement against the once (CORRECT) scaled or
>twice (CORRECT+AIMLESS) scaled data?
>Thank you,
>Wolfram
>
>On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs <[log in to unmask]> wrote:
>
>> Hi Tim,
>>
>> this is incorrect.
>>
>> XSCALE determines the relative scale and B in a first step (this is what
>> you describe).
>>
>> It then, in a second step, re-determines all scale factors (exactly as
>> CORRECT does for the individual data sets), at the exact same supporting
>> points that CORRECT used. (This avoids over-fitting which would result
>> from a scaling model with different basis functions; a worry that I have
>> when people use SCALA/AIMLESS after CORRECT without taking precautions.)
>> The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
>> ABSORP*.cbf for inspection.
>>
>> Thirdly, it produces statistics and writes output files.
>>
>> best,
>>
>> Kay
>>
>>
>> On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene <[log in to unmask]> wrote:
>>
>> >Dear Wolfram Tempel,
>> >
>> >there might be some confusion about terms.
>> >
>> >It is correct that xscale scales several data sets together. However,
>> >in crystallography, 'merging' might be the better term for this process.
>> >
>> >Crystallographic 'Scaling' is far more complicated than 'merging'. It
>> >applies correction factors which try to make up for experimental
>> >errors in your data set. These corrections include the sigma-values,
>> >which is particularly important for experimental phasing. In that
>> >respect it can actually hamper the data quality if you
>> >(crystallographically) scale your data twice, although the effect is
>> >rather subtle.
>> >
>> >CORRECT carries out these corrections, hence CORRECT scales your data
>> >set, while XSCALE does not repeat this step - it "only" merges your
>> >data in the sense that it puts your data on a common scale. This is
>> >the application of a not too difficult mathematical formula (which is
>> >listed in the xds wiki, but I don't remember the URL).
>> >
>> >Regards,
>> >Tim
>> >
>> >On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
>> >>
>> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
>> >>
>> >> XSCALE
>> >> <http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html>
>> >> is the scaling program of the XDS suite. It scales reflection files
>> >> (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
>> >> step of XDS already scales an individual dataset, XSCALE is only
>> >> /needed/ if several datasets should be scaled relative to one another.
>> >> However, it does not deteriorate a dataset if it is "scaled again"
>> >> in XSCALE, since the supporting points of the scalefactors are at
>> >> the same positions in detector and batch space. The advantage of
>> >> using XSCALE for a single dataset is that the user can specify the
>> >> limits of the resolution shells.
>> >>
>> >> _Scaling with scala/aimless_
>> >>
>> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29
>> >>
>> >> -Sudhir
>> >>
>> >>
>> >> *************************** Sudhir Babu Pothineni GM/CA @ APS 436D
>> >> Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439
>> >>
>> >> Ph : 630 252 0672
>> >>
>> >>
>> >>
>> >>
>> >> On 11/11/14 14:42, wtempel wrote:
>> >>> Thank you Boaz. So if CORRECT can do a fully corrected scaling,
>> >>> are there no corrections that XSCALE might apply to XDS_ASCII.HKL
>> >>> data that are beyond CORRECT's capabilities? Wolfram
>> >>>
>> >>>
>> >>> On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
>> >>> <[log in to unmask] <mailto:[log in to unmask]>> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I actually choose the option 'constant' further down in the
>> >>> aimless gui but I guess the effect is similar to 'onlymerge'.
>> >>>
>> >>> Boaz
>> >>>
>> >>> Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
>> >>> of the Negev Beer-Sheva 84105 Israel
>> >>>
>> >>> E-mail: [log in to unmask] Phone: 972-8-647-2220 Skype: boaz.shaanan
>> >>> Fax: 972-8-647-2992 or 972-8-646-1710
>> >>>
>> >>> ------------------------------------------------------------------------
>> >>> *From:* CCP4 bulletin board [[log in to unmask]] on behalf of wtempel [[log in to unmask]]
>> >>> *Sent:* Tuesday, November 11, 2014 9:50 PM
>> >>> *To:* [log in to unmask]
>> >>> *Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS
>> >>>
>> >>> Hello all, in a discussion
>> >>> <https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307&L=CCP4BB&H=1&P=186901>
>> >>> on this board, Kay Diederichs questioned the effect of scaling
>> >>> data in AIMLESS after prior scaling in XDS (CORRECT). I
>> >>> understand that the available alternatives in this workflow are
>> >>> to specify the AIMLESS ‘onlymerge’ command, or not. Are there any
>> >>> arguments for the preference of one alternative over the other?
>> >>> Thank you for your insights, Wolfram Tempel
>> >>>
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >Dr Tim Gruene
>> >Institut fuer anorganische Chemie
>> >Tammannstr. 4
>> >D-37077 Goettingen
>>
>