James Holton wrote:
> Frank von Delft wrote:
>>> So, what statistic do we want to look at? That depends on what you
>>> are trying to do with the data. There is no way for Phil to know
>>> this, so it is good that he prints out lots of different
>>> statistics. That said, when talking about the data quality
>>> requirements for structure solution by MAD/SAD, I suggest looking at
>>> I/sigma(I) where:
>>> I - merged intensity (proportional to photons) assigned to a
>>> reciprocal lattice point (hkl index)
>> Does ANY program print this out...?
> SCALA calls this "Mn(I/sd)". Sounds like d*TREK calls it "I/sig avg".
> With HKL you compute it "by hand" from the average I and average
> "error". Not sure about XDS...
>
XDS, like SCALA and d*TREK, gives both quantities, but in different tables.
The unaveraged I/Sigma values are in a table, finely binned in
resolution, at the beginning of CORRECT.LP. The Sigma values in that
table are corrected to match the RMS scatter, as Phil explained for SCALA.
The table that describes the averaged (suitably weighted) data is
repeated several times. It is more coarsely binned in resolution
(9 shells, plus overall); a user who wants this table in finely binned
form can use XSCALE.
Both types of table are printed the same way: first the definitions of
the quantities in the table are given, and then the table itself.
Specifically, the heading of the table for the unaveraged data looks
like this:
   I/Sigma  = mean intensity/Sigma of a reflection in shell
   Chi^2    = goodness of fit between sample variances of
              symmetry-related intensities and their errors
              (Chi^2 = 1 for perfect agreement)
   R-FACTOR
   observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))
   expected = expected R-FACTOR derived from Sigma(I)
   NUMBER   = number of reflections in resolution shell
              used for calculation of R-FACTOR
   ACCEPTED = number of accepted reflections
   REJECTED = number of rejected reflections (MISFITS),
              recognized by comparison with symmetry-related
              reflections.
and then the table itself is:
 RESOLUTION RANGE   I/Sigma  Chi^2  R-FACTOR  R-FACTOR  NUMBER ACCEPTED REJECTED
                                    observed  expected
   39.660   19.587     8.23   0.96      6.36      7.12     929      940       75
   19.587   14.780     7.39   0.88      5.94      7.46    1956     1959       66
.... (many resolution shells deleted for brevity)
====
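To make the definitions in that first heading concrete, here is a minimal Python sketch of the two per-observation statistics. It is not XDS code. It assumes that each observation is a hypothetical tuple (hkl, I, Sigma) with hkl already mapped to the asymmetric unit, that "I/Sigma" is read as the mean of I/Sigma over the individual observations, and that I(h) is the plain (unweighted) mean of the symmetry-related observations; the function name unmerged_stats is made up for this example.

```python
from collections import defaultdict


def unmerged_stats(observations):
    """Return (mean I/Sigma, observed R-factor) for unmerged observations.

    observations: iterable of (hkl, intensity, sigma); hkl is assumed to be
    already reduced to the asymmetric unit (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for hkl, intensity, sigma in observations:
        groups[hkl].append((intensity, sigma))

    # Mean of I/Sigma taken over the individual (unaveraged) observations.
    ratios = [i / s for obs in groups.values() for i, s in obs]
    mean_i_over_sigma = sum(ratios) / len(ratios)

    # observed R-FACTOR = SUM(ABS(I(h,i) - I(h))) / SUM(I(h,i)),
    # using only reflections that were measured at least twice.
    num = 0.0
    den = 0.0
    for obs in groups.values():
        if len(obs) < 2:
            continue
        mean_i = sum(i for i, _ in obs) / len(obs)  # unweighted mean (assumed)
        num += sum(abs(i - mean_i) for i, _ in obs)
        den += sum(i for i, _ in obs)
    r_observed = num / den if den else float("nan")

    return mean_i_over_sigma, r_observed


# Toy data: one reflection measured twice, one measured once.
toy = [((1, 0, 0), 100.0, 10.0), ((1, 0, 0), 120.0, 10.0), ((2, 0, 0), 50.0, 10.0)]
mean_ratio, r_obs = unmerged_stats(toy)
```

Note that the R-factor is printed in percent in the table, whereas the sketch returns a fraction.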
and later it gives the table for the averaged intensities with heading
   R-FACTOR
   observed = (SUM(ABS(I(h,i)-I(h))))/(SUM(I(h,i)))
   expected = expected R-FACTOR derived from Sigma(I)
   COMPARED = number of reflections used for calculating R-FACTOR
   I/SIGMA  = mean of intensity/Sigma(I) of unique reflections
              (after merging symmetry-related observations)
   Sigma(I) = standard deviation of reflection intensity I
              estimated from sample statistics
   R-meas   = redundancy independent R-factor (intensities)
   Rmrgd-F  = quality of amplitudes (F) of this data set
   For definition of R-meas and Rmrgd-F see
   Diederichs & Karplus (1997), Nature Struct. Biol. 4, 269-275.
(rest of heading deleted for brevity)
and the table itself is
 SUBSET OF INTENSITY DATA WITH SIGNAL/NOISE >= -3.0 AS FUNCTION OF RESOLUTION
 RESOLUTION   NUMBER OF REFLECTIONS   COMPLETENESS R-FACTOR R-FACTOR COMPARED I/SIGMA  R-meas Rmrgd-F Anomal SigAno  Nano
      LIMIT OBSERVED  UNIQUE POSSIBLE      OF DATA observed expected                                    Corr
       6.66    12698    5958    10069        59.2%     5.3%     6.7%    11577   10.55    6.8%    5.5%   -27%  0.740   527
       4.74    22569   11140    17519        63.6%     7.3%     7.8%    19592    8.24    9.5%    9.1%   -25%  0.734   629
       3.88    28199   14683    22445        65.4%     7.9%     7.7%    23437    7.88   10.3%   10.6%   -31%  0.769   449
       3.37    34407   17986    26530        67.8%    12.3%    12.0%    28131    5.25   16.1%   20.6%   -19%  0.777   351
       3.01    39636   20921    29958        69.8%    22.7%    23.3%    31896    3.08   29.8%   42.6%   -12%  0.644   211
(rest deleted for brevity)
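For comparison with the merged table, here is a minimal Python sketch of I/SIGMA after merging and of R-meas. It is not XDS code. The message only says the averaged data are "suitably weighted" and that Sigma(I) is estimated from sample statistics; the sketch substitutes a plain inverse-variance weighted mean with propagated Sigma, which is an assumption, as are the input tuples (hkl, I, Sigma) and the function name merged_stats. The R-meas expression follows the Diederichs & Karplus (1997) definition: each reflection's deviations are scaled by sqrt(n/(n-1)), with n its number of observations.

```python
import math
from collections import defaultdict


def merged_stats(observations):
    """Return (mean I/Sigma of merged reflections, R-meas).

    observations: iterable of (hkl, intensity, sigma); hkl is assumed to be
    already reduced to the asymmetric unit (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for hkl, intensity, sigma in observations:
        groups[hkl].append((intensity, sigma))

    ratios = []
    num = 0.0
    den = 0.0
    for obs in groups.values():
        # Inverse-variance weighted mean and its propagated Sigma
        # (a stand-in for whatever weighting the program really uses).
        weights = [1.0 / s ** 2 for _, s in obs]
        w_sum = sum(weights)
        i_merged = sum(w * i for w, (i, _) in zip(weights, obs)) / w_sum
        sigma_merged = math.sqrt(1.0 / w_sum)
        ratios.append(i_merged / sigma_merged)

        # R-meas contribution; needs at least two observations.
        n = len(obs)
        if n >= 2:
            mean_i = sum(i for i, _ in obs) / n
            num += math.sqrt(n / (n - 1)) * sum(abs(i - mean_i) for i, _ in obs)
            den += sum(i for i, _ in obs)

    mean_merged_ratio = sum(ratios) / len(ratios)
    r_meas = num / den if den else float("nan")
    return mean_merged_ratio, r_meas


# Toy data: one reflection measured twice, one measured once.
toy_obs = [((1, 0, 0), 100.0, 10.0), ((1, 0, 0), 120.0, 10.0), ((2, 0, 0), 50.0, 10.0)]
merged_ratio, r_meas = merged_stats(toy_obs)
```

Under these assumptions, merging n observations of equal Sigma raises I/Sigma by a factor of sqrt(n), which is why the two tables report different numbers for the same data.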
So, the program indicates quite clearly what the statistics refer to.
My personal experience is that very few people actually read the heading
of the tables, but there is little one can do about this.
HTH,
Kay