Hi Mark
I think you need to distinguish between the mechanics of the
refinement software on the one hand and the effect on the statistics
on the other. I think you are referring to the former: in the
software the restraints are, as you say, treated exactly like the
X-ray observations; they appear to augment the observations, and
clearly do not reduce the *actual* number of parameters in any way
(unlike constraints, which do). Unfortunately this train of thought
leads nowhere, because even though in the software restraints and
observations appear to be equivalent, 1 restraint is in no way
*statistically* equivalent to 1 X-ray observation. We can make
progress in understanding the statistics however if we consider the
*effective* number of parameters which turns out to be (see the paper
that Ed referred to for the proof):
m_eff = m - r + Drest
where m is the actual number of parameters, r is the number of
restraints and Drest is a kind of correction for the fact that 1
parameter is not equivalent to 1 restraint (Drest depends on r in a
complicated way; it's actually the contribution of the restraints to
the least-squares residual, or equivalently to the negative
log-likelihood, so 'good' restraints increase Drest less than 'bad'
ones). In other words adding restraints does indeed have the effect
of reducing the *effective* number of parameters (though not 1-for-1
since Drest also varies as you add restraints). We need m_eff in
order to compute the ratio (no of observations) / (effective no of
parameters).
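To make this concrete, here is a quick back-of-the-envelope sketch in
Python; the numbers for m, r, Drest and the working-set size are
invented purely for illustration, not taken from any real refinement:

```python
def effective_params(m, r, d_rest):
    """Effective number of parameters: m_eff = m - r + Drest.

    m      : actual number of refined parameters
    r      : number of restraints
    d_rest : contribution of the restraints to the least-squares
             residual (equivalently the negative log-likelihood);
             'good' restraints contribute less than 'bad' ones.
    """
    return m - r + d_rest

# Hypothetical numbers, purely for illustration:
m, r, d_rest = 8000, 6000, 1500
f = 20000  # size of the working set (number of X-ray observations)

m_eff = effective_params(m, r, d_rest)  # 8000 - 6000 + 1500 = 3500
x = f / m_eff  # effective obs/param ratio, here about 5.7
print(m_eff, x)
```

Note that adding restraints (increasing r) lowers m_eff, but not
1-for-1, since Drest grows as well.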
The expected Rfree/R (i.e. the expectation is predicated on the
assumption that the parameter refinement is at a global minimum whose
position in parameter space is a function of the weights you used) is
then sqrt((f + m_eff) / (f - m_eff)), where f is the size of the
working set. This can be written as sqrt((x + 1) / (x - 1)), where
x = f / m_eff, i.e. the effective obs/param ratio. This shows the direct
relationship between <Rfree/R> and the effective obs/param ratio; for
example you can see what happens as x tends towards unity on the one
hand and towards infinity on the other!
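The limiting behaviour is easy to see numerically; here is a small
sketch (the x values are arbitrary, chosen only to show the trend):

```python
import math

def expected_rfree_over_r(f, m_eff):
    """Expected Rfree/R = sqrt((f + m_eff) / (f - m_eff))."""
    return math.sqrt((f + m_eff) / (f - m_eff))

def expected_ratio_from_x(x):
    """The same quantity written in terms of x = f / m_eff."""
    return math.sqrt((x + 1) / (x - 1))

# As x tends to 1 (as many effective parameters as observations) the
# ratio blows up; as x tends to infinity it falls towards 1, i.e.
# Rfree approaches R.
for x in (1.1, 2.0, 5.0, 100.0):
    print(x, expected_ratio_from_x(x))
```

For example at x = 2 (two observations per effective parameter) the
expected Rfree/R is sqrt(3), about 1.73.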
Cheers
-- Ian
On Fri, Apr 9, 2010 at 9:31 PM, Mark J. van Raaij <[log in to unmask]> wrote:
> Hi All,
> in a paper (which I can't locate now...) which I read recently it was stated
> that restraints do not reduce the number of parameters, rather they augment
> the number of data points (so strong restraints are like strong data, weak
> restraints weak data...). Only strict NCS constraints, where the copies have
> to stay exactly the same, would reduce the number of parameters. Both
> augment the data to parameter ratio, of course. I really liked this
> explanation.
> Mark
> On 9 April 2010 21:54, Ian Tickle <[log in to unmask]> wrote:
>>
>> Hi Ed
>>
>> It's very difficult to deal theoretically with NCS because, unlike
>> bond lengths where the uncertainties are known a priori (at least in
>> principle), with NCS you don't know the uncertainties a priori, if you
>> see what I mean (rather like unknown unknowns!). In other words the
>> optimal weights and hence the effective number of parameters will
>> depend on the exactness of the NCS. In practice you can of course
>> determine the weights by minimising Rfree w.r.t. them. So I think it
>> would be quite difficult to do what you are proposing, i.e. to
>> disentangle the effects of the obs/param ratio and any effect of
>> correlation of the working & test sets. Interesting problem though!
>>
>> BTW I think you are mis-quoting the formula in the paper, it should be
>> <Rfree/R> = sqrt((Nobs+Nparam)/(Nobs-Nparam)).
>>
>> In other words R is reduced below its expected value in the absence of
>> random error, by overfitting the errors in the working set, but people
>> tend to forget that the test set also has, on average, random errors
>> of the same magnitude which tend to increase Rfree *above* its
>> expected value.
>>
>> Cheers
>>
>> -- Ian
>>
>>
>> On Fri, Apr 9, 2010 at 8:25 PM, Edward A. Berry <[log in to unmask]>
>> wrote:
>> > Has anyone looked theoretically at how ncs-restraints affect
>> > the expected Rfree/R ratio?
>> >
>> > Tickle et al., Acta Cryst. (1998). D54, 547-557
>> > concluded Rfree/R = sqrt(Nobs/(Nobs-Nparam)) .
>> > He suggested that, with restrained refinement of coordinates
>> > plus individual isotropic B-factors, the effective number
>> > of parameters per atom is two. If we add strong N-fold NCS
>> > restraints on coordinates and B-factor, does that effectively
>> > reduce the number of parameters by a factor of N?
>> > Giving 2/N for parameters per atom?
>> >
>> > I'm curious how much of the drop in the r-free ratio observed
>> > on enforcing NCS is due to the reduction in the effective
>> > number of parameters, and how much is due to linking reflections
>> > in the free set with the working set. Given an expression to
>> > predict the effect of reducing number of parameters, seeing
>> > how much of the actual drop in Rfree/R it accounts for
>> > would let us see how severe the linkage problem is.
>> >
>> > Ed
>> >
>
>
>
> --
> Mark J van Raaij
> http://webspersoais.usc.es/mark.vanraaij
> http://www.ibmb.csic.es
>