Hi, I don't buy that, at least not speaking generally about curve fitting.  Extend your argument to 2 observations.  If the observations are exact, and assuming that the linear model is the correct one, then the parameters of the model will be determined exactly.  Now suppose that the observations are imprecise: then the parameters will also be imprecise.  Now suppose we have many more imprecise observations: this will improve the precision of the parameters.  This is exactly what we do in crystallography: compensate for the lack of precision of the observations by measuring more observations than we would need if the observations were precise.  This is of course called 'over-determination', i.e. we make the observation:parameter ratio as high as is practically feasible to compensate for imprecise observations.
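(Not part of the original mail: a minimal numerical sketch of the point above, with made-up numbers. Fitting the same straight line y = a*x + b with the same per-observation noise, the estimated parameter uncertainties shrink as the number of observations grows.)

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_line_param_sd(n_obs, noise_sd=1.0, a=2.0, b=1.0):
    """Fit y = a*x + b to n_obs noisy observations and return the
    estimated standard deviations of the two fitted parameters."""
    x = np.linspace(0.0, 10.0, n_obs)
    y = a * x + b + rng.normal(0.0, noise_sd, n_obs)
    # polyfit returns the parameter covariance matrix when cov=True
    _, cov = np.polyfit(x, y, deg=1, cov=True)
    return np.sqrt(np.diag(cov))

sd_small = fit_line_param_sd(10)     # 10 imprecise observations
sd_large = fit_line_param_sd(1000)   # 1000 equally imprecise observations
# More observations of the same (im)precision -> more precise parameters
print(sd_small, sd_large)
```

With 100x more observations the parameter uncertainties shrink roughly 10-fold, which is exactly the over-determination trade-off described above.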

In practice of course it's not quite as simple as that: note the huge 'if' above, "assuming that the linear model is the correct one".  What if the linear model is not correct?  Suppose, say, the relationship between model and observations is actually quadratic, or a higher-order polynomial.  This is the kind of thing that happens in MX: it's not the errors in the observations that kill you, it's the errors in the model (that's why we get R values of ~ 0.1 to 0.2, not < 0.05 as you would expect if the only errors were in the observations).
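(Again not from the original mail, just an illustration with invented numbers: if the truth is quadratic but we insist on fitting a line, an R-value-like relative residual hits a floor set by the model error, no matter how precise the observations are.)

```python
import numpy as np

def relative_residual(noise_sd, n_obs=200):
    """Fit a straight line to data whose true relationship is quadratic,
    and return an R-value-like misfit: sum|y_obs - y_calc| / sum|y_obs|."""
    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, n_obs)
    y = 1.0 + x + 4.0 * x**2 + rng.normal(0.0, noise_sd, n_obs)  # quadratic truth
    coeffs = np.polyfit(x, y, deg=1)   # deliberately wrong (linear) model
    y_calc = np.polyval(coeffs, x)
    return np.sum(np.abs(y - y_calc)) / np.sum(np.abs(y))

r_exact = relative_residual(1e-6)  # essentially perfect observations
r_noisy = relative_residual(0.1)   # realistically noisy observations
print(r_exact, r_noisy)
```

Even with essentially error-free observations the misfit does not go to zero: the residual floor comes from the wrong functional form, not from the data.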

Note that I really do mean "errors in the model", i.e. the wrong model, such as using a linear equation when it should be quadratic, or using isotropic B factors when they are actually anisotropic, as well as a host of other simplistic assumptions that we make about the nature of crystal structures.  I do NOT mean "errors in the parameters of the model" (which some people seem to think it means!): the errors in the parameters of the model have no meaning if they're the wrong parameters in the first place!  So again over-determination is essential to compensate for the errors in the model in order to minimise over-fitting, where the parameters take up the errors from having an incorrect model.
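(An editor's sketch, not from the thread, with made-up numbers: over-fitting in miniature. A 10-parameter polynomial fitted to only 10 noisy observations reproduces the data essentially perfectly, but the parameters have soaked up the noise and the fit is far from the truth; the same model fitted to 200 observations, i.e. over-determined, behaves much better.)

```python
import numpy as np

rng = np.random.default_rng(2)

def train_test_misfit(n_obs, deg=9, noise_sd=0.1):
    """Fit a degree-`deg` polynomial to n_obs noisy samples of a smooth
    truth (a sine wave) and return (RMS residual against the data,
    RMS error against the noise-free truth on a dense grid)."""
    x = np.linspace(0.0, 1.0, n_obs)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise_sd, n_obs)
    coeffs = np.polyfit(x, y, deg)
    x_dense = np.linspace(0.0, 1.0, 500)
    data_rms = np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2))
    truth_rms = np.sqrt(np.mean(
        (np.polyval(coeffs, x_dense) - np.sin(2.0 * np.pi * x_dense)) ** 2))
    return data_rms, truth_rms

train10, true10 = train_test_misfit(10)     # o/p ratio = 1: over-fitting
train200, true200 = train_test_misfit(200)  # o/p ratio = 20: over-determined
print(train10, true10, train200, true200)
```

The under-determined fit shows the over-fitting signature: a near-zero residual against the data combined with a large error against the truth, because the parameters have taken up the observation errors.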

So you're right in saying that in the particular case of MX, where there are big errors in the models, it's not so much the precision of the observations that matters.

Cheers

-- Ian


On 4 February 2016 at 20:28, Ethan A Merritt <[log in to unmask]> wrote:

On Thursday, 04 February, 2016 20:09:30 Keller, Jacob wrote:
> It seems to me that the oft-rehearsed requirement of certain data:parameter
> ratios depends highly on the precision of the measurements (nothing novel
> here), so a measure of "information," rather than either a simple ratio or
> an empirically-based rule of thumb, might be the best guide in deciding
> which parameters to model.

This is not true. The desire for a large observation:parameter ratio has
nothing to do with the precision of the observations.
Consider: a single observation is insufficient to fit a line (ax+b)
no matter how precise that observation may be.

Ethan

--
Ethan A Merritt
Biomolecular Structure Center, K-428 Health Sciences Bldg
MS 357742, University of Washington, Seattle 98195-7742