Hi Tim,

your examples are valid and valuable, and clearly exemplify existing problems, limitations as well as common misconceptions.

However, if you follow the mathematics and its strict definitions, then crystallographic structure refinement is nothing but an optimization problem. To define it you fundamentally need: a) a model parameterization, b) a function that relates the experimental data to the model parameters, and c) a method that changes the model parameters in such a way that it optimizes (most of the time, minimizes) the function chosen in step b).
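
The three ingredients can be sketched on a toy problem. This is only an illustration under stated assumptions, not how any refinement program is implemented: the two-parameter linear model, the least-squares target, the plain gradient descent and all function names are hypothetical choices made for this example.

```python
# (a) Model parameterization: a toy model with two parameters, k and b.
#     (Assumption: a real crystallographic model has coordinates, B-factors, etc.)
def model(params, x):
    k, b = params
    return k * x + b

# (b) Target function relating the data to the model parameters.
#     (Assumption: plain least squares; real programs may use other targets.)
def target(params, xs, ys):
    return sum((model(params, x) - y) ** 2 for x, y in zip(xs, ys))

# (c) A method that changes the parameters so that the target decreases:
#     gradient descent with a central-difference numerical gradient.
def gradient(params, xs, ys, h=1e-6):
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += h
        down = list(params)
        down[i] -= h
        grads.append((target(up, xs, ys) - target(down, xs, ys)) / (2 * h))
    return grads

def minimize(params, xs, ys, step=0.01, n_iter=5000):
    params = list(params)
    for _ in range(n_iter):
        g = gradient(params, xs, ys)
        params = [p - step * gi for p, gi in zip(params, g)]
    return params

# Synthetic "experimental data" generated from k = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

refined = minimize([0.0, 0.0], xs, ys)
print(refined, target(refined, xs, ys))
```

This toy target has a single minimum, so the optimizer finds it from any starting point. The difficulty with macromolecules is that the target profile is far more complex.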

Please don't think that I've just made up or invented these "a)-b)-c)" steps above. In fact, this has been published, for example, in
Acta Cryst. (1985). A41, 327-333, 
and then reiterated using modern jargon, for example, in 
Acta Cryst. (2012). D68, 352-367.

(I say "for example" above just to stick to the context and also point out that you can find more examples in crystallographic literature as well as in totally different disciplines such as economics, aerospace science etc.)

Anyways, once all the above (a-b-c) are set and defined, then your only goal is as "simple" as finding the global minimum of the function that you have chosen to optimize.

Anything else beyond that is either a technical detail or an inefficiency related to improper model parameterization, an improper choice of target, or a limited optimization tool.

All the best,
Pavel


On Fri, Nov 28, 2014 at 11:40 AM, Tim Gruene wrote:
Dear Pavel,

there is a beautiful paper called 'Where freedom is given, liberties are
taken' by Kleywegt and Jones, but also a wide variety of articles that
(fortunately) fought hard for the introduction of Rfree to the
(macro-)crystallographic community.

Mentioned in there is an amino acid chain threaded backwards into the
density which, after refinement, achieved a lower R-value than the
original structure. Since this was achieved with refinement, the
backwards-threaded structure was closer to the global minimum than the
original one. Apparently none of these authors had an idea how to modify
the target function so that this would not happen, which is why they
suggested using cross-validation to avoid it.

If you don't like this line of thought, I can offer a different one:

there is a vast number of sets of parameters that ideally fit your data:
fill your asymmetric unit randomly with atoms so that your data to
parameter ratio is 1 or lower. Refine unrestrained and you are going to
end up with an R-value of 0. For unrestrained refinement, the formula
for the R-value corresponds (maybe not for maximum likelihood based
target functions, you may have to do some translation here) to the
target function, which usually has a lower bound of zero, hence this
vast number of "structures" have all reached the global minimum. Note
that the deposited structure has an R-value much greater than 0, i.e. it
is far away from the global minimum.
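
A toy version of this argument, as a sketch only: the linear model below (2 observations, 3 parameters) is a hypothetical stand-in for structure factors, which in reality depend non-linearly on the atomic parameters. It only illustrates that with a data to parameter ratio below 1, different starting points refine to different parameter sets that all reach a target and R-value of zero.

```python
# Hypothetical linear "structure": 2 observations, 3 free parameters.
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]
obs = [3.0, 2.0]

def calc(p):
    # "Calculated data" for a given parameter set.
    return [sum(a * pi for a, pi in zip(row, p)) for row in A]

def target(p):
    # Unrestrained least-squares target, bounded below by zero.
    return sum((c - o) ** 2 for c, o in zip(calc(p), obs))

def r_value(p):
    # R = sum|obs - calc| / sum|obs|
    num = sum(abs(o - c) for c, o in zip(calc(p), obs))
    return num / sum(abs(o) for o in obs)

def refine(p, step=0.05, n_iter=2000):
    # Plain gradient descent on the unrestrained target.
    p = list(p)
    for _ in range(n_iter):
        resid = [c - o for c, o in zip(calc(p), obs)]
        grad = [2.0 * sum(A[j][i] * resid[j] for j in range(len(obs)))
                for i in range(len(p))]
        p = [pi - step * g for pi, g in zip(p, grad)]
    return p

model_1 = refine([0.0, 0.0, 0.0])
model_2 = refine([5.0, -3.0, 4.0])

print(model_1, r_value(model_1))
print(model_2, r_value(model_2))
```

Both runs end at the lower bound of the target, yet the two parameter sets are clearly different, so the data alone cannot tell them apart.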

In order to improve the situation, one modifies the target function by
adding restraints. They increase the target value of all "structures",
but in general those for the arbitrary solutions increase so much more
than that for an acceptable solution that most of those are lifted above
that of an acceptable solution.
As an example, one of the structures for the yeast polymerase I contains
about 34,500 atoms, i.e. the target function is minimised in a 138,000
dimensional space (four parameters per atom). I don't think there is a
proof that any set of restraints is ever so ideal that all false
solutions are lifted above the target value of the accepted solution. In
fact, without being able to prove it, I doubt that this is the case,
which led me to the claim below that we don't necessarily want to reach
the global minimum of the target function.
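
The effect of restraints described above can be sketched on a toy problem. Everything here is a hypothetical assumption for illustration: the linear data term, the restraint (neighbouring parameters should be equal, loosely analogous to a geometry restraint), the weight, and the two candidate parameter sets. It shows one arbitrary solution being lifted above an acceptable one. It says nothing about whether every false solution is lifted.

```python
# Hypothetical linear data term: 2 observations, 3 parameters.
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]
obs = [3.0, 2.0]

def data_term(p):
    calc = [sum(a * pi for a, pi in zip(row, p)) for row in A]
    return sum((c - o) ** 2 for c, o in zip(calc, obs))

def restraint_term(p, ideal_diff=0.0):
    # Hypothetical restraint: neighbouring parameters should differ by ideal_diff.
    return sum((p[i + 1] - p[i] - ideal_diff) ** 2 for i in range(len(p) - 1))

def restrained_target(p, weight=1.0):
    # Restraints are added to the data term, so they can only raise the target.
    return data_term(p) + weight * restraint_term(p)

acceptable = [1.0, 1.0, 1.0]    # fits the data and obeys the restraint
arbitrary = [-5.0, 4.0, -2.0]   # fits the data equally well, violates the restraint

for name, p in (("acceptable", acceptable), ("arbitrary", arbitrary)):
    print(name, data_term(p), restrained_target(p))
```

Both candidates have a data term of exactly 0, so the unrestrained target cannot distinguish them. With the restraint added, the acceptable candidate stays at 0 while the arbitrary one rises to 117.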

Of course an acceptable structure actually may have a target value
representing a global minimum, but I don't think this is always true.

Best,
Tim

On 11/28/2014 05:42 PM, Pavel Afonine wrote:
> Hi Tim,
>
> you don't necessarily want to find the global minimum (...)
>
>
> this contradicts the definition of crystallographic structure refinement.
> If finding the global minimum is not what you ultimately want then either
> the refinement target or model parameterization are poor.
>
> Clearly, given the complexity of the refinement target function profile (in
> the case of macromolecules) we are unlikely to reach the global minimum;
> however, reaching it is what we aim for (by definition and construction of
> refinement programs).
>
> Pavel
>

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A