Dear Rasmus and Tim,
I hope I don't annoy you too much, but I'm still convinced
that r^-1/6 averages are not the thing that one should be
interested in. IMHO there is 'One Best Way To Do It', and I hope I
can persuade you of my point of view, especially since the Resonance
object concept in the CCPN data model makes it relatively easy to
do it right.
I agree that it's a complicated situation with different kinds
of distances, distance averages, different kinds of ambiguities.
Let's not make it overly complicated by confusing inexperienced
users with flawed concepts like r^-1/6 averages.
The r^-1/6 sum should be seen as a convenient representation of the NOE
intensity (measured or calculated) in a familiar unit. When there are
contributions from multiple pairs to the NOE, these contributions always add up
and are never averaged.
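
As a minimal sketch of what "adding up" means here: each pair contributes r^-6 to the intensity, and the total is converted back into a distance unit. The function name `noe_sum` and the 3.0 A values are made up for illustration and are not part of Analysis or the CCPN API.

```python
def noe_sum(distances):
    """r^-1/6 sum: add the r^-6 contributions, convert back to a distance."""
    total = sum(r ** -6 for r in distances)
    return total ** (-1.0 / 6.0)


# One proton at 3.0 A versus three equivalent protons, each at 3.0 A
# (illustrative values only).
single = noe_sum([3.0])
methyl = noe_sum([3.0, 3.0, 3.0])
print(round(single, 3), round(methyl, 3))
```

More contributions always give a stronger NOE, so the summed distance can only get shorter as pairs are added.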
While typing this I realized there *is* a use case for r^-1/6 averages,
and that is ensemble and/or time averaging. But then the averaging is only
over distances between identical pairs in different models, and never over the
different components of a distance restraint (and for time averaging r^-1/3 is
supposed to be better).
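
A rough sketch of that one legitimate use, assuming the average is taken as <r^-6>^(-1/6) (or <r^-3>^(-1/3)) over the same atom pair in every model. The function name `ensemble_average`, the `power` parameter and the distances are hypothetical.

```python
def ensemble_average(distances, power=6):
    """<r^-power>^(-1/power) over the SAME atom pair in different models."""
    mean = sum(r ** -power for r in distances) / len(distances)
    return mean ** (-1.0 / power)


# One atom pair measured in three models (made-up values, in Angstrom).
pair_in_models = [2.5, 3.0, 4.0]
r6 = ensemble_average(pair_in_models)           # r^-6 averaging
r3 = ensemble_average(pair_in_models, power=3)  # r^-3 averaging
print(round(r6, 2), round(r3, 2))
```

Both values lie between the shortest and longest distance in the ensemble, with the r^-6 average weighted more strongly towards the short distances.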
Could it be that where you say you want to use the r^-1/6 average
of a set of distances, you are really interested in the _minimum_
distance of a number of possibilities, not actually in any kind of average
over those possibilities?
If you want to evaluate vdW clashes between two groups of atoms (perhaps resonance
objects), you want the minimum distance between the individual atoms, because that is
the number that determines whether there's a violation. It's best to
view vdW clashes as entirely separate from NOE analysis.
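
For the clash check, a minimal sketch could look like this. The name `min_distance` and the coordinates are invented for illustration, and `math.dist` needs Python 3.8 or later.

```python
import math
from itertools import product


def min_distance(group_a, group_b):
    """Shortest atom-atom distance between two groups of (x, y, z) coordinates."""
    return min(math.dist(a, b) for a, b in product(group_a, group_b))


# Made-up coordinates in Angstrom; not taken from any real structure.
group_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
group_b = [(3.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
closest = min_distance(group_a, group_b)
print(closest)
```

No r^-6 weighting enters at all: the clash criterion only depends on the single closest pair.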
In the case of a prochiral ambiguity between e.g. two methyl groups, it's also a
minimum distance that is the relevant number, but now it is actually the minimum
of several r^-1/6 sum distances.
There are a few short comments in between your text, and an example of what
I think should be the default below.
Rasmus Fogh wrote:
> Dear Eiso,
>
> In answer to your questions:
>
> The problem with the r^-1/6 sum is that it does not correspond to any
> distance. It is what the distance would have been if there had been only
> one proton. That is the correct value to compare to the distance
> constraint (which also does not correspond to any real distance in cases
> with multiple assignment). But if I want to get an idea about what is
> going on, it is nice to have at least the option of finding out how far
> away things are in reality. Tim had an example with prochiral methyl
the r^-1/6 average also does not correspond to any distance in 'reality',
certainly not more than the sum, which at least corresponds to a distance
derived from the data. [idem for the geometric average]
> groups: r^-1/6 average 4.2A, r^-1/6 sum 1.8A. Now, if those methyl groups
> were really 1.8A apart they would be in van der Waals contact. They are
> not.
if you want to evaluate vdW clashes, just look at the individual distances,
or the minimum distance if it's between groups with multiple atoms.
There is no special reason here for the r^-1/6 average/sum, geometric average or any other
average to be relevant.
>
> By all means use r^-1/6 sum as the default, I would say, but leave r^-1/6
> average as an alternative for tables and display. What you use for your
> dynamics calculations is another matter.
ok, let's focus on the analysis of distances. I hope we agree that
r^-1/6 average distance restraints should not be used in structure calculations.
>
> Yours,
>
> Rasmus
>
Tim Stevens wrote:
>>>As I remember r^-1/6 sum is used in all calculations, constraint lists,
>>>etc. r^-1/6 average is used in (some of?) the menus and in the structure
>>>viewer. I thought that was actually deliberate. The reason would be that
>>>the r^-1/6 sum of a restraint to e.g. a methyl group would be clearly
>>>shorter than the distance between any two individual protons. For a
>>>methyl-methyl restraint it is even worse.
>>
>>why is that bad? what do you need the individual proton-proton
>>distances for except for calculating the r^-1/6 sum? which is the
>>quantity that should be compared with the distance that is
>>determined from the NOE.
>
>
> The r^-1/6 sum assumes physical ambiguity, i.e. multiple contributions.
>
> This is not always the case.
very true.
>
> As an example, take a constraint between two non-stereospecifically
> resolved prochiral methyl groups. Using a real structure, for 12LeuHdb* -
> 42ValHga*, the NOE sum is 1.850 and the NOE mean is 3.362. In this case
yes, the ratio between the two is fixed for a given number of atoms
in each group: 1.85*(1/(6*6))^(-1/6) = 3.36167
> there is logical ambiguity and there really is only one contribution
> between two methyls. Using an NOE sum here is misleading at best.
not more misleading than the average, I would say. If one of the distances is
exactly equal to the upper bound (so the agreement between model and data is perfect),
the r^-1/6 average will still give a violation w.r.t. the upper bound.
For a 2.0 A NOE with one of the pairs at 2.0 A and one at 3.0 A:
sum : ( 1/2.0^6 + 1/3.0^6 )^(-1/6) == 1.972
ave : (( 1/2.0^6 + 1/3.0^6 )/2)^(-1/6) == 2.214
so the r^-1/6 average calculated from the model violates the restraint.
If the NOE sum is violated, then it follows that there would also be a violation
if the assignments were known. This does not hold for the average.
Isn't it actually the minimum distance (of the 4 r^-1/6 sum distances) that
is the most interesting figure in this case?
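
The two-pair example can be checked with a few lines. This is only a sketch; the function names `noe_sum` and `noe_average` are chosen here for illustration.

```python
def noe_sum(distances):
    """r^-1/6 sum over all contributing pairs."""
    return sum(r ** -6 for r in distances) ** (-1.0 / 6.0)


def noe_average(distances):
    """r^-1/6 average: the same sum, divided by the number of pairs."""
    mean = sum(r ** -6 for r in distances) / len(distances)
    return mean ** (-1.0 / 6.0)


upper_bound = 2.0    # distance derived from the NOE, in Angstrom
pairs = [2.0, 3.0]   # the two candidate pair distances in the model

for name, value in [("sum", noe_sum(pairs)), ("ave", noe_average(pairs))]:
    status = "violated" if value > upper_bound else "ok"
    print(f"{name}: {value:.3f} {status}")
```

The sum can never exceed the shortest contributing distance, so it stays within the bound whenever one pair does. The average is larger than the sum by a fixed factor N^(1/6), with N the number of pairs.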
>
> Also I might just want to do a seemingly simple thing and know an
> approximation to a real distance. Say if comparing to a crystal structure.
For a resonance with equivalent protons the r^-1/6 sum should be used.
For a case of multiple possibilities (mutually exclusive for prochiral-type ambiguities)
the minimum distance is the relevant number.
Only when comparing your data with an ensemble of multiple models
do r^-1/6 averages over the same pair distance in each model make sense.
>
> Sure, we can have the default as NOE sum if that's what people are doing
> most often. But Analysis should not be so restrictive and dictatorial to
> assume that all people would only be interested in working in ARIA-space
> at all times.
>
It's not so much ARIA or not, but supplying the correct approach as the default to
compare distances in the protein model to distances from the NOEs.
CANDID works the same way btw.
> I think Igor's suggestion was a good one, and the two options will stay.
> This also gives the opportunity to not be restricted to the NOE. There are
> other kinds of distance relationships that are used in NMR, thinking
> initially about solid state and HADDOCK-like constraints.
This is a bit vague. HADDOCK exclusively uses the sum (as it should).
Could you give one specific example where the average would be better than the sum
or the minimum?
>
> T.
>
For your example above, I would calculate the distances in the following way:
12LeuHdb* - 42ValHga*,
there are (3+3)*(3+3) = 36 individual interproton distances. Between
methyl groups these are not interesting.
First apply the r^-1/6 sum over the equivalent methyl protons, so that we are left
with the distances between groups of protons that each correspond to a resonance:
12LeuHD1# 43ValHG1# 3.45A
12LeuHD2# 43ValHG1# 1.85A
12LeuHD1# 43ValHG2# 3.01A
12LeuHD2# 43ValHG2# 4.59A
These are the only `distances` that are interesting when evaluating NOEs.
Assume that the 2nd possibility is correct, so the model is in perfect
agreement with the NOE data.
Now, if you insist on capturing these distances in one number,
here are the numbers:
r^-1/6 sum : 1.826 - too short, but at least not violated.
r^-1/6 ave : 2.300 - (averaging over 4 distances) this distance violates
the restraint by 0.3 A
r^-1/6 ave : 3.318 - (averaging over 6*6 distances, close to your 3.362A)
violation even worse.
minimum : 1.85 - bingo!
So the minimum is the most useful figure in this case and the sum is only
slightly worse; neither the sum nor the minimum will ever be violated in a correct model.
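
The four figures can be reproduced from the four methyl-methyl distances alone. This sketch assumes that each of them is already an r^-1/6 sum over its 3*3 proton pairs, so their r^-6 contributions simply add up to the total over all 36 pairs. The variable names are illustrative.

```python
# The four methyl-methyl r^-1/6 sum distances from the example, in Angstrom.
group_distances = [3.45, 1.85, 3.01, 4.59]

total = sum(r ** -6 for r in group_distances)  # summed r^-6 contribution
r_sum = total ** (-1.0 / 6.0)                  # r^-1/6 sum
r_ave4 = (total / 4) ** (-1.0 / 6.0)           # average over the 4 group distances
r_ave36 = (total / 36) ** (-1.0 / 6.0)         # average over the 6*6 proton pairs
r_min = min(group_distances)                   # minimum over the 4 possibilities

for label, value in [("sum", r_sum), ("ave (4)", r_ave4),
                     ("ave (36)", r_ave36), ("minimum", r_min)]:
    print(f"{label:9s} {value:.3f}")
```

Only the sum and the minimum are guaranteed to be no longer than the shortest of the four possibilities.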
In order to produce the 4 numbers above,
the datamodel needs to differentiate between the following cases. I'm assuming
that it can already do that.
1. ambiguity because of equivalent protons (methyl, flipping aromatic ring protons)
there is not really an ambiguity here: all protons contribute equally. Treating it as
an ambiguous restraint is just a convenient way to evaluate a static model. -> use sum
2. prochiral ambiguity (exclusive OR type) -> use minimum
3. ambiguity because of overlap between non-equivalent protons
(could be a combination of 2 and 3) -> use sum
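
The case distinction above could be sketched as follows. The `Ambiguity` enum and the `model_distance` function are hypothetical names invented for this illustration; they are not taken from the CCPN data model or from Analysis. Equivalent protons are given the sum, as stated earlier for a resonance with equivalent protons.

```python
from enum import Enum


class Ambiguity(Enum):
    EQUIVALENT = 1  # case 1: equivalent protons (methyl, flipping ring)
    PROCHIRAL = 2   # case 2: mutually exclusive possibilities
    OVERLAP = 3     # case 3: overlap between non-equivalent protons


def noe_sum(distances):
    """r^-1/6 sum over all contributing distances."""
    return sum(r ** -6 for r in distances) ** (-1.0 / 6.0)


def model_distance(kind, distances):
    """Single figure to compare with the distance derived from the NOE."""
    if kind is Ambiguity.PROCHIRAL:
        return min(distances)  # only one possibility is real
    return noe_sum(distances)  # all contributions add up


# The four methyl-methyl distances from the example, as prochiral alternatives.
print(model_distance(Ambiguity.PROCHIRAL, [3.45, 1.85, 3.01, 4.59]))
```

For a restraint that combines several cases, the sum would be applied within each group of contributing protons first, and the minimum taken over the resulting alternatives.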
hope you made it up to this point....
kind regards,
Eiso