Thanks, Dale and others. In trying to answer the question I discovered that I did not know the answer.
But I should correct some of my mis-statements, which I would have caught if I had taken a minute to check Wikipedia instead of going on 3-decade-old memories:
- The full-matrix optimization being described is called the Gauss-Newton algorithm (GNA), or more generally a "Newton-type method" (see the sketch below). Newton-Raphson is something else. (One wonders what kind of computer Newton used to do his iterations!)
- Further off-topic: in "Marquardt-Meiron", Marquardt's is the real contribution; Meiron proposed one of several algorithms for choosing the balance between gradient search and the Newton-type method at any stage in refinement. And it seems Levenberg independently, and earlier, described the Marquardt method or something like it, so on Wikipedia it is now the Levenberg-Marquardt algorithm (LMA).
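
(For anyone who likes to see it in code, here is a minimal numpy sketch
of my own - not from any refinement program - of one Gauss-Newton step
for a least-squares target sum_i r_i(p)^2. The point of GNA is that it
builds its "Hessian" from first derivatives of the residuals only:)

import numpy as np

def gauss_newton_shift(J, r):
    """One Gauss-Newton parameter shift.
    J[i, k] = d r_i / d p_k and r[i] = r_i at the current parameters.
    A full Newton method would add sum_i r_i * (2nd derivatives of r_i)
    to H; GNA drops that term."""
    g = J.T @ r                    # half the gradient of sum(r**2)
    H = J.T @ J                    # GNA approximation to half the Hessian
    return np.linalg.solve(H, -g)  # solve for the shift; don't invert
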
eab
On 11/10/2015 12:42 AM, Dale Tronrud wrote:
> On 11/9/2015 7:53 PM, Edward A. Berry wrote:
>>
>>
>> On 11/09/2015 09:46 PM, Keller, Jacob wrote:
>>> Derivatives of what?
>>>
>>> JPK
>>
>>
>> Derivatives of the target function with respect to the parameters.
>>
>> The minimum (or maximum) of the target function is where all the first
>> derivatives are zero. So finding that minimum is solving for the point
>> where first derivatives are zero, based on the derivatives at the
>> current point in parameter space. This is a nonlinear problem and not
>> directly solvable. However, if you take the approximation that in a
>> small enough region around the current point the first derivatives are
>> linear functions of the parameters, the dependencies being the
>> derivatives of each first derivative wrt each parameter, then this
>> simpler problem is solvable by matrix inversion. The derivatives of
>> each first derivative wrt each parameter are just the 2nd derivatives
>> (including mixed), so you are inverting the matrix of 2nd derivatives
>> of the target function wrt each pair of parameters. Initially you are
>> far from the optimum and the answer is wrong, but hopefully closer, so
>> you iterate. Sometimes it is better to start with a gradient search
>> when you are far from the optimum.
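>>
>> (As a toy numpy sketch - my illustration, not any program's actual
>> code - one such iteration, given the gradient g and the matrix H of
>> second derivatives at the current parameters p:)
>>
>> import numpy as np
>>
>> def newton_type_step(p, g, H):
>>     """Move to where the linearized first derivatives vanish:
>>     solve H @ dp = -g for the shift dp."""
>>     return p + np.linalg.solve(H, -g)
>>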
>> Marquardt-Meiron is an algorithm which starts out as nearly a
>> gradient search and switches toward the (newton-raphson?) approach
>> described above as the search converges. Once you are very near the
>> optimum, the linear approximation gets better and better and convergence
>> is supposedly rapid.
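>>
>> (Schematically, in the same toy numpy style - the names and the factor
>> of 10 are my own choices, not Marquardt's exact recipe:)
>>
>> import numpy as np
>>
>> def marquardt_refine(p, f, grad, hess, lam=1e-3, n_iter=50):
>>     """Blend gradient search and the Newton-type step by damping
>>     the matrix of second derivatives with a factor lam."""
>>     for _ in range(n_iter):
>>         dp = np.linalg.solve(hess(p) + lam * np.eye(len(p)), -grad(p))
>>         if f(p + dp) < f(p):
>>             p, lam = p + dp, lam / 10.0  # step helped: more Newton-like
>>         else:
>>             lam *= 10.0                  # step hurt: more gradient-like
>>     return p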
>>
>> And now the point being discussed here involves estimating the
>> uncertainty of the values of the optimized parameters, given the
>> uncertainty in the data (Fobs), translated into uncertainty in the
>> target function? I am even more out of my depth here, so I had better
>> stop. But in general, error propagation depends on derivatives wrt
>> whatever the error is propagating from, and you can imagine that mixed
>> second derivatives would affect the ability of parameters to
>> compensate for each other?
>
> The next step is to switch from a least-squares formalism to one
> based on probabilities. If you assume that you are trying to find the
> maximum of a probability distribution that is a function of the
> parameters of your model, you need to calculate the first and second
> derivatives of that distribution with respect to those parameters.
> Assume your distribution is a Normal distribution and take the
> derivatives yourself! You will see how the covariance matrix and the
> second derivatives are related.
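>
> (Or let the computer take them: a toy numpy check of my own, with an
> invented 2x2 covariance. Build -log P for a Normal, difference it
> numerically at the maximum, and invert:)
>
> import numpy as np
>
> cov = np.array([[2.0, 0.8],
>                 [0.8, 1.0]])          # invented covariance matrix
> P_inv = np.linalg.inv(cov)
>
> def neglogp(x):
>     """-log of an (unnormalized) Normal centred at the origin."""
>     return 0.5 * x @ P_inv @ x
>
> # Central-difference second derivatives of -log P at the maximum:
> h, H = 1e-4, np.zeros((2, 2))
> for i in range(2):
>     for j in range(2):
>         ei, ej = np.eye(2)[i] * h, np.eye(2)[j] * h
>         H[i, j] = (neglogp(ei + ej) - neglogp(ei - ej)
>                    - neglogp(-ei + ej) + neglogp(-ei - ej)) / (4 * h * h)
>
> print(np.linalg.inv(H))   # prints cov again: covariance = inverse Hessian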
>
> Basically, if you have a narrow, sharp distribution (large curvature)
> you have a very precise estimate for that parameter. If you have a
> broad, flat distribution (small curvature) you don't have much certainty
> about that parameter and your sigma is large. The covariances also fall
> out and they are often more important, especially when it comes to the
> occupancy and B factor of any particular atom.
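>
> (In numbers - invented ones, just to show the arithmetic: the sigma of
> parameter i is sqrt(C[i, i]), and the off-diagonal terms tell you how
> much two parameters, say an occupancy and its B factor, can trade off:)
>
> import numpy as np
>
> C = np.array([[0.010, 0.280],       # invented covariance block for one
>               [0.280, 9.000]])      # atom's (occupancy, B factor)
> sig = np.sqrt(np.diag(C))           # sigmas: 0.1 and 3.0
> corr = C[0, 1] / (sig[0] * sig[1])  # ~0.93: the two estimates are highly
> print(sig, corr)                    # correlated, so each can compensate
>                                     # for errors in the other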
>
> The equation only works when your parameters are smack-dab at the
> optimum so you had better refine your model to convergence if you want
> reliable sigmas. This is why you have to run full-matrix refinement
> before you invert the matrix to get the sigmas - nothing else I know of
> will get you close enough.
>
> "Data Analysis - A Bayesian Tutorial" by Sivia has some great stuff
> on this topic.
>
> Dale Tronrud
>>
>>>
>>> *From:* Ethan Merritt [mailto:[log in to unmask]]
>>> *Sent:* Monday, November 09, 2015 9:36 PM
>>> *To:* Keller, Jacob
>>> *Cc:* [log in to unmask]
>>> *Subject:* Re: [ccp4bb] precision of bond lengths and angles
>>>
>>> On Tuesday, 10 November 2015 02:31:35 AM Keller, Jacob wrote:
>>>
>>>> I have heard this subject bandied about over the years, and it sounds
>>>> interesting. I wonder whether someone can point to the original
>>>> source for the procedure? Particularly I am curious what matrix is
>>>> being inverted and what that means. What are the numbers in the matrix?
>>>
>>> The Hessian matrix of partial derivatives
>>>
>>> https://en.wikipedia.org/wiki/Hessian_matrix
>>>
>>> Ethan
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Jacob Keller
>>>>
>>>> Ps I did do a quick Google, but found nothing fundamental - only
>>>> applications.
>>>>
>>>> From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of
>>>> George Sheldrick
>>>> Sent: Monday, November 09, 2015 6:19 PM
>>>> To: [log in to unmask] <mailto:[log in to unmask]>
>>>> Subject: Re: [ccp4bb] precision of bond lengths and angles
>>>>
>>>> Dear Dale,
>>>>
>>>> There are a number of strategies for getting (almost) full-matrix
>>>> esds without causing the program to blow up; for example, you can
>>>> selectively remove the geometric restraints on the parts of the
>>>> structure that you are interested in and that are relatively
>>>> well-behaved, or you can block out the ADP refinement etc. Of course
>>>> there may be flexible regions for which it is not possible to get
>>>> meaningful esds, but they can still be included in the refinement
>>>> with suitable restraints. An adequate resolution (i.e. data to
>>>> parameter ratio) is of course essential.
>>>>
>>>> Best wishes, George
>>>>
>>>> On 11/09/2015 10:23 PM, Dale Tronrud wrote:
>>>>
>>>> The key issue is inverting the matrix. The Normal matrix is almost
>>>> always singular unless you have huge amounts of x-ray data and a
>>>> well-ordered molecule (which often go hand in hand). My understanding
>>>> is that George prefers to calculate esds from a matrix calculated
>>>> with no influence from the restraints, so the problem is even worse.
>>>> In Shelxl you can turn on the restraints to stabilize the inversion,
>>>> but you have to realize that the uncertainty in a bond length will
>>>> then be determined almost entirely by the restraint and not the X-ray
>>>> data, and for those uncertainties it would certainly be quicker to
>>>> simply look in your restraint file to find the "sigma". Even then, at
>>>> lower resolutions (and George just posted that this limit is
>>>> somewhere around 1.6 A), the matrix will be singular and you will
>>>> have to bolster it with more information.
>>>>
>>>> Remember those atoms being discussed in another thread today? The
>>>> ones being built into little to no density because we know they have
>>>> to be somewhere in that area? They will certainly blow up your
>>>> inversion. If you can move a parameter about and it makes no change
>>>> to the fit to any of your data (x-ray or stereochemical), then your
>>>> first derivative is constant at zero, your second derivative is zero,
>>>> and your inversion is toast. More interesting, but harder to
>>>> identify, are the cases where you can move a subset of parameters in
>>>> a coordinated fashion without changing the fit. This also creates a
>>>> singularity.
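>>>>
>>>> (You can watch this happen in a toy numpy example of my own: give
>>>> one parameter no influence on any observation and the Normal matrix
>>>> develops a zero eigenvalue, so no inverse exists:)
>>>>
>>>> import numpy as np
>>>>
>>>> J = np.array([[1.0, 0.5, 0.0],    # derivatives of two residuals wrt
>>>>               [0.3, 2.0, 0.0]])   # three parameters; the third moves
>>>>                                   # nothing
>>>> H = J.T @ J                       # Normal matrix
>>>> print(np.linalg.eigvalsh(H))      # one eigenvalue is zero
>>>> print(np.linalg.det(H))           # determinant 0: inversion is toast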
>>>>
>>>> An alternative to finding more data (or trimming the invisible atoms
>>>> from your model) is to filter out the singularities during the
>>>> inversion. This can be done with eigenvalue filtering or singular
>>>> value decomposition, but these methods are very time-consuming for
>>>> matrices as large as these, and I'm not aware that anyone has
>>>> implemented them in macromolecular refinement.
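>>>>
>>>> (The filtering idea itself is simple; here is a toy numpy sketch of
>>>> my own - np.linalg.pinv does essentially the same thing via SVD:)
>>>>
>>>> import numpy as np
>>>>
>>>> def filtered_inverse(H, rel_cutoff=1e-8):
>>>>     """Invert a symmetric Normal matrix H, discarding the flat,
>>>>     near-singular directions (tiny eigenvalues)."""
>>>>     w, V = np.linalg.eigh(H)           # eigenvalues / eigenvectors
>>>>     keep = w > rel_cutoff * w.max()    # filter out the singularities
>>>>     return (V[:, keep] / w[keep]) @ V[:, keep].T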
>>>>
>>>> Dale Tronrud
>>>>
>>>> On 11/9/2015 9:59 AM, Bernhard Rupp (Hofkristallrat a.D.) wrote:
>>>>
>>>> inverting the full matrix of second derivatives, and I believe that
>>>> only Shelxl can do that and then only in the presence of very high
>>>> resolution x-ray data.
>>>>
>>>> This I do not completely understand: As long as I can set up a LSQ
>>>> design (normal equation) matrix, I can make a second derivative
>>>> matrix and invert it. So in principle, as long as I am determined, I
>>>> can do this. Whether it makes sense given only a limited number of
>>>> data points to refine against is a different question, and the
>>>> O(n^p) inversion cost, with p > 2.4 at best, adds a computational
>>>> problem for large n as in macromolecules.
>>>>
>>>> So where, reasonably, would the data/parameter ratio (or resolution)
>>>> be for an inversion to make sense as far as the accuracy of the
>>>> (co)variances goes, irrespective of the time requirements for
>>>> inversion?
>>>>
>>>> Thx, BR
>>>>
>>>> Dale Tronrud
>>>>
>>>> On 11/9/2015 2:15 AM, Zhenyao Luo wrote:
>>>>
>>>> Dear CCP4 community,
>>>>
>>>> I would like to ask a question regarding determining the precision
>>>> of bond lengths and angles in a protein crystal structure.
>>>>
>>>> I am currently analysing the bond lengths and angles of the
>>>> Zn-ligand coordination bonds in some Zn-binding protein crystal
>>>> structures. For the bond length/angle values, I intended to keep two
>>>> decimal places for the bond lengths and one decimal place for the
>>>> bond angles. But is it really justified to quote the values to that
>>>> precision? Also, how do you usually determine the appropriate
>>>> precision (how many decimal places to keep) when reporting bond
>>>> length/angle values at a given resolution?
>>>>
>>>> Any advice would be greatly appreciated. Thanks very much in advance.
>>>>
>>>> Best wishes, Zhen
>>>>
>>>> --
>>>> Prof. George M. Sheldrick FRS
>>>> Dept. Structural Chemistry,
>>>> University of Goettingen,
>>>> Tammannstr. 4,
>>>> D37077 Goettingen, Germany
>>>> Tel. +49-551-39-33021 or -33068
>>>> Fax. +49-551-39-22582
>>>
>>> --
>>> mail: Biomolecular Structure Center, K-428 Health Sciences Bldg
>>> MS 357742, University of Washington, Seattle 98195-7742
>>>