Hi Dave,
On Mon, Dec 22, 2008 at 02:58:31PM -0500, Borhani, David wrote:
> I think the LSQKAB change at Line 291(old)/Line 300(new) DOES introduce
> new and possibly incorrect logic.
Very possible, but ...
> I haven't looked at all the code, but this one change does seem to
> substitute a check that chain, residue number, and atom name (only 3
> characters; incorrect) match [OLD] for a check that chain, residue
> number, atom name (4 chars, correct), insertion code (correct, assuming
> that the insertion codes and residues numbers in the two proteins are
> lined up correctly), AND ALT CODE match [NEW].
I read that slightly differently:
OLD: check on the first three characters of the atom name
NEW: check on the first three characters of the atom name
AND
check on alternate conformation
AND
check on insertion code
There is no (new) check on chain or residue number - which is correct,
since the LSQKAB syntax allows one to specify different chain identifiers
and different residue numbers for the work and reference PDB files.
The new code makes sense to me (but please double check and correct me
if I'm wrong): without it you get a complete mess in the match-up
(since atom name, chain and residue number are simply not enough to
pick one and _only_ one atom).
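
To illustrate the difference between the two checks, here is a minimal Python sketch. It is not the LSQKAB Fortran code: the Atom record, its field names and the helper functions are hypothetical, and it assumes the residues have already been paired through the user's residue ranges. It only shows why the name-only test can accept more than one reference atom, while the name + AltConf + insertion-code test accepts at most one.

```python
from collections import namedtuple

# Hypothetical atom record; the field names are illustrative, not LSQKAB's.
Atom = namedtuple("Atom", "name altloc icode")

def old_key(atom):
    # OLD check as read above: first three characters of the atom name only.
    return atom.name[:3]

def new_key(atom):
    # NEW check: same name test plus alternate conformation and insertion code.
    return (atom.name[:3], atom.altloc, atom.icode)

def candidates(work_atom, ref_atoms, key):
    """Return every reference atom that the given key would accept."""
    return [r for r in ref_atoms if key(r) == key(work_atom)]

# One residue pair: the work CB has no AltConf, the reference CB has A and B.
work_cb = Atom(" CB ", "", "")
ref_residue = [Atom(" CA ", "", ""), Atom(" CB ", "A", ""), Atom(" CB ", "B", "")]

ambiguous = candidates(work_cb, ref_residue, old_key)
strict = candidates(work_cb, ref_residue, new_key)
print(len(ambiguous), len(strict))
```

With the old key the work CB has two possible partners; with the new key it has none, which corresponds to the "no match" message rather than a silent arbitrary choice.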
> The alt code match is, I suspect, a bug, in exactly the situation that
> Jose provided: one protein may have them, but the other may not (or may
> have different ones). One should perform the alignment such that the
> protein (residue) without alt codes aligns onto the other protein
> (residue) with the "A" alt code; to discard the residue pair simply
> because alt codes don't match is not correct.
I'm not sure about that: LSQKAB is intended to superimpose two sets
of atoms. For that the user needs to specify exactly (!) what atoms
belong in these two sets. Your suggestion of having LSQKAB pick
AltConf "A" instead of an atom without AltConf introduces new logic
into LSQKAB that wasn't there before. So I wouldn't classify that as a
bug, since it does the right thing: making sure that one and _only_
one atom will be picked (whereas before this wasn't guaranteed).
Please note that I don't say your suggestion doesn't make sense: I
like automatic superposition programs that make sensible structural
assumptions and decisions (some LSQMAN commands or SSM in Coot). Just
that LSQKAB isn't really intended that way: it does exactly what it
says on the tin.
As a comparison, I've run a test with four PDB files:
a) just 5 residues
b) same 5 residues, but one side-chain has two alternate
conformations (A and B)
c) same 5 residues, but one residue has insertion code instead of
residue number increment
d) same 5 residues, but now with the alternate conformation
side-chain and the insertion code residue
Running this against 4 LSQKAB binaries:
A) LSQKAB sources from 6.0.2, compiled against 6.0.2 libraries
B) LSQKAB sources from 6.0.2, compiled against 6.1.0 libraries
C) LSQKAB sources from 6.1.0, compiled against 6.0.2 libraries
D) LSQKAB sources from 6.1.0, compiled against 6.1.0 libraries
shows some interesting items (assuming that LSQKAB should match-up
only identical atoms, i.e. leaving your suggestion of automatic
decisions aside).
- the 6.0.2 LSQKAB source shows non-zero RMS values in a variety of
cases
This makes no sense, since for any pair of the above PDB files the
common atoms are identical.
- the 6.0.2 LSQKAB source gives different results when swapping the
two PDB files one superposes (i.e. superposing PDB1 onto PDB2 gives
a different result than superposing PDB2 onto PDB1)
Again, this doesn't make sense.
Both of these points are due to the missing checks introduced into
the latest version (which make sure that only identical atoms are
picked).
- the 6.1.0 LSQKAB source always gives rms values of zero and the
order of PDB files doesn't matter.
To me the 6.1.0 sources look correct ... ?
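
For reference, the layout of the test (4 binaries, each run on every WORKCD/REFRCD combination of the 4 PDB files) can be sketched as below. This is only a bookkeeping sketch in Python: it does not run LSQKAB, the variable and function names are made up, and the four rms values are copied from the table at the end of this message (lib 6.0.2 rows).

```python
from itertools import product

SOURCES = ["6.0.2", "6.1.0"]
LIBRARIES = ["6.0.2", "6.1.0"]
# The four PDB files a) to d), labelled as in the table.
VARIANTS = ["NoAlt NoIns", "NoAlt Ins", "Alt NoIns", "Alt Ins"]

binaries = list(product(SOURCES, LIBRARIES))   # (src, lib)
pairs = list(product(VARIANTS, VARIANTS))      # (WORKCD, REFRCD)
runs = list(product(binaries, pairs))

def order_matters(rms, a, b):
    """True if superposing a onto b and b onto a report different rms."""
    return rms[(a, b)] != rms[(b, a)]

# rms values taken from the table (lib 6.0.2), one swapped pair per source.
rms_602 = {("NoAlt NoIns", "Alt NoIns"): 0.000,
           ("Alt NoIns", "NoAlt NoIns"): 1.590}
rms_610 = {("NoAlt NoIns", "Alt NoIns"): 0.000,
           ("Alt NoIns", "NoAlt NoIns"): 0.000}

swapped_602 = order_matters(rms_602, "NoAlt NoIns", "Alt NoIns")
swapped_610 = order_matters(rms_610, "NoAlt NoIns", "Alt NoIns")
print(len(binaries), len(pairs), len(runs), swapped_602, swapped_610)
```

This gives 16 runs per binary and 64 in total, and flags the 6.0.2 source as order-dependent for that pair while the 6.1.0 source is not.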
Anyway, getting back to the original question (most people reading the
CCP4bb will be bored by now anyway):
> If I do the same superposition (with a pdb file that contains
> alternative conformations) with LSQKAB version 6.0 and 6.1:
> 1) Version 6.0 reports 110 atoms "to be refined" and does not report any
> error or warning. The loggraph contains data for the residues with
> alternative conformations.
> 2) Version 6.1 reports 97 atoms "to be refined", and it reports 13 atoms
> as "no match for workcd atom [...]". The loggraph does NOT contain data
> for the residues with alternative conformations.
>
> Based on that, I have assumed that version 6.0 does include atoms in
> alternative conformations (in fact, it seems to take into account each
> conformations independently).
I can understand that this looks like a regression in 6.1 (since 6.0
uses more atoms and shows residues with alternate conformations in the
loggraph). But I'm fairly certain that 6.0 did the wrong thing
nevertheless, i.e. the superposition will have been wrong and therefore
the rms values, rotation/translation etc. as well. If I use two PDB files
1) 5 residues: 37 atoms
2) 5 residues, one sidechain has alternate conformations "A"
(original position) and alternate conformation "B" (different
rotamer): 29 atoms + 2*8 atoms = 45 atoms
I would expect that superposing 1 onto 2 should match up only common
atoms (i.e. leaving out the side-chain of the alternate conformation
residue completely = 29 atoms) and therefore give an rms of 0.0. But
using CCP4 6.0.2:
1 -> 2 : rms = 0.000 for 37 atoms (by chance (?) it seems to pick
AltConf "A")
2 -> 1 : rms = 1.590 for 45 atoms
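
The atom counts above can be reproduced with a small counting sketch in Python. It is a toy model under stated assumptions, not the LSQKAB code: atoms are reduced to a (label, AltConf) pair with made-up labels, "loose" stands for a match on the atom label alone, and "strict" for a match that also requires identical AltConf. Only the number of atoms used is modelled, not the rms.

```python
# 29 atoms common to both files, plus the 8 atoms of the side-chain
# that carries alternate conformations in file 2 (labels are made up).
COMMON = [("common%02d" % i, "") for i in range(29)]
SIDE = ["sc%d" % i for i in range(8)]

file1 = COMMON + [(name, "") for name in SIDE]                     # 37 atoms
file2 = COMMON + [(name, alt) for alt in "AB" for name in SIDE]    # 45 atoms

def n_used(work, ref, strict):
    """Count work atoms that find at least one partner in ref."""
    if strict:
        ref_keys = set(ref)                      # label and AltConf must agree
        return sum(atom in ref_keys for atom in work)
    ref_names = {name for name, _ in ref}        # label only
    return sum(name in ref_names for name, _ in work)

loose_1_onto_2 = n_used(file1, file2, strict=False)
loose_2_onto_1 = n_used(file2, file1, strict=False)
strict_1_onto_2 = n_used(file1, file2, strict=True)
strict_2_onto_1 = n_used(file2, file1, strict=True)
print(loose_1_onto_2, loose_2_onto_1, strict_1_onto_2, strict_2_onto_1)
```

The loose count gives 37 and 45 atoms depending on the direction, as seen with CCP4 6.0.2, while the strict count gives the expected 29 common atoms in both directions.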
> > >
> > > Bye
> > > Jose
> > >
> > > On 12/22/2008 4:31 PM Tim Gruene wrote:
> > >> Not using lsqkab very often, this might be a stupid
> > question: How do
> > >> you know that version 6.0 _DOES_ include multiple
> > conformations? Maybe
> > >> it only does not report their omission?
> > >>
> > >> Tim
> > >>
> > >> --
> > >> Tim Gruene
> > >> Institut fuer anorganische Chemie
> > >> Tammannstr. 4
> > >> D-37077 Goettingen
> > >>
> > >> GPG Key ID = A46BEE1A
> > >>
> > >>
> > >> On Mon, 22 Dec 2008, Jose M de Pereda wrote:
> > >>
> > >>> Dear colleagues,
> > >>>
> > >>> While using LSQKAB I have encountered what it seems a different
> > >>> behavior between version 6.1 and 6.0.
> > >>>
> > >>> If I superpose two structures with LSQKAB version 6.1
> > (included in
> > >>> CCP4-6.1.0), residues with alternative conformations are
> > not included
> > >>> for the calculations. This is an example of the message
> > in the log file:
> > >>>
> > >>> - NO MATCH FOR WORKCD ATOM - 995CA A IN REFRCD FILE
> > >>> - NO MATCH FOR WORKCD ATOM - 995CA A IN REFRCD FILE
> > >>> - NO MATCH FOR WORKCD ATOM - 1009CA A IN REFRCD FILE
> > >>> - NO MATCH FOR WORKCD ATOM - 1009CA A IN REFRCD FILE
> > >>>
> > >>> The program completes the task normally, but it does not use the
> > >>> residues with alternative conformations.
> > >>>
> > >>> In contrast, LSQKAB version 6.0 (included in CCP4-6.0.2) uses the
> > >>> residues with alternative conformations.
> > >>>
> > >>> The documentation of LSQKAB does not have any reference about the
> > >>> treatment of residues with alternative conformations.
> > >>>
> > >>> This problem is not specific to a particular coordinates
> > file. For
> > >>> example, I can reproduce it using PDB entry 1QG3 and superposing
> > >>> residues 1127-1318 of molecule A onto the same range of
> > molecule B.
> > >>>
> > >>> I would appreciate if someone could enlighten me whether
> > this is a
> > >>> new FEATURE of ver 6.1 or a BUG; and how can this be
> > avoided (i.e.
> > >>> include residues with alternative conformations for calculations).
> > >>>
> > >>> Finally, I am running CCP4 6.1.0 in a Linux box with Suse
> > 10.2 (Linux
> > >>> 2.6.18.8-0.10-default i686).
> > >>>
> > >>> Happy holidays and happy New Year
> > >>> Cheers
> > >>>
> > >>> Jose
> > >>>
> > >>> --
> > >>> ------------------------------------------------------------
> > >>> Jose M de Pereda, PhD
> > >>> Instituto de Biologia Molecular y Celular del Cancer (IBMCC)
> > >>> Spanish National Research Council - University of Salamanca
> > >>> Campus Unamuno s/n
> > >>> E-37007 Salamanca, Spain
> > >>> Phone: +34-923-294819
> > >>> Fax: +34-923-294795
> > >>> http://xtal.cicancer.org/
> > >>> ------------------------------------------------------------
> > >>>
> > >>
> > >
> >
--
***************************************************************
* Clemens Vonrhein, Ph.D. vonrhein AT GlobalPhasing DOT com
*
* Global Phasing Ltd.
* Sheraton House, Castle Park
* Cambridge CB3 0AX, UK
*--------------------------------------------------------------
* BUSTER Development Group (http://www.globalphasing.com)
***************************************************************
Alt = side-chain (>=CB) of PHE in alternate conformations, i.e. 8 atoms
Ins = PRO with insertion code, i.e. 7 atoms

              WORKCD       REFRCD
src    lib    Alt   Ins    Alt   Ins    Nwrk Nuse Nref  rms
----------------------------------------------------------------
6.0.2  6.0.2  NoAlt NoIns  NoAlt NoIns    37   37   37  0.000 *
6.0.2  6.0.2  NoAlt NoIns  NoAlt Ins      37   30   37  0.000
6.0.2  6.0.2  NoAlt NoIns  Alt   NoIns    37   37   45  0.000
6.0.2  6.0.2  NoAlt NoIns  Alt   Ins      37   30   45  0.000
6.0.2  6.0.2  NoAlt Ins    NoAlt NoIns    37   35   37  1.203
6.0.2  6.0.2  NoAlt Ins    NoAlt Ins      37   37   37  1.187 *
6.0.2  6.0.2  NoAlt Ins    Alt   NoIns    37   35   45  1.203
6.0.2  6.0.2  NoAlt Ins    Alt   Ins      37   37   45  1.187
6.0.2  6.0.2  Alt   NoIns  NoAlt NoIns    45   45   37  1.590
6.0.2  6.0.2  Alt   NoIns  NoAlt Ins      45   38   37  1.714
6.0.2  6.0.2  Alt   NoIns  Alt   NoIns    45   45   45  1.590 *
6.0.2  6.0.2  Alt   NoIns  Alt   Ins      45   38   45  1.714
6.0.2  6.0.2  Alt   Ins    NoAlt NoIns    45   43   37  1.892
6.0.2  6.0.2  Alt   Ins    NoAlt Ins      45   45   37  1.858
6.0.2  6.0.2  Alt   Ins    Alt   NoIns    45   43   45  1.892
6.0.2  6.0.2  Alt   Ins    Alt   Ins      45   45   45  1.858 *

6.0.2  6.1.0  NoAlt NoIns  NoAlt NoIns    37   37   37  0.000 *
6.0.2  6.1.0  NoAlt NoIns  NoAlt Ins      37   30   37  0.000
6.0.2  6.1.0  NoAlt NoIns  Alt   NoIns    37   37   45  0.000
6.0.2  6.1.0  NoAlt NoIns  Alt   Ins      37   30   45  0.000
6.0.2  6.1.0  NoAlt Ins    NoAlt NoIns    37   35   37  1.203
6.0.2  6.1.0  NoAlt Ins    NoAlt Ins      37   37   37  1.187 *
6.0.2  6.1.0  NoAlt Ins    Alt   NoIns    37   35   45  1.203
6.0.2  6.1.0  NoAlt Ins    Alt   Ins      37   37   45  1.187
6.0.2  6.1.0  Alt   NoIns  NoAlt NoIns    45   45   37  1.590
6.0.2  6.1.0  Alt   NoIns  NoAlt Ins      45   38   37  1.714
6.0.2  6.1.0  Alt   NoIns  Alt   NoIns    45   45   45  1.590 *
6.0.2  6.1.0  Alt   NoIns  Alt   Ins      45   38   45  1.714
6.0.2  6.1.0  Alt   Ins    NoAlt NoIns    45   43   37  1.892
6.0.2  6.1.0  Alt   Ins    NoAlt Ins      45   45   37  1.858
6.0.2  6.1.0  Alt   Ins    Alt   NoIns    45   43   45  1.892
6.0.2  6.1.0  Alt   Ins    Alt   Ins      45   45   45  1.858 *

6.1.0  6.0.2  NoAlt NoIns  NoAlt NoIns    37   37   37  0.000 *
6.1.0  6.0.2  NoAlt NoIns  NoAlt Ins      37   30   37  0.000
6.1.0  6.0.2  NoAlt NoIns  Alt   NoIns    37   29   45  0.000
6.1.0  6.0.2  NoAlt NoIns  Alt   Ins      37   22   45  0.000
6.1.0  6.0.2  NoAlt Ins    NoAlt NoIns    37   30   37  0.000
6.1.0  6.0.2  NoAlt Ins    NoAlt Ins      37   37   37  0.000 *
6.1.0  6.0.2  NoAlt Ins    Alt   NoIns    37   22   45  0.000
6.1.0  6.0.2  NoAlt Ins    Alt   Ins      37   29   45  0.000
6.1.0  6.0.2  Alt   NoIns  NoAlt NoIns    45   29   37  0.000
6.1.0  6.0.2  Alt   NoIns  NoAlt Ins      45   22   37  0.000
6.1.0  6.0.2  Alt   NoIns  Alt   NoIns    45   45   45  0.000 *
6.1.0  6.0.2  Alt   NoIns  Alt   Ins      45   38   45  0.000
6.1.0  6.0.2  Alt   Ins    NoAlt NoIns    45   22   37  0.000
6.1.0  6.0.2  Alt   Ins    NoAlt Ins      45   29   37  0.000
6.1.0  6.0.2  Alt   Ins    Alt   NoIns    45   38   45  0.000
6.1.0  6.0.2  Alt   Ins    Alt   Ins      45   45   45  0.000 *

6.1.0  6.1.0  NoAlt NoIns  NoAlt NoIns    37   37   37  0.000 *
6.1.0  6.1.0  NoAlt NoIns  NoAlt Ins      37   30   37  0.000
6.1.0  6.1.0  NoAlt NoIns  Alt   NoIns    37   29   45  0.000
6.1.0  6.1.0  NoAlt NoIns  Alt   Ins      37   22   45  0.000
6.1.0  6.1.0  NoAlt Ins    NoAlt NoIns    37   30   37  0.000
6.1.0  6.1.0  NoAlt Ins    NoAlt Ins      37   37   37  0.000 *
6.1.0  6.1.0  NoAlt Ins    Alt   NoIns    37   22   45  0.000
6.1.0  6.1.0  NoAlt Ins    Alt   Ins      37   29   45  0.000
6.1.0  6.1.0  Alt   NoIns  NoAlt NoIns    45   29   37  0.000
6.1.0  6.1.0  Alt   NoIns  NoAlt Ins      45   22   37  0.000
6.1.0  6.1.0  Alt   NoIns  Alt   NoIns    45   45   45  0.000 *
6.1.0  6.1.0  Alt   NoIns  Alt   Ins      45   38   45  0.000
6.1.0  6.1.0  Alt   Ins    NoAlt NoIns    45   22   37  0.000
6.1.0  6.1.0  Alt   Ins    NoAlt Ins      45   29   37  0.000
6.1.0  6.1.0  Alt   Ins    Alt   NoIns    45   38   45  0.000
6.1.0  6.1.0  Alt   Ins    Alt   Ins      45   45   45  0.000 *