Hi
I have a copy of the PDB format definition that I acquired in the
1980s that defines the order of the atoms for each amino acid type.
Unfortunately this is on paper in my file cabinet at the lab and I'm
at home right now. The code I wrote based on this document orders
the atoms as Clemens shows in his first example: First the main
chain as N, CA, C, and O. Then the side chain with CB first,
then all the "G" atoms in numerical order, then the "D" atoms in
numerical order, and so on.
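
For illustration, the rule above can be sketched in Python roughly as
follows. This is not the original code, only a minimal sketch: the names
MAIN_CHAIN, REMOTENESS, atom_sort_key and order_residue_atoms are made up
here, it assumes heavy-atom names of standard amino acids (hydrogens are
not handled properly), and anything it does not recognise, such as OXT,
is simply left at the end in its original order.

```python
# Main-chain atoms in the order described above.
MAIN_CHAIN = ["N", "CA", "C", "O"]
# Remoteness letters: beta, gamma, delta, epsilon, zeta, eta.
REMOTENESS = "BGDEZH"


def atom_sort_key(name):
    """Return a sort key for a heavy-atom name of a standard amino acid."""
    name = name.strip()
    if name in MAIN_CHAIN:
        return (0, MAIN_CHAIN.index(name), 0)
    # Side chain: element letter, remoteness letter, optional branch digit.
    if len(name) >= 2 and name[1] in REMOTENESS:
        branch = int(name[2:]) if name[2:].isdigit() else 0
        return (1, REMOTENESS.index(name[1]), branch)
    # Unrecognised names (e.g. OXT) go last; sorted() is stable, so they
    # keep their original relative order.
    return (2, 0, 0)


def order_residue_atoms(names):
    """Order the atom names of one residue: main chain, then side chain."""
    return sorted(names, key=atom_sort_key)


valine = ["CG2", "O", "CB", "N", "CG1", "C", "CA"]
print(order_residue_atoms(valine))
```

For the valine in Clemens' example this gives N, CA, C, O, CB, CG1, CG2.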
If the PDB folks have removed this specification from their
documentation they probably thought hard about doing it and things
might have changed. However, every PDB file I've looked at that was
downloaded from the PDB obeys this ordering rule.
My recollection is that the document also defined the order of
the atoms in things like ATP and NAD but I wasn't crazy enough to
write code for them. While I have run into programs that insisted
that the atoms of amino acids be in the standard order, I've yet to see
a program that makes assumptions about the order of the atoms in an ATP.
Dale Tronrud
P.S. Clemens, you are thinking like a graph theorist when you want
to place CB right after CA. I think this ordering stuff goes way
back and the idea was that the main chain would always be the first
four atoms. This allows one to select the backbone of the peptide
w/o anything complicated like testing the equality of strings.
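
To make that concrete, here is a minimal Python sketch of selecting the
backbone purely by position. It assumes the atoms have already been
grouped per residue and that every residue really does start with its
four main-chain atoms (so no alternate conformations or missing atoms in
the main chain); the function name backbone_by_position and the sample
data are invented for the example.

```python
def backbone_by_position(residues):
    """Take the first four atoms of every residue as its main chain.

    `residues` is a list of residues, each a list of atoms in file order.
    No atom names are compared; only the position within the residue is used.
    """
    return [residue[:4] for residue in residues]


residues = [
    ["N", "CA", "C", "O", "CB", "CG1", "CG2"],  # a valine
    ["N", "CA", "C", "O"],                      # a glycine
]
print(backbone_by_position(residues))
```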
Clemens Vonrhein wrote:
> Hi Paul,
>
> it seems the atom order for the main chain of proteins is part of the
> standard: http://www.wwpdb.org/documentation/format23/sect9.html#ATOM
> states that
>
> ATOM records for proteins are listed from amino to carboxyl
> terminus.
>
> But what does that mean for the side-chain? I can't find anything in
> there ... apart from the shown example:
>
> ATOM 145 N VAL A 25 32.433 16.336 57.540 1.00 11.92
> ATOM 146 CA VAL A 25 31.132 16.439 58.160 1.00 11.85
> ATOM 147 C VAL A 25 30.447 15.105 58.363 1.00 12.34
> ATOM 148 O VAL A 25 29.520 15.059 59.174 1.00 15.65
> ATOM 149 CB AVAL A 25 30.385 17.437 57.230 0.28 13.88
> ATOM 150 CB BVAL A 25 30.166 17.399 57.373 0.72 15.41
> ATOM 151 CG1AVAL A 25 28.870 17.401 57.336 0.28 12.64
> ATOM 152 CG1BVAL A 25 30.805 18.788 57.449 0.72 15.11
> ATOM 153 CG2AVAL A 25 30.835 18.826 57.661 0.28 13.58
> ATOM 154 CG2BVAL A 25 29.909 16.996 55.922 0.72 13.25
>
> which seems to suggest for proteins: first main-chain atoms ordered
> from amino to carboxyl terminus, followed by side-chain. Which seems
> to contradict the specification above: if I walk along atoms in an
> amino acid from amino to carboxyl terminus, I encounter the side-chain
> when I hit the CA atom. So then I would be inclined to use
>
> ATOM 145 N VAL A 25 32.433 16.336 57.540 1.00 11.92
> ATOM 146 CA VAL A 25 31.132 16.439 58.160 1.00 11.85
> ATOM 147 CB AVAL A 25 30.385 17.437 57.230 0.28 13.88
> ATOM 148 CB BVAL A 25 30.166 17.399 57.373 0.72 15.41
> ATOM 149 CG1AVAL A 25 28.870 17.401 57.336 0.28 12.64
> ATOM 150 CG1BVAL A 25 30.805 18.788 57.449 0.72 15.11
> ATOM 151 CG2AVAL A 25 30.835 18.826 57.661 0.28 13.58
> ATOM 152 CG2BVAL A 25 29.909 16.996 55.922 0.72 13.25
> ATOM 153 C VAL A 25 30.447 15.105 58.363 1.00 12.34
> ATOM 154 O VAL A 25 29.520 15.059 59.174 1.00 15.65
>
> But neither of these two possibilities seems to be explicitly
> stated. Maybe coot could remember the ordering used on the input PDB
> file and stick with it?
>
> Cheers
>
> Clemens
>
> On Mon, Dec 10, 2007 at 04:25:28PM +0000, Paul Emsley wrote:
>> You are right to complain that Coot reorders your atoms - the atom
>> ordering is part of the PDB standard.
>