Thank you for that, Rachel.

Even though the tone of your comment does not suggest that you want to carry on a dialogue about this, I thought I would reply in any case - since dialogue is what this forum is supposed to be about.

Thing is, I was sort of looking for an explanation of why the rule was adopted that waters were to be renumbered from N to C terminus. If this does not serve to put the waters in register across a set of related structures, then it seems somewhat arbitrary, and other schemes might be suggested as better for the "usability and interpretation of the structural data".

In some cases there are only a few waters, but in many structures the wwPDB partners renumber hundreds. This process makes it difficult for authors to check the final deposited structure against the output of their refinement.

I have to say that I agree with other contributors to this thread. It would be much better to let the refinement program authors agree on a default water numbering scheme and then maintain that through deposition.

I thought of six possible schemes before breakfast... one of my favourites was to order by B-factor - which might appeal to crystallographers. Another was to give priority to those in the coordination sphere of any metal ions - these actually get priority in the PDB as they are included in the LINK records above the coordinates. These coordinated waters are often refined together with the metals and so it would make sense to move them closer to their friendly ion.
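To make those two schemes concrete, here is a minimal sketch that puts metal-coordinated waters first and then orders the rest by ascending B-factor. The Water class, the metal_bound flag and the example values are all hypothetical and purely illustrative; this is not how any refinement program or the wwPDB actually numbers waters.

```python
from dataclasses import dataclass


@dataclass
class Water:
    chain: str
    resnum: int
    b_factor: float
    # Hypothetical flag: True if the water sits in a metal coordination sphere.
    metal_bound: bool = False


def renumber_waters(waters, start=1):
    """Return (old_resnum, new_resnum) pairs.

    Metal-bound waters come first, then the rest by ascending B-factor.
    """
    ordered = sorted(waters, key=lambda w: (not w.metal_bound, w.b_factor))
    return [(w.resnum, start + i) for i, w in enumerate(ordered)]


# Made-up example values, for illustration only.
waters = [
    Water("A", 301, 35.2),
    Water("A", 302, 12.1),
    Water("A", 303, 28.4, metal_bound=True),
]
mapping = renumber_waters(waters, start=301)
```

With these made-up values the metal-bound water 303 takes the first number, followed by 302 and then 301.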

And of course one other clearly suitable option would be to leave the waters in the authors' preferred order - chosen with help from their refinement suite. This is what happens during deposition with the residues of the polymers (provided the authors' chain IDs are suitably chosen). Following your link, the rule for polymers is that: 'If the coordinate residue numbers, as provided by the author, are unique and sequential within a particular chain ID, the residues will not be renumbered.'

I'm presuming that if the authors have a preferred suitable set of water numbers then that would be maintained similarly?

Perhaps that is what is happening in the cases I have noticed that do not follow wwPDB rules?

On Friday I was looking at TIRAP structures and in 3ub2 the protein construct starts at residue 78 and its final residue is 221 - but the associated DTT is labelled back at residue 1 in the same chain. Then the first ten out of eleven waters are residues 2 to 11 but then oddly the eleventh water is residue 222. Is there a difference in this C-terminal water compared with the N-terminal ones? I imagined it was perhaps maintained to fit in with the associated publication - or maybe started out life modelled as a metal ion - unfortunately I can find no mention of it in the paper. 

But, regardless of this distracting feature, surely this entry does not conform to the expected numbering scheme you mentioned as the wwPDB standard: polymer -> heterogen -> water?
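The kind of check I have in mind can be sketched as below. It assumes the residues of one chain have already been classified as polymer, heterogen or water (that classification step is not shown), and the 3ub2-like list is built only from the numbers quoted in this message, not parsed from the actual entry. The function name and the second example list are hypothetical.

```python
def follows_expected_order(residues):
    """Check the polymer -> heterogen -> water numbering order for one chain.

    residues: iterable of (kind, resnum), kind being "polymer",
    "heterogen" or "water". Returns True if, read in order of residue
    number, polymer residues come first, then heterogens, then waters.
    """
    rank = {"polymer": 0, "heterogen": 1, "water": 2}
    ordered = sorted(residues, key=lambda r: r[1])
    ranks = [rank[kind] for kind, _ in ordered]
    return ranks == sorted(ranks)


# Numbering as described above for 3ub2: protein 78-221, DTT at 1,
# waters at 2-11 and one more water at 222.
chain_3ub2_like = (
    [("polymer", n) for n in range(78, 222)]
    + [("heterogen", 1)]
    + [("water", n) for n in range(2, 12)]
    + [("water", 222)]
)

# A hypothetical chain that does follow the stated order.
conforming = (
    [("polymer", n) for n in range(78, 222)]
    + [("heterogen", 222)]
    + [("water", n) for n in range(223, 234)]
)

print(follows_expected_order(chain_3ub2_like))  # False
print(follows_expected_order(conforming))       # True
```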

Yours perplexedly
 Martyn 


From: Rachel Kramer Green
Sent: Friday, 1 November 2013, 20:18
Subject: Re: [ccp4bb] Comparison of Water Positions across PDBs

In PDB format files, each polymer is assigned a unique chain ID. Chain IDs for all bound moieties and waters are assigned based on their proximity (number of contacts) to the nearest polymer. Once the polymers and non-polymer residues associated with them are assigned chain IDs, they are also assigned unique residue numbering with the order polymer residues, ligands and then waters.
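As a rough illustration of assignment by number of contacts, the sketch below counts polymer atoms within a distance cutoff of a water and picks the chain with the most. The 4.0 Å cutoff, the tie-break (alphabetically first chain) and the function name are assumptions made for this example; the actual wwPDB criteria are not stated here.

```python
import math
from collections import Counter


def assign_chain(water_xyz, polymer_atoms, cutoff=4.0):
    """Pick the chain with the most atoms within `cutoff` of the water.

    polymer_atoms: list of (chain_id, (x, y, z)).
    The cutoff value is an assumption, not a documented wwPDB parameter.
    Returns None if no polymer atom is within the cutoff.
    """
    contacts = Counter(
        chain
        for chain, xyz in polymer_atoms
        if math.dist(water_xyz, xyz) <= cutoff
    )
    if not contacts:
        return None
    # Ties go to the alphabetically first chain ID.
    return max(sorted(contacts), key=contacts.__getitem__)


# Made-up coordinates: two atoms of chain A and one of chain B.
atoms = [("A", (0.0, 0.0, 0.0)), ("A", (1.5, 0.0, 0.0)), ("B", (6.0, 0.0, 0.0))]
print(assign_chain((3.0, 0.0, 0.0), atoms))  # A
```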

Please see: http://www.wwpdb.org/procedure.html#toc_4

The wwPDB has established this rule to improve the usability and interpretation of the structural data. Assigning the same chain ID for all moieties associated with a polymer enables rapid and uniform identification of feature analysis.

Sincerely,
Rachel Green

Rachel Kramer Green, Ph.D.
RCSB PDB
 
 
 
On 10/30/2013 8:09 AM, Eugene Krissinel wrote:
This is to be answered by PDB people, who definitely read BB :)

Would be nice to have a tool common between CCP4/Phenix and the PDB which sorts this out

Eugene

On 30 Oct 2013, at 12:09, Andreas Förster wrote:

Dear all,

this water discussion is flowing increasingly towards a place where I feel a bit out of my depth.

What is the convention for numbering water molecules?  Is there preference for:

- putting waters into a separate chain (W for water or S for solvent)?
- splitting waters according to the peptide chains in the structure?
- appending all waters to chain A?


Thanks.


Andreas




On 30/10/2013 11:57, MARTYN SYMMONS wrote:
At deposition the PDB runs a script that renumbers authors' waters
according to a scheme based on the residue they are nearest from N to C
terminus along each chain. This renumbering started when waters were
assigned to macromolecular chains rather than getting a chain id of
their own. I have failed to find the rationale explained in any PDB
documents - but it could be motivated by this sort of consideration when
waters from different chains or entries are to be compared. Having said
that I do not know if there are any cases where this approach has
successfully matched waters.
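A minimal sketch of that kind of renumbering, under simplifying assumptions: each residue is represented by a single point (say its CA position), each water is matched to its nearest residue, and waters are then numbered in N to C order of those residues. The function name and the coordinates are made up for illustration; the actual PDB script is not public as far as I know and presumably works on full atom contacts.

```python
import math


def renumber_by_nearest_residue(waters, residues, first_number):
    """Number waters in N->C order of the residue each one is nearest to.

    waters:   {old_number: (x, y, z)}
    residues: list of (resnum, (x, y, z)), one representative point per
              residue, listed from N to C terminus.
    Returns {old_number: new_number}.
    """
    def sort_key(item):
        old, xyz = item
        dist, idx = min(
            (math.dist(xyz, rxyz), i) for i, (_, rxyz) in enumerate(residues)
        )
        # Order by position along the chain, then by distance, then old number.
        return (idx, dist, old)

    ordered = sorted(waters.items(), key=sort_key)
    return {old: first_number + i for i, (old, _) in enumerate(ordered)}


# Made-up coordinates: three residues along x, one water above each.
residues = [(78, (0.0, 0.0, 0.0)), (79, (3.8, 0.0, 0.0)), (80, (7.6, 0.0, 0.0))]
waters = {501: (7.6, 3.0, 0.0), 502: (0.0, 3.0, 0.0), 503: (3.8, 3.0, 0.0)}
new_numbers = renumber_by_nearest_residue(waters, residues, first_number=301)
```

Here water 502 sits nearest residue 78 and so gets the first number, followed by 503 and 501.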

However an associated step which is certainly a help is that, in the
case of multiple chains, the crystal symmetry is applied to replace
waters with their symmetry equivalent position if it is closer to a
different chain.

I believe a freely available program implementing a similar approach is
WATERTIDY in CCP4 which might be a good place to start.  It gives a
pretty complete output, detailing residues actually H-bonded to the
waters, and you could parse that for further analysis and comparisons.

Best wishes.
  Martyn
-- 
                 Andreas Förster
    Crystallization and Xray Facility Manager
          Centre for Structural Biology
             Imperial College London