Dear Martyn,

Thank you for your additional comments. One of the future remediation projects will be to address carbohydrate-containing entries.

Best wishes,
Rachel Green

Rachel Kramer Green, Ph.D.

RCSB PDB

[log in to unmask]

 

 

Twitter: https://twitter.com/#!/buildmodels

Facebook: http://www.facebook.com/RCSBPDB

 

On 11/17/2013 3:39 PM, MARTYN SYMMONS wrote:
Hmm
So the backstory for the problematic ligand R12 in this thread is that a sharp-eyed worker at the wwPDB recently spotted that there was an error compared with the original 1999 paper. Correcting the R12 ligand is a friendly gesture from the PDB as it appears that the error must have been the authors' - the atom that corrects the R12 ligand has been inserted by the PDB staff rather than retrieved from a deposited structure.
 
It is a shame the same helpful approach is not always applied. One current example I spotted is a separate 'problematic ligand' 5AX which has been added by the wwPDB to at least four other authors' entries, starting in 2006 with the latest in 2009.
 
5AX is basically a fragment ligand which the PDB software produces if a NAG has wandered too far from its Asn sidechain during refinement. If 5AX is generated during the PDB processing of a deposition, then it really should be highlighted for the authors as a geometric issue - rather than, as in these cases, being simply added to the coordinates.
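
To illustrate the kind of geometric check meant here, a minimal Python sketch that tests whether a NAG C1 atom is still within bonding distance of the Asn ND2 atom. This is not the PDB's processing software: the cutoff of 2.0 Å, the function name and the coordinates are all assumptions made up for illustration (an N-glycosidic C1-ND2 bond is roughly 1.45 Å long).

```python
import math

# Hypothetical cutoff in angstroms; chosen for illustration only,
# not taken from any wwPDB processing rule.
MAX_LINK_DISTANCE = 2.0


def nag_is_linked(asn_nd2, nag_c1, cutoff=MAX_LINK_DISTANCE):
    """Return True if the NAG C1 atom lies within `cutoff` of the Asn ND2 atom."""
    return math.dist(asn_nd2, nag_c1) <= cutoff


# Made-up coordinates: one plausible linkage, one sugar that has drifted away.
print(nag_is_linked((0.0, 0.0, 0.0), (1.45, 0.0, 0.0)))
print(nag_is_linked((0.0, 0.0, 0.0), (3.0, 0.0, 0.0)))
```

A check like this, run at deposition, would give the authors a chance to fix the geometry before the sugar is renamed.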
 
Reading the authors' papers for the 5AX-containing entries makes it clear that they never expected anything other than NAG to appear in their deposited coordinates.
 
And given its artifactual production during deposition, 5AX should never have 'escaped into the wild'.
 
So if a retrospective fix can be applied to R12 (which is similar in lacking an atom) then it seems to me that, in fairness, a clean-up of the 5AX entries should be arranged.
 
Yours (not holding his breath),
Martyn
 
 

 
From: Rachel Kramer Green <[log in to unmask]>
To: [log in to unmask]
Sent: Wednesday, 6 November 2013, 16:49
Subject: Re: [ccp4bb] Problematic PDBs

Dear Martyn,

wwPDB staff regularly reviews and remediates PDB data and related dictionaries such as the Chemical Component Dictionary (CCD).

As part of our on-going remediation efforts, the chemical components in the archive are regularly reviewed to ensure the correctness and the completeness of the chemical representation. Such reviews show that in some cases, the author has failed to provide a complete description of the chemistry. To address any such errors, the definitions are corrected. The chemical name and formula are changed in the PDB file, but the coordinates are not changed.

In the case of entry 3CBS, issues were found with the chemical component definition for its ligand R12. The methyl group was not in the deposited coordinates and it was missing from the original definition. In addition, the bond order in one of the carbon-carbon bonds was incorrectly defined. The CCD definition for R12 was updated in 2011 to add the methyl group and to correct the bond order based on information in the primary citation. The coordinates for this PDB entry were not changed. Therefore, in accordance with wwPDB policy, the file was not obsoleted.

Sincerely,
Rachel Green

Rachel Kramer Green, Ph.D.
RCSB PDB
 
 
 
On 10/21/2013 6:28 AM, MARTYN SYMMONS wrote:
As a postscript, it might be worth mentioning one problematic ligand that suggested to me a way to correct some of the errors mentioned in this thread.
 
R12 is indicated as 9-(4-HYDROXY-2,6-DIMETHYL-PHENYL)-3.... in the most recent Coot monomer library. But in the PDB ligand description it is 9-(4-hydroxy-2,3,6-trimethylphenyl)-3,7-dimethylnona-2,4,6,8-tetraenoic acid with an additional carbon C16. To make a long story short, this ligand was originally deposited missing this extra methyl group in 1999 (as part of 3CBS) and then apparently updated in 2011 by the PDB.

The relevant lines in the cif are
<<snip>>
R12 C16 C16 C 0 1 N N N ?      ?      ?      -6.631 1.502  0.990  C16 R12 44 
R12 H1  H1  H 0 1 N N N ?      ?      ?      -6.602 1.511  2.080  H1  R12 45 
R12 H23 H23 H 0 1 N N N ?      ?      ?      -6.422 2.503  0.613  H23 R12 46 
R12 H24 H24 H 0 1 N N N ?      ?      ?      -7.619 1.186  0.656  H24 R12 47 
<<snip>> 

with the ? ? ? indicating that refined coordinates were not available at the time of the update. There was initially an explanation line at the end of the cif:

<<snip>>
R12 "Other modification" 2011-10-25 RCSB CS 'add missing methyl group, re-define bond order based on publication'
<<snip>>

But this has mutated for some reason (premature stop codon?) over the past year to the following.

<<snip>>  
R12 "Other modification" 2011-10-25 RCSB 
<<snip>>
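
For anyone who wants to look for such atoms themselves, a minimal Python sketch that flags atoms whose model coordinates are '?' in rows like the four quoted above. It assumes the usual column order of the chem_comp_atom loop (model x, y, z in columns 10-12, ideal coordinates in columns 13-15, 18 columns in total); the function name is made up for illustration.

```python
# The four atom rows quoted earlier in this message.
ROWS = """\
R12 C16 C16 C 0 1 N N N ?      ?      ?      -6.631 1.502  0.990  C16 R12 44
R12 H1  H1  H 0 1 N N N ?      ?      ?      -6.602 1.511  2.080  H1  R12 45
R12 H23 H23 H 0 1 N N N ?      ?      ?      -6.422 2.503  0.613  H23 R12 46
R12 H24 H24 H 0 1 N N N ?      ?      ?      -7.619 1.186  0.656  H24 R12 47
"""


def atoms_without_model_coords(rows):
    """Return the atom names whose three model coordinates are all '?'."""
    missing = []
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) != 18:  # assumed column count; skip anything else
            continue
        atom_id = fields[1]
        model_xyz = fields[9:12]  # assumed position of model x, y, z
        if all(value == "?" for value in model_xyz):
            missing.append(atom_id)
    return missing


print(atoms_without_model_coords(ROWS))  # ['C16', 'H1', 'H23', 'H24']
```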

Obviously the full correct ligand could not have been incorporated into the PDB entry coordinates without these undergoing a full obsolete - supersede process (somewhat embarrassing perhaps as one author is now a wwPDB PI ;)

But it is frustrating for users of the PDB that in such cases easily correctable errors are not actually updated by the authors. Would it not be helpful if there were a mechanism to make and track useful improvements in deposited structures? - Perhaps suggested by members of the community to the authors. 

These changes could be considered as 'corrigenda' and could be documented and tracked - complete with an explanation of the reasoning behind the change and attributing the motivation and origin of the improvement.

This would be a good way for the wider scientific community (who maybe do not read this bulletin board) to access the best current model without the authors suffering the full process of retracting and redepositing their PDB entry. The test for obsoleting would then be the same as for a paper - that the change invalidates a fundamental interpretation of the data. 

All the best
  Martyn 

From: Pavel Afonine <[log in to unmask]>
To: [log in to unmask]
Sent: Sunday, 20 October 2013, 19:49
Subject: Re: [ccp4bb] Problematic PDBs

Hello,

just for the sake of completeness: this paper lists a bunch of known pathologies (I would not be surprised if they've been remediated by now):


Pavel


On Thu, Oct 17, 2013 at 6:51 AM, Lucas <[log in to unmask]> wrote:
Dear all,

I've been lecturing in a structural bioinformatics course where graduate students (always people without a crystallography background up to that point) are expected to understand the basics of how X-ray structures are obtained, so that they know what they are using in their bioinformatics projects. Practices include letting them manually build a segment from an excellent map and also using Coot to check problems in not-so-good structures.

I wonder if there's a list of problematic structures somewhere that I could use for that practice? Apart from a few I'm aware of because of (bad) publicity, what I usually do is an advanced search on PDB for entries with poor resolution and bound ligands, then check them manually, hopefully finding some examples of creative map interpretation. But it would be nice to have specific examples for each thing that can go wrong in a PDB construction.

Best regards,
Lucas