The PDB validation report is pretty simple to understand, and any reviewer who had looked at it would have seen the problems. Perhaps journals should make PDB validation reports a central part of the review process?
JPK
-----Original Message-----
From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of [log in to unmask]
Sent: Wednesday, September 07, 2016 11:33 AM
To: [log in to unmask]
Subject: [ccp4bb] RE: [ccp4bb] Another puzzle: 5gnn
Dear all,
when reading the original MR pitfall thread, my first reaction was: what the heck, one need not be an expert car mechanic to drive a car. However, when reading the paper by Weiss et al., it appeared that the original authors were not even able to determine the correct space group! After that, no amount of GUI clicking could save them from certain disaster.
Not all is lost, however: current validation programs are very powerful, and errors like this are usually caught at the first nightly Buster/pdb-redo/twilight run. Anyone able to fake a wrong structure such that it passes all validation tests (MolProbity, core packing, Rfree, Ramachandran, etc.) is so brilliant that he or she does not need to fake anything.
As Bernard Rupp mentioned, the battle to keep protein crystallography to the experts is lost. What programmers could and should do, however, is to summarize the results of MR or validation in a way that even the most ignorant non-experts can understand: "this solution is almost certainly wrong!" for MR or "this structure is wrong!" for validation. Immediately notifying the journal that they just published garbage may sensitize them to take validation seriously.
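To illustrate the kind of plain-language summary suggested here, a minimal Python sketch follows. The function name, the dictionary keys and, above all, the cutoffs are assumptions made up for this example; they are not official MolProbity or wwPDB thresholds. The input values are the ones Gerard quotes from Table 1 of the paper further down this thread.

```python
# Illustrative cutoffs only: assumptions for this sketch, not official
# MolProbity or wwPDB thresholds.
ALARM_CUTOFFS = {
    "clash_score": 40.0,
    "ramachandran_outliers_pct": 2.0,
    "rotamer_outliers_pct": 10.0,
}


def plain_verdict(metrics):
    """Return a one-line verdict and the list of metrics that raised an alarm."""
    alarms = [
        name
        for name, cutoff in ALARM_CUTOFFS.items()
        if metrics.get(name, 0.0) > cutoff
    ]
    if len(alarms) >= 2:
        return "this structure is wrong!", alarms
    if alarms:
        return "check this structure carefully", alarms
    return "no alarm raised by these checks", alarms


# Values reported in Table 1 of the paper, as quoted in this thread.
table1 = {
    "clash_score": 103.59,
    "ramachandran_outliers_pct": 10.0,
    "rotamer_outliers_pct": 25.51,
}
verdict, alarms = plain_verdict(table1)
print(verdict, alarms)
```

With any sensible choice of cutoffs, all three metrics for this entry would raise an alarm; the point is only that the message reaching the user is a sentence, not a table.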
Best,
Herman
-----Original Message-----
From: CCP4 bulletin board [mailto:[log in to unmask]] On Behalf Of Gerard Bricogne
Sent: Wednesday, 7 September 2016 16:20
To: [log in to unmask]
Subject: [ccp4bb] Another puzzle: 5gnn
Dear all,
While the thread on "Another MR pi(t)fall" is still lukewarm, and the discussion it triggered hopefully still present in readers' minds, I would like to bring another puzzling entry to the BB's attention.
When reviewing on Monday the weekend's BUSTER runs on the last batch of PDB depositions, Andrew Sharff (here) noticed that entry 5gnn had been flagged as giving much larger R-values when re-refined with BUSTER (0.3590/0.3880) than the deposited ones (0.2210/0.2500). This led us to carry out some investigation of that entry.
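The check that raised the flag can be sketched in a few lines of Python. This is not how BUSTER or any other pipeline does it: the function name and the tolerance of 0.05 are assumptions for illustration, and the two pairs are taken to be R/Rfree as quoted above.

```python
def r_value_gap(deposited, rerefined, tolerance=0.05):
    """Compare (R, Rfree) pairs; flag the entry if either rises by more than tolerance.

    The tolerance is a hypothetical value chosen for this sketch.
    """
    gap_work = rerefined[0] - deposited[0]
    gap_free = rerefined[1] - deposited[1]
    flagged = gap_work > tolerance or gap_free > tolerance
    return gap_work, gap_free, flagged


# Deposited and re-refined values for 5gnn as quoted in this message.
gap_work, gap_free, flagged = r_value_gap((0.2210, 0.2500), (0.3590, 0.3880))
print(f"R gap {gap_work:.3f}, Rfree gap {gap_free:.3f}, flagged: {flagged}")
```

Both values rise by about 0.14 on re-refinement, far beyond what differences between refinement programs would normally explain.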
The deposited coordinates were flagged by BUSTER as having 4602 bond-length violations, the worst being 205.8 sigmas, and other wild outliers. The initial Molprobity analysis gave a clash score of near 100, placing it in the 0-th percentile. The PDB validation report is dominantly red and ochre, with only a few wisps of green.
Examining the model and map with Coot showed "waters, waters everywhere", disconnected density, and molecules separated by large layers of water. The PDB header lists hundreds of water molecules in REMARK 525 records that are further than 5.0 Angs from the nearest chain, some of them up to 15 Angs away.
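A rough version of the distant-water check can be written in plain Python from the fixed-column ATOM/HETATM records of a PDB file. This is a sketch under stated assumptions: the function names are invented for the example, and it ignores crystal symmetry, so a water that sits close to a symmetry mate of the chain would still be reported here, unlike in REMARK 525.

```python
import math


def parse_atoms(pdb_text):
    """Extract (residue name, (x, y, z)) from ATOM/HETATM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            res_name = line[17:20].strip()
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((res_name, xyz))
    return atoms


def distant_waters(pdb_text, cutoff=5.0):
    """List waters (HOH) whose nearest non-water atom is beyond cutoff (in Angstroms).

    Symmetry mates are NOT generated: an assumption that simplifies the sketch.
    """
    atoms = parse_atoms(pdb_text)
    waters = [xyz for name, xyz in atoms if name == "HOH"]
    others = [xyz for name, xyz in atoms if name != "HOH"]
    result = []
    for w in waters:
        nearest = min((math.dist(w, o) for o in others), default=math.inf)
        if nearest > cutoff:
            result.append((w, nearest))
    return result
```

Calling `distant_waters(open("model.pdb").read())` on a local copy of an entry (the file name is a placeholder) returns the coordinates of each offending water together with its distance to the nearest non-water atom.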
The cartoons on the NCBI server at
http://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=142582&dps=1
show random coils threaded up and down through beta-strands, and the one on the RCSB PDB site at
http://www.rcsb.org/pdb/explore.do?structureId=5GNN
also shows mostly random coil, with only very few and very short segments of secondary structure.
In reciprocal space, an oddness of a different kind is that if one looks at the mtz file, the amplitudes and their sigmas are on a very small scale. However the STARANISO display shows a smooth and plausible distribution of I/sig(I) to the full nominal resolution limit of 1.6A.
Looking at the publication associated with this entry
http://www.ncbi.nlm.nih.gov/pubmed/27492925
indicates that the structure was solved by MR from a model obtained from a structure prediction server (I-TASSER). No further details are given, even in the Supplemental Material. Table 1 does report a MolProbity clash score of 103.59, as well as 10% Ramachandran outliers and 25.51% rotamer outliers. It also contains a mention of a twinning operator -h, -k, l with a twinning fraction of 0.5, although there is no mention of it either in the text or in the PDB file.
I will follow my own advice and resist the temptation of calling this "the end of civilisation as we know it", but this is startling.
Perhaps we have over-advertised to the non-experts the few successes of structure prediction programs as reliable sources of MR models and thus created unwarranted optimism, besides the usual exaggeration of the degree to which X-ray crystallography has become a push-button commodity that can deliver results to untrained users. What is also disconcerting is that the abundant alarm bells that rang along the way (the MolProbity clash score and geometry reports, the contents of the PDB validation report, and simple common sense when examining electron density and model) failed to make anyone involved take notice that there was something seriously wrong.
This case seems to bring to the forefront even more vividly than 4nl6 and 4nl7 some collective issues that we face. Here the problem is not one of contamination of a protein prep resulting in crystals of "the wrong protein": there is also a more diffuse contamination by deficiencies of judgement, expertise and vigilance at several consecutive stages, including refereeing and publication.
Validation is a hot topic at the moment, and this may serve as a concrete example that some joined-up thinking and action is indeed a matter of urgency, and that extreme scenarios of things going wrong do not exist solely in the imaginations of obsessive-compulsive/paranoid validators.
I am grateful to several colleagues for correspondence and discussions on the matters touched upon in this message.
With best wishes,
Gerard
--
===============================================================
* *
* Gerard Bricogne [log in to unmask] *
* *
* Global Phasing Ltd. *
* Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
* Cambridge CB3 0AX, UK Fax: +44-(0)1223-366889 *
* *
===============================================================