I wholeheartedly agree with George!
A single-page summary of any given PDBID, with some of the less obvious
warnings/error codes explained at the bottom, would speak volumes to almost
any reviewer. As George pointed out, it's very unlikely that reviewers
will bother to look at diffraction images, doubly so since (I think) many if
not most of the people who review the 'hot' papers are not actually
crystallographers themselves. This is to be expected, given that
'high-impact' papers nowadays are generally judged by their merit to the
field of biological research, so the people who review them (at least in
theory) are more likely to be biologists. Statistically speaking this seems
to make sense as well, since there are a few thousand crystallographers in
the world as compared to a few hundred thousand biologists.
Sorry for making this incredibly long thread even longer.
Artem
-----Original Message-----
From: CCP4 bulletin board On Behalf Of George M. Sheldrick
Sent: Saturday, August 18, 2007 9:27 AM
Subject: Re: [ccp4bb] The importance of USING our validation tools
There are good reasons for preserving frames, but most of all for the
crystals that appeared to diffract but did not lead to a successful
structure solution, publication, and PDB deposition. Maybe in the future
there will be improved data processing software (for example to integrate
non-merohedral twins) that will enable good structures to be obtained from
such data. At the moment most such data is thrown away. However, forcing
everyone to deposit their frames each time they deposit a structure with
the PDB would be a thorough nuisance and major logistic hassle.
It is also a complete illusion to believe that the reviewers for Nature
etc. would process or even look at frames, even if they could download
them with the manuscript.
For small molecules, many journals require an 'ORTEP plot' to be submitted
with the paper. As older readers who have experienced Dick Harlow's 'ORTEP
of the year' competition at ACA Meetings will remember, even a viewer
with little experience of small-molecule crystallography can see from the
ORTEP plot within seconds if something is seriously wrong, and many
non-crystallographic referees for e.g. the journal Inorganic Chemistry
can even make a good guess as to what is wrong (e.g. wrong element assigned
to an atom). It would be nice if we could find something similar for
macromolecules that the author would have to submit with the paper. One
immediate bonus is that the authors would look at it carefully
themselves before submitting, which could lead to an improvement of the
quality of structures being submitted. My suggestion is that the wwPDB
might provide, say, a one-page diagnostic summary when they allocate each
PDB ID that could be used for this purpose.
A good first pass at this would be the output that the MolProbity server
http://molprobity.biochem.duke.edu/ sends when it is given a PDB file. It
starts with a few lines of summary in which bad things are marked red
and the structure is assigned to a percentile: a percentile of 6% means
that roughly 94% of the structures in the PDB with a similar resolution are
'better' and roughly 6% are 'worse'. This summary can be understood with very
little crystallographic background and a similar summary can
of course be produced for NMR structures. The summary is followed by
diagnostics for each residue; normally, if the summary looks good, it
would not be necessary for the editor or referee to look at the rest.
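As a rough illustration of what such a percentile means, here is a minimal
Python sketch. It is not MolProbity's actual algorithm: the function names,
the 0.25 A resolution window, the "lower score is better" convention and the
sample numbers are all assumptions made up for this example.

```python
def percentile_rank(score, peer_scores):
    """Percentage of peer structures with a worse (higher) score.

    Assumes a quality score where lower is better; this convention is an
    assumption for the sketch.
    """
    if not peer_scores:
        raise ValueError("need at least one peer structure")
    worse = sum(1 for s in peer_scores if s > score)
    return 100.0 * worse / len(peer_scores)


def peers_at_similar_resolution(entries, resolution, window=0.25):
    """Scores of entries whose resolution is within +/- window angstroms.

    `entries` is a list of (resolution, score) pairs; the window size is a
    hypothetical choice, not a documented MolProbity parameter.
    """
    return [score for res, score in entries if abs(res - resolution) <= window]


# Hypothetical (resolution, score) pairs standing in for PDB entries.
entries = [(2.0, 5.0), (2.1, 8.0), (2.2, 12.0), (1.9, 20.0), (3.5, 40.0)]
peers = peers_at_similar_resolution(entries, resolution=2.0)
rank = percentile_rank(15.0, peers)
print(f"percentile: {rank:.0f}")
```

With these made-up numbers, only one of the four peers near 2.0 A scores
worse than 15.0, so a low percentile simply says that most comparable
structures did better.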
Although this server was intended to help us to improve our structures
rather than detect manipulated or fabricated data, I asked it for a
report on 2HR0 to see what it would do (probably many other people were
trying to do exactly the same, as the server was slower than usual).
Although the structure got poor marks on most tests, MolProbity
generously assigned it overall to the 6th percentile; I suppose that
this is about par for structures submitted to Nature (!). However there
was one feature that was unlike anything I have ever seen before
although I have fed the MolProbity server with some pretty ropey PDB
files in the past: EVERY residue, including EVERY WATER molecule, either
made at least one bad contact or was a Ramachandran outlier or was a
rotamer outlier (or more than one of these). This surely would ring
all the alarm bells!
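A check of this kind is easy to automate once per-residue diagnostics are
available. The sketch below assumes the diagnostics have already been parsed
into simple records; the ResidueReport class and its field names are
hypothetical and do not correspond to any real MolProbity output format.

```python
from dataclasses import dataclass


@dataclass
class ResidueReport:
    """Hypothetical per-residue diagnostics (field names are assumptions)."""
    name: str
    bad_contact: bool = False
    ramachandran_outlier: bool = False
    rotamer_outlier: bool = False

    def flagged(self):
        # True if the residue fails at least one of the three tests.
        return (self.bad_contact
                or self.ramachandran_outlier
                or self.rotamer_outlier)


def fraction_flagged(reports):
    """Fraction of residues (waters included) with at least one problem."""
    if not reports:
        raise ValueError("no residues to check")
    return sum(1 for r in reports if r.flagged()) / len(reports)


# Made-up example: two of the four entries fail at least one test.
reports = [
    ResidueReport("ALA 12"),
    ResidueReport("LYS 13", rotamer_outlier=True),
    ResidueReport("GLY 14"),
    ResidueReport("HOH 501", bad_contact=True),
]
share = fraction_flagged(reports)
print(f"{share:.0%} of residues flagged")
if share == 1.0:
    print("Every residue is flagged: this should ring the alarm bells.")
```

A fraction close to 1.0 would be the pattern described above; where exactly
to set a warning threshold is a judgement call for the validation experts.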
So I would suggest that the wwPDB could coordinate, with the help of the
validation experts, software to produce a short summary report that
would be automatically provided in the same email that allocates the PDB
ID. This email could make the strong recommendation that the report file
be submitted with the publication, and maybe in the fullness of time
even the Editors of high profile journals would require this report for
the referees (or even read it themselves!). To gain acceptance for such
a procedure the report would have to be short and comprehensible to
non-crystallographers; the MolProbity summary is an excellent first
pass in this respect, but (partially with a view to detecting
manipulation of the data) a couple of tests could be added based on the
data statistics as reported in the PDB file (or, even better, the
reflection data if submitted). Most of the necessary software already
exists, much of it produced by regular readers of this bb; it just needs
to be adapted so that the results can be digested by referees and
editors with little or no crystallographic experience. And most important,
a PDB ID should always be released only in combination with such a
summary.
George
Prof. George M. Sheldrick FRS
Dept. Structural Chemistry,
University of Goettingen,
Tammannstr. 4,
D37077 Goettingen, Germany
Tel. +49-551-39-3021 or -3068
Fax. +49-551-39-2582