There is a collection of posts (unfortunately with a number of spam
messages) at
http://wwpdb-remediation.rutgers.edu/mail-archive/
with various comments. Although I'm not familiar with the internal
workings of this remediation program, it seems indeed that the PDB
format is now largely auto-generated from the internally used
mmCIF. Unfortunately in my experience (having had a look at a few dozen
random entries of the new PDB files) this means that some of the new
PDB files of old entries will look very different from what you/we
deposited several years ago. The format seems better (internally
consistent) but the content has sometimes suffered.
But I guess there is always room for friction when one side is mainly
interested in data format, storage and databases and the other mainly
interested in the crystallographic content. Finding a good compromise
between those two groups of experts is non-trivial.
At least the new databases will always have a link to the original
version of the PDB file - although it still means I can no longer
search for an author name MUELLER (German U-umlaut transliterated into
proper ASCII), since the PDB files now contain MULLER (because
PUBMED isn't able to properly translate non-ASCII names ...). Or an
analysis of programs used for structure solution will show a very
different distribution - since the information has been significantly
changed.
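To make the author-name problem concrete, here is a minimal Python
sketch. It is only an illustration, not what the PDB or PUBMED actually
run: the mapping table and both function names are assumptions made up
for this example. It contrasts the conventional German transliteration
with simply dropping the diacritic.

```python
import unicodedata

# Conventional German transliteration into plain ASCII.
# (Hypothetical helper table for illustration only.)
GERMAN_ASCII = {
    "\u00c4": "AE", "\u00d6": "OE", "\u00dc": "UE",
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue",
    "\u00df": "ss",
}


def transliterate_german(name: str) -> str:
    """Replace umlauts and sharp s with their ASCII letter pairs."""
    return "".join(GERMAN_ASCII.get(ch, ch) for ch in name)


def strip_diacritics(name: str) -> str:
    """Decompose to NFKD and drop the combining marks (lossy)."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


name = "M\u00fcller"
print(transliterate_german(name).upper())  # MUELLER
print(strip_diacritics(name).upper())      # MULLER
```

A search for the first spelling will not match a file that stores the
second, which is the mismatch described above.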
Anyway, have a look at your favourite PDB file with the attached
script
./pdb23.sh 1abc
It is quite interesting sometimes. I haven't checked the mmCIF files -
maybe they are much better (as a 'hint' from the database people to
the crystallographers to stop using PDB format and switch to mmCIF,
maybe?).
Cheers
Clemens
On Sat, Jul 21, 2007 at 12:05:35PM -0700, Ethan A Merritt wrote:
> On Saturday 21 July 2007 11:12, Joe Krahn wrote:
> > we all use in our daily research. They don't even want to keep the PDB
> > format at all. Its primary purpose now is for structural biologists.
>
> That is inevitable. The PDB format is simply not capable of representing
> the complexities of current crystallographic models, and will only become
> more obsolete as the state of the art progresses. Because it is so
> widespread, it will remain a legacy format for import/export into programs
> that are not up to the current crystallographic state of the art. Yes,
> that means it will largely be used by non-crystallographers to import
> and view structures.
>
> Thus I think the writing is on the wall that the PDB format as a primary
> working medium in crystallography is on its deathbed. Of course it may
> linger there for a long while yet, and may be poked at from time to time
> in order to stave off its final expiration.
>
> Having said that, I don't understand the motivation for changing this
> legacy format to something that the legacy programs will not recognize.
> That indeed seems self-defeating.
>
> Ethan Merritt
>
>
>
> > The new PDB format (version 3) has a lot of very useful improvements,
> > and an update is long overdue. However, I am irate that RCSB chose NOT
> > to use the ACA meeting to discuss the changes. Instead, the format is
> > being put into production at the same time as the ACA meeting. It is
> > essentially stating that opinions expressed at the ACA do not count.
> > There was a lot of conflict at their last attempt at an update. Instead
> > of working to better involve the structural biologist community, I feel
> > that they are intentionally discounting our interests because working
> > with the user community is too much effort.
> >
> > Unfortunately, structural biologists generally do not want to spend time
> > arguing about file formats, while computer scientists can carry on for
> > weeks over minor details. This change is going to affect all of us. If
> > you have concerns about the new format that have not been addressed, it
> > is important to take action now. The PDB format is not just their
> > personal database format (that's what mmCIF is for), but the format that
> > we all use in our daily research. They don't even want to keep the PDB
> > format at all. Its primary purpose now is for structural biologists. It
> > is essential that we be part of the decision making process.
> >
> > I just sent the following letter to the wwPDB, which is where
> > comments about the new format are supposed to go. If you will be at the
> > ACA meeting, I encourage you to complain loudly.
> >
> > Joe Krahn
> >
> > -----------------------------------------------------------------------
> > To: [log in to unmask]
> > Subject: The new PDB format is WRONG.
> >
> > It seems obvious to me that the RCSB and wwPDB worked on the new format
> > to consider database users' needs, but have intentionally ignored the rest
> > of the user community. RCSB manages mmCIF for database purposes, and has
> > declared a lack of interest in even keeping the PDB format. Obviously,
> > the primary purpose of the PDB format is for structural biologists
> > working with individual structures, and not database users.
> >
> > Most of the updates are quite positive and beneficial, but I think that
> > some changes are detrimental. My only serious complaint is that RCSB,
> > and now wwPDB, seem to be ignoring the interests of much of the
> > scientific community which they are supposed to be serving. All that I
> > ask for is appropriate inclusion of all of the user community. This is a
> > big change that will affect thousands of people. We should ensure that
> > it is the best possible format update before we all have to expend a
> > huge effort to deal with it.
> >
> > I have seen many comments about the format by well known
> > crystallographers ignored. One example is the use of SegID. Most
> > structural biologists have favored it for years, but RCSB continued to
> > deny us, on grounds that it is not "well defined". It would be better to
> > make a better definition, and allow it to be used to group together
> > non-covalent groups, such as waters with a specific protein molecule.
> > This is important because the use of ChainID for non-polymers has been
> > banned, which also goes against the wishes of most users.
> >
> > The latest atom alignment rule change is also detrimental. RCSB has
> > totally broken the element alignment rules, on baseless grounds that it
> > was too hard to follow. The new change convolutes this rule even
> > further, and essentially follows an earlier attempt at IUPAC hydrogen
> > names that the community strongly rejected. At this point, the best
> > solution is probably to make it completely left justified. Again, my
> > main concern is not to follow my idea, but to ensure that the user
> > community gets a fair chance to participate in the final decision.
> >
> > Another problem is that the original meaning of HET groups continues to
> > be corrupted. ATOM records are for commonly occurring residues from a
> > list of standard residues. Water is obviously common, and should not
> > have been converted to a HET group. HET groups have NO relationship to
> > polymeric state. With water as a HET group, a proper PDB file for a
> > modeller with bulk solvent would require CONECT entries for every single
> > water. It is also important to emphasize that the HETNAM is the actual
> > unique ID, not the 3-letter code. The current hack is to treat
> > everything as an ATOM, which has a pre-determined connectivity. This
> > cannot continue forever, and we are already stuck with meaningless
> > 3-letter codes instead of useful 3-letter abbreviations. The unique
> > 3-letter code should be continued for now, but there should be an
> > emphasis on beginning to use the full HETNAM so that the inevitable
> > switch to non-unique 3-letter codes will not have a big impact.
> >
> > Thank you,
> > Joe Krahn
> >
>
> --
> Ethan A Merritt
> Biomolecular Structure Center
> University of Washington, Seattle 98195-7742
>
--
***************************************************************
* Clemens Vonrhein, Ph.D. vonrhein AT GlobalPhasing DOT com
*
* Global Phasing Ltd.
* Sheraton House, Castle Park
* Cambridge CB3 0AX, UK
*--------------------------------------------------------------
* BUSTER Development Group (http://www.globalphasing.com)
***************************************************************