I was at the PDB from 1974 - 1998 and closely involved with
processing entries 15 to ~9000. We also designed the "PDB format".
My replies were based on what was done for those 24 years and I
cannot address what is currently being done at the PDB.

I do not know if the current PDB staff follows this bulletin board
and I can only suggest that you take this matter up with the current
PDB management, the community, and the PDB advisory board.

Frances

=====================================================
****                Bernstein + Sons
*   *       Information Systems Consultants
****     5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
**** *            Frances C. Bernstein
  *   ***      [log in to unmask]
 ***     *
  *   *** 1-631-286-1339    FAX: 1-631-286-1999
=====================================================

On Fri, 19 Sep 2008, Linda Brinen wrote:

> I'm actually pleased to read your response and interpretation of what is
> allowable and why, Frances. However, it's in pretty stark contrast to what I
> was told about 18 months ago when I struggled (and eventually lost) to
> preserve a numbering scheme that had a long standing historical and
> literature precedence when submitting a new structure to the PDB.
>
> This was a two-domain protein; the first domain - according to historical
> numbering - had a number plus a letter code to indicate the domain; the
> second domain, which started again with the number 1 - had no letter code.
> We were told that that was not allowed. We wanted to preserve insertions and
> deletions as well, but were also strongly discouraged, if not flat out told
> we could not. While it's not usually prudent to quote offline e-mail
> exchanges, I'm going to snip pertinent pieces of the discussion (I'm leaving
> the original spelling errors and text bolding in place) with no indication
> of the annotator who wrote these guidelines to our group.
> Here's part of one of the many 'exchanges' that was had:
>
> "I understand your point and that certain close research communities have
> certain habits and traditions but the PDB serves to the whole community of
> structural biology, bioinformatics, to many educators, students... In all
> these cases, the simplest possible numbering of sequences, ideally numbering
> identical to the numbering used by the UNP sequence database, is far the most
> useful because easiest to understand. I do not say this because it is in our
> manuals and help pages but because I have eight years of experience with
> annotation of all kinds of structures. I would therefore very much like to
> ask you to reconsider the way how you number your protein, your numbering
> schema is *interpretation* more than a mere labeling schema. Needles to say,
> no sequence numbering can satisfy this ambition...from my point of view,
> especially the jump from 96P back to 1 will cause a lot of confusion and
> misunderstanding....look at the problem from a standpoint of a general
> naturalist instead of an narrow protease community"
>
>
> This left us with a mandated 'start from 1 and number sequentially' format
> that did exactly the opposite of what you, Frances, correctly mention as
> important in any numbering scheme: preserve relationships with other
> proteins. We've had to resort to providing 'translation tables' that
> identify what people were expecting to see as numbers for active site
> residues which now have new and non-sensical numbering. Is it the end of
> the world? Of course not. But neither is it necessarily the best scientific
> or logical presentation.
>
> At the risk of inciting a rather....animated...dialogue on this topic, what
> has your experience been with this kind of thing (i.e., were we just
> unlucky??) and do current practices make sense and serve the community??
>
> -Linda
>
>
> Frances C.
> Bernstein wrote:
>> All entries list atoms starting at the N-terminus (or 5') so
>> connectivity goes in the order of the atoms in the file -
>> obviously with the possibility of unconnected portions
>> where the density is inadequate.
>>
>> The entire philosophy of allowing numbering other than 1 - N
>> had to do with preserving relationships with other proteins.
>> The most common use relates to having an initial sequence 1 - N
>> and then a similar sequence from another species with insertions
>> and/or gaps. People wanted to be able to talk about the active
>> site (which was preserved) using the same residue numbers.
>> Negative numbers came up with additions at the N-terminus.
>> Offhand, I don't recall why descending numbers were used but
>> I believe that there is at least one such entry.
>>
>> Frances
>>
>> On Fri, 19 Sep 2008, Ian Tickle wrote:
>>
>>>
>>> But what connectivity would be implied by descending numbers: the order
>>> in the file or the order of the numbering? I assume the former,
>>> otherwise what would be the point of having descending numbering? And I
>>> wonder how many programs would baulk at it (or even at ascending
>>> negative numbers?).
>>>
>>> -- Ian
>>>
>>>> -----Original Message-----
>>>> From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Frances C. Bernstein
>>>> Sent: 19 September 2008 16:44
>>>> To: Todd Geders
>>>> Cc: [log in to unmask]
>>>> Subject: Re: [ccp4bb] Non-sequential residue numbering?
>>>>
>>>> As long as each residue within a chain has a unique identifier
>>>> (residue number plus insertion code), there is no restriction
>>>> on numbering.
>>>> The numbers can be in ascending or descending
>>>> order, non-sequential, and even negative.
>>>>
>>>> Frances
>>>>
>>>> On Fri, 19 Sep 2008, Todd Geders wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I have a structure from a non-natural fusion of the truncated C-terminus of
>>>>> one protein with the truncated N-terminus of another. For the deposition, we
>>>>> want to keep the numbering as found in the separate proteins. It looks
>>>>> something like this:
>>>>>
>>>>>               1          12
>>>>>               |          |
>>>>> ....HWVCKDIALLMCFFLEEMSEEP....
>>>>>     |        |
>>>>>     754      763
>>>>>
>>>>> At no point is there an overlap in numbering (i.e. the N-terminal residue
>>>>> number is higher than the C-terminal residue number).
>>>>>
>>>>> Is this numbering scheme supported by the PDB standard? Thus far, all of the
>>>>> software seems to handle it (refmac, Coot, PyMOL, pdb_extract, PDB precheck &
>>>>> validation, etc).
>>>>>
>>>>> Can anyone see a reason to not deposit with this non-sequential residue
>>>>> numbering?
>>>>>
>>>>> ~Todd
> --
> Linda S. Brinen
> Adjunct Assistant Professor
> Dept of Cellular & Molecular Pharmacology and
> The Sandler Center for Basic Research in Parasitic Diseases
> Phone: 415-514-3426 FAX: 415-502-8193
> E-mail: [log in to unmask]
> QB3/Byers Hall 508C
> 1700 4th Street
> University of California
> San Francisco, CA 94158-2550
> USPS:
> UCSF MC 2550
> Byers Hall Room 508
> 1700 4th Street
> San Francisco, CA 94158
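
The rule quoted in this thread, that each residue within a chain needs a
unique identifier (residue number plus insertion code), can be checked
mechanically. What follows is a minimal sketch, not PDB or CCP4 software.
It assumes fixed-column PDB ATOM/HETATM records (chain ID in column 22,
residue number in columns 23-26, insertion code in column 27) and treats
any change of residue name or identifier between consecutive atoms as the
start of a new residue. The names duplicate_residue_ids and atom are made
up for illustration, and the residues in the demo are hypothetical.

```python
from collections import Counter


def duplicate_residue_ids(pdb_lines):
    """Return (chain, number, insertion code) identifiers used by more
    than one residue. Assumes fixed-column PDB ATOM/HETATM records."""
    residues = []      # one entry per residue, in file order
    previous = None
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")) or len(line) < 27:
            continue
        res_name = line[17:20]
        chain = line[21]
        res_seq = int(line[22:26])   # negative numbers parse fine
        i_code = line[26]            # blank when there is no insertion code
        current = (chain, res_seq, i_code, res_name)
        if current != previous:      # a new residue starts here
            residues.append(current)
            previous = current
    counts = Counter((c, n, i) for c, n, i, _ in residues)
    return sorted(key for key, count in counts.items() if count > 1)


def atom(serial, res_name, chain, res_seq, i_code=" "):
    """Hypothetical helper: build a minimal CA-only ATOM record."""
    return f"ATOM  {serial:5d}  CA  {res_name:>3} {chain}{res_seq:4d}{i_code}"


# Hypothetical fusion-style chain: high numbers first, then a restart at 1.
fusion = [
    atom(1, "LEU", "A", 762),
    atom(2, "LEU", "A", 763),
    atom(3, "MET", "A", 1),
    atom(4, "CYS", "A", 2),
]
# Reusing number 1 for a different residue in the same chain is a clash.
clash = fusion + [atom(5, "LYS", "A", 1)]

print(duplicate_residue_ids(fusion))
print(duplicate_residue_ids(clash))
```

Under these assumptions, non-sequential, descending, and negative numbers
all pass, as do residues that differ only in insertion code; only an
identifier reused within one chain is reported.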