JISCMail - CCP4BB Archives

I was at the PDB from 1974 - 1998 and closely involved with
processing entries 15 to ~9000.  We also designed the "PDB
format".  My replies were based on what was done for those 24
years and I cannot address what is currently being done at the PDB.

I do not know if the current PDB staff follows this bulletin
board and I can only suggest that you take this matter up
with the current PDB management, the community, and the PDB
advisory board.

                              Frances

=====================================================
****                Bernstein + Sons
*   *       Information Systems Consultants
****    5 Brewster Lane, Bellport, NY 11713-2803
*   * ***
**** *            Frances C. Bernstein
   *   ***      [log in to unmask]
  ***     *
   *   *** 1-631-286-1339    FAX: 1-631-286-1999
=====================================================

On Fri, 19 Sep 2008, Linda Brinen wrote:

> I'm actually pleased to read your response and interpretation of what is
> allowable and why, Frances. However, it's in pretty stark contrast to what I
> was told about 18 months ago when I struggled (and eventually lost) to
> preserve a numbering scheme that had a long-standing historical and
> literature precedent when submitting a new structure to the PDB.
>
> This was a two-domain protein; the first domain, according to historical
> numbering, had a number plus a letter code to indicate the domain; the
> second domain, which started again with the number 1, had no letter code.
> We were told that that was not allowed. We wanted to preserve insertions and 
> deletions as well, but were also strongly discouraged, if not flat out told 
> we could not. While it's not usually prudent to quote offline e-mail 
> exchanges, I'm going to snip pertinent pieces of the discussion (I'm leaving 
> the original spelling errors and text bolding in place)  with no indication 
> of the annotator who wrote these guidelines to our group.  Here's part of one 
> of the many 'exchanges' that was had:
>
> "I understand your point and that certain close research communities have 
> certain habits and traditions but the PDB serves to the whole community of 
> structural biology, bioinformatics, to many educators, students... In all 
> these cases, the simplest possible numbering of sequences, ideally numbering 
> identical to the numbering used by the UNP sequence database, is far the most 
> useful because easiest to understand.  I do not say this because it is in our 
> manuals and help pages but because I have eight years of experience with 
> annotation of all kinds of structures. I would therefore very much like to 
> ask you to reconsider the way how you number your protein, your numbering 
> schema is *interpretation* more than a mere labeling schema. Needles to say, 
> no sequence numbering can satisfy this ambition...from my point of view, 
> especially the jump from 96P back to 1 will cause a lot of confusion and 
> misunderstanding....look at the problem from a standpoint of a general 
> naturalist instead of an narrow protease community"
>
>
> This left us with a mandated 'start from 1 and number sequentially' format 
> that did exactly the opposite of what you, Frances, correctly mention as 
> important in any numbering scheme: preserve relationships with other 
> proteins.  We've had to resort to providing 'translation tables' that 
> identify what people were expecting to see as numbers for active site 
> residues which now have new and nonsensical numbering.  Is it the end of
> the world? Of course not. But neither is it necessarily the best scientific 
> or logical presentation.
>
> At the risk of inciting a rather....animated...dialogue on this topic, what 
> has your experience been with this kind of thing (i.e., were we just 
> unlucky??) and do current practices make sense and serve the community??
>
> -Linda
>
>
> Frances C. Bernstein wrote:
>> All entries list atoms starting at the N-terminus (or 5') so
>> connectivity goes in the order of the atoms in the file -
>> obviously with the possibility of unconnected portions
>> where the density is inadequate.
>> 
>> The entire philosophy of allowing numbering other than 1 - N
>> had to do with preserving relationships with other proteins.
>> The most common use relates to having an initial sequence 1 - N
>> and then a similar sequence from another species with insertions
>> and/or gaps.  People wanted to be able to talk about the active
>> site (which was preserved) using the same residue numbers.
>> Negative numbers came up with additions at the N-terminus.
>> Offhand, I don't recall why descending numbers were used but
>> I believe that there is at least one such entry.
>>
>>                        Frances
>> 
>> On Fri, 19 Sep 2008, Ian Tickle wrote:
>> 
>>> 
>>> But what connectivity would be implied by descending numbers: the order
>>> in the file or the order of the numbering?  I assume the former,
>>> otherwise what would be the point of having descending numbering?  And I
>>> wonder how many programs would baulk at it (or even at ascending
>>> negative numbers?).
>>> 
>>> -- Ian
>>> 
>>>> -----Original Message-----
>>>> From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Frances C. Bernstein
>>>> Sent: 19 September 2008 16:44
>>>> To: Todd Geders
>>>> Cc: [log in to unmask]
>>>> Subject: Re: [ccp4bb] Non-sequential residue numbering?
>>>> 
>>>> As long as each residue within a chain has a unique identifier
>>>> (residue number plus insertion code), there is no restriction
>>>> on numbering.  The numbers can be in ascending or descending
>>>> order, non-sequential, and even negative.
>>>>
>>>>                         Frances
>>>> 
>>>> 
>>>> On Fri, 19 Sep 2008, Todd Geders wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> I have a structure from a non-natural fusion of the truncated C-terminus
>>>>> of one protein with the truncated N-terminus of another.  For the
>>>>> deposition, we want to keep the numbering as found in the separate
>>>>> proteins.  It looks something like this:
>>>>>
>>>>>               1         12
>>>>>               |          |
>>>>> ....HWVCKDIALLMCFFLEEMSEEP....
>>>>>     |        |
>>>>>    754      763
>>>>> 
>>>>> At no point is there an overlap in numbering (i.e. the N-terminal
>>>>> residue number is higher than the C-terminal residue number).
>>>>> 
>>>>> Is this numbering scheme supported by the PDB standard?  Thus far, all
>>>>> of the software seems to handle it (refmac, Coot, PyMOL, pdb_extract,
>>>>> PDB precheck & validation, etc.).
>>>>> 
>>>>> Can anyone see a reason to not deposit with this non-sequential residue
>>>>> numbering?
>>>>> 
>>>>> ~Todd
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>
>
> -- 
> Linda S. Brinen
> Adjunct Assistant Professor
> Dept of Cellular & Molecular Pharmacology and
> The Sandler Center for Basic Research in Parasitic Diseases
> Phone: 415-514-3426 FAX: 415-502-8193
> E-mail: [log in to unmask]
> QB3/Byers Hall 508C
> 1700 4th Street
> University of California
> San Francisco, CA 94158-2550
> USPS:
> UCSF MC 2550
> Byers Hall Room 508
> 1700 4th Street
> San Francisco, CA 94158 
>
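
For readers who want to test a file against the rule quoted in this thread
(each residue within a chain needs a unique residue number plus insertion
code; the ordering is otherwise unrestricted), here is a minimal Python
sketch. It assumes the fixed-column PDB layout for ATOM/HETATM records
(chain ID in column 22, residue number in columns 23-26, insertion code in
column 27) and treats a run of consecutive atom lines with the same
identifier as one residue. The helper names `atom_line`, `residue_ids` and
`duplicate_ids` are made up for this illustration and are not part of any
PDB or CCP4 tool.

```python
from collections import Counter


def atom_line(serial, name, res_name, chain, res_seq, icode=" "):
    """Build a fixed-column ATOM record with dummy coordinates (illustration only)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}{icode}   "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}"
    )


def residue_ids(pdb_lines):
    """Return (chain, residue number, insertion code) per residue, in file order.

    Consecutive atom lines sharing an identifier are collapsed into one residue.
    """
    ids = []
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        key = (line[21], int(line[22:26]), line[26].strip())
        if not ids or ids[-1] != key:
            ids.append(key)
    return ids


def duplicate_ids(ids):
    """Return identifiers that occur for more than one residue."""
    counts = Counter(ids)
    return [key for key, n in counts.items() if n > 1]


if __name__ == "__main__":
    # A fusion numbered like the one described in the thread: ...762, 763, 1, 2...
    fusion = [
        atom_line(1, "CA", "LEU", "A", 762),
        atom_line(2, "CA", "LEU", "A", 763),
        atom_line(3, "CA", "MET", "A", 1),
        atom_line(4, "CA", "CYS", "A", 2),
    ]
    print(residue_ids(fusion))
    print(duplicate_ids(residue_ids(fusion)))
```

Under these assumptions, descending, non-sequential and negative numbers
all pass; only a repeated number-plus-insertion-code pair within one chain
is reported. The file order is kept as is, since connectivity follows the
order of the atoms in the file, as stated earlier in the thread.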