I was at the PDB from 1974 - 1998 and closely involved with
processing entries 15 to ~9000. We also designed the "PDB
format". My replies were based on what was done for those 24
years and I cannot address what is currently being done at the PDB.
I do not know if the current PDB staff follows this bulletin
board and I can only suggest that you take this matter up
with the current PDB management, the community, and the PDB
advisory board.
Frances
=====================================================
**** Bernstein + Sons
* * Information Systems Consultants
**** 5 Brewster Lane, Bellport, NY 11713-2803
* * ***
**** * Frances C. Bernstein
* *** [log in to unmask]
*** *
* *** 1-631-286-1339 FAX: 1-631-286-1999
=====================================================
On Fri, 19 Sep 2008, Linda Brinen wrote:
> I'm actually pleased to read your response and interpretation of what is
> allowable and why, Frances. However, it's in pretty stark contrast to what I
> was told about 18 months ago when I struggled (and eventually lost) to
> preserve a numbering scheme that had a long-standing historical and
> literature precedent when submitting a new structure to the PDB.
>
> This was a two-domain protein; the first domain - according to historical
> numbering - had a number plus a letter code to indicate the domain; the
> second domain, which started again with the number 1, had no letter code.
> We were told that that was not allowed. We wanted to preserve insertions and
> deletions as well, but were also strongly discouraged, if not flat out told
> we could not. While it's not usually prudent to quote offline e-mail
> exchanges, I'm going to snip pertinent pieces of the discussion (I'm leaving
> the original spelling errors and text bolding in place) with no indication
> of the annotator who wrote these guidelines to our group. Here's part of one
> of the many 'exchanges' that was had:
>
> "I understand your point and that certain close research communities have
> certain habits and traditions but the PDB serves to the whole community of
> structural biology, bioinformatics, to many educators, students... In all
> these cases, the simplest possible numbering of sequences, ideally numbering
> identical to the numbering used by the UNP sequence database, is far the most
> useful because easiest to understand. I do not say this because it is in our
> manuals and help pages but because I have eight years of experience with
> annotation of all kinds of structures. I would therefore very much like to
> ask you to reconsider the way how you number your protein, your numbering
> schema is *interpretation* more than a mere labeling schema. Needles to say,
> no sequence numbering can satisfy this ambition...from my point of view,
> especially the jump from 96P back to 1 will cause a lot of confusion and
> misunderstanding....look at the problem from a standpoint of a general
> naturalist instead of an narrow protease community"
>
>
> This left us with a mandated 'start from 1 and number sequentially' format
> that did exactly the opposite of what you, Frances, correctly mention as
> important in any numbering scheme: preserve relationships with other
> proteins. We've had to resort to providing 'translation tables' that
> identify what people were expecting to see as numbers for active site
> residues which now have new and nonsensical numbering. Is it the end of
> the world? Of course not. But neither is it necessarily the best scientific
> or logical presentation.
>
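A translation table of the kind described above could be generated along
these lines. This is only a sketch: the function name `translation_table`
is made up for illustration, the 96-residue first domain is inferred from
the "96P" mentioned in the quoted exchange, the 3-residue second domain is
a placeholder, and insertions and deletions are ignored.

```python
def translation_table(domains):
    """Map sequential numbers (1..N) to historical labels.

    domains: list of (length, suffix) pairs, in chain order. Each domain
    restarts its historical numbering at 1 and appends its suffix.
    """
    table = {}
    sequential = 1
    for length, suffix in domains:
        for historical in range(1, length + 1):
            table[sequential] = f"{historical}{suffix}"
            sequential += 1
    return table


# Assumed layout: 96 residues labelled 1P..96P, then a second domain
# that restarts at 1 with no letter code (length 3 is a placeholder).
table = translation_table([(96, "P"), (3, "")])
print(table[96], table[97])  # prints: 96P 1
```

A real table would also have to carry any insertion codes and numbering
gaps from the historical scheme, which this sketch does not attempt.
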
> At the risk of inciting a rather....animated...dialogue on this topic, what
> has your experience been with this kind of thing (i.e., were we just
> unlucky??) and do current practices make sense and serve the community??
>
> -Linda
>
>
> Frances C. Bernstein wrote:
>> All entries list atoms starting at the N-terminus (or 5') so
>> connectivity goes in the order of the atoms in the file -
>> obviously with the possibility of unconnected portions
>> where the density is inadequate.
>>
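To illustrate the point about connectivity following file order, here is a
small sketch using the residue numbers from Todd's fusion example (754-763
followed by 1-12). The function name `neighbours_in_file_order` is
hypothetical, and the pairs it returns are only candidate links: a real
program would still need to check for breaks where density is missing.

```python
def neighbours_in_file_order(residue_numbers):
    """Pair each residue with the one that follows it in the list."""
    return list(zip(residue_numbers, residue_numbers[1:]))


# Residue numbers in the order they would appear in the file.
fusion = list(range(754, 764)) + list(range(1, 13))

by_file = neighbours_in_file_order(fusion)
by_number = neighbours_in_file_order(sorted(fusion))

print(by_file[9])     # prints: (763, 1)   the actual fusion junction
print(by_number[11])  # prints: (12, 754)  a link that does not exist
```

Sorting by residue number would join the two ends of the chain, which is
why the order of atoms in the file has to be the authority.
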
>> The entire philosophy of allowing numbering other than 1 - N
>> had to do with preserving relationships with other proteins.
>> The most common use relates to having an initial sequence 1 - N
>> and then a similar sequence from another species with insertions
>> and/or gaps. People wanted to be able to talk about the active
>> site (which was preserved) using the same residue numbers.
>> Negative numbers came up with additions at the N-terminus.
>> Offhand, I don't recall why descending numbers were used but
>> I believe that there is at least one such entry.
>>
>> Frances
>>
>> On Fri, 19 Sep 2008, Ian Tickle wrote:
>>
>>>
>>> But what connectivity would be implied by descending numbers: the order
>>> in the file or the order of the numbering? I assume the former,
>>> otherwise what would be the point of having descending numbering? And I
>>> wonder how many programs would baulk at it (or even at ascending
>>> negative numbers?).
>>>
>>> -- Ian
>>>
>>>> -----Original Message-----
>>>> From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Frances C. Bernstein
>>>> Sent: 19 September 2008 16:44
>>>> To: Todd Geders
>>>> Cc: [log in to unmask]
>>>> Subject: Re: [ccp4bb] Non-sequential residue numbering?
>>>>
>>>> As long as each residue within a chain has a unique identifier
>>>> (residue number plus insertion code), there is no restriction
>>>> on numbering. The numbers can be in ascending or descending
>>>> order, non-sequential, and even negative.
>>>>
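The uniqueness rule above can be checked mechanically. The following is a
minimal sketch, not an official PDB tool: it assumes fixed-column ATOM and
HETATM records (chain ID in column 22, residue number in columns 23-26,
insertion code in column 27), and the names `atom_line`, `residue_ids` and
`duplicate_ids` are invented for this example.

```python
from collections import Counter


def atom_line(serial, resname, chain, resseq, icode=" "):
    """Build a minimal CA-only ATOM record (hypothetical demo helper)."""
    return (f"ATOM  {serial:5d}  CA  {resname:>3s} {chain}{resseq:4d}{icode}"
            f"   {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


def residue_ids(pdb_lines):
    """Return (chain, number, insertion code) per residue, in file order."""
    ids = []
    for line in pdb_lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        key = (line[21], int(line[22:26]), line[26])
        # Consecutive atoms with the same key belong to one residue.
        if not ids or ids[-1] != key:
            ids.append(key)
    return ids


def duplicate_ids(pdb_lines):
    """Return identifiers that are reused by non-adjacent residues."""
    counts = Counter(residue_ids(pdb_lines))
    return sorted(key for key, n in counts.items() if n > 1)


# Descending jump (763 -> 1), a negative number and an insertion code.
ok = [
    atom_line(1, "LEU", "A", 763),
    atom_line(2, "MET", "A", 1),
    atom_line(3, "GLY", "A", -1),
    atom_line(4, "SER", "A", 96, "P"),
    atom_line(5, "THR", "A", 96),
]
print(duplicate_ids(ok))  # prints: []
```

Two adjacent residues that share an identifier would be merged by this
sketch and go unreported; catching that case would need the residue name
or atom names as well.
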
>>>> Frances
>>>>
>>>> On Fri, 19 Sep 2008, Todd Geders wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I have a structure from a non-natural fusion of the truncated C-terminus of
>>>>> one protein with the truncated N-terminus of another. For the deposition, we
>>>>> want to keep the numbering as found in the separate proteins. It looks
>>>>> something like this:
>>>>>
>>>>>               1          12
>>>>>               |          |
>>>>> ....HWVCKDIALLMCFFLEEMSEEP....
>>>>>     |        |
>>>>>    754      763
>>>>>
>>>>> At no point is there an overlap in numbering (i.e. the N-terminal residue
>>>>> number is higher than the C-terminal residue number).
>>>>>
>>>>> Is this numbering scheme supported by the PDB standard? Thus far, all of the
>>>>> software seems to handle it (refmac, Coot, PyMOL, pdb_extract, PDB precheck &
>>>>> validation, etc).
>>>>>
>>>>> Can anyone see a reason to not deposit with this non-sequential residue
>>>>> numbering?
>>>>>
>>>>> ~Todd
>>>
>
>
> --
> Linda S. Brinen
> Adjunct Assistant Professor
> Dept of Cellular & Molecular Pharmacology and
> The Sandler Center for Basic Research in Parasitic Diseases
> Phone: 415-514-3426 FAX: 415-502-8193
> E-mail: [log in to unmask]
> QB3/Byers Hall 508C
> 1700 4th Street
> University of California
> San Francisco, CA 94158-2550
> USPS:
> UCSF MC 2550
> Byers Hall Room 508
> 1700 4th Street
> San Francisco, CA 94158
>