CCP4BB Archives

CCP4BB@JISCMAIL.AC.UK


Subject: Re: Non-sequential residue numbering?
From: "Herbert J. Bernstein" <[log in to unmask]>
Reply-To: Herbert J. Bernstein
Date: Fri, 19 Sep 2008 15:02:42 -0400


I would suggest depositors take a look at the PDB Exchange
Dictionary and at the following definitions:

_atom_site.auth_seq_id
                An alternative identifier for _atom_site.label_seq_id that
                may be provided by an author in order to match the
                identification used in the publication that describes the
                structure.

                Note that this is not necessarily a number, that the values do
                not have to be positive, and that the value does not have to
                correspond to the value of _atom_site.label_seq_id. The value
                of _atom_site.label_seq_id is required to be a sequential list
                of positive integers.

                The author may assign values to _atom_site.auth_seq_id in any
                desired way. For instance, the values may be used to relate
                this structure to a numbering scheme in a homologous structure,
                including sequence gaps or insertion codes. Alternatively, a
                scheme may be used for a truncated polymer that maintains the
                numbering scheme of the full length polymer. In all cases, the
                scheme used here must match the scheme used in the publication
                that describes the structure.

_atom_site.label_seq_id
                This data item is a pointer to _entity_poly_seq.num in the
                ENTITY_POLY_SEQ category.

_entity_poly_seq.num
                The value of _entity_poly_seq.num must uniquely and sequentially
                identify a record in the ENTITY_POLY_SEQ list.

                Note that this item must be a number and that the sequence
                numbers must progress in increasing numerical order.
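
As a rough illustration of the difference between the two identifiers, here is a minimal Python sketch. The function name label_ids_valid is hypothetical and not part of any PDB software; the check assumes that "sequential" means positive integers in strictly increasing order. The author numbering is the fusion described further down this thread (754-763 followed by 1-12).

```python
from typing import List

def label_ids_valid(label_ids: List[int]) -> bool:
    """Check the constraint quoted above for _entity_poly_seq.num and
    _atom_site.label_seq_id: positive integers in strictly increasing order.
    (Hypothetical helper, for illustration only.)"""
    positive = all(n > 0 for n in label_ids)
    increasing = all(a < b for a, b in zip(label_ids, label_ids[1:]))
    return positive and increasing

# Author numbering for the fusion described in this thread:
# residues 754-763 of one protein followed by residues 1-12 of another.
# auth_seq_id values are kept as strings because they need not be numbers.
auth_seq_ids = [str(n) for n in range(754, 764)] + [str(n) for n in range(1, 13)]

# Sequential numbering over the same 22 residues.
label_seq_ids = list(range(1, len(auth_seq_ids) + 1))

print(label_ids_valid(label_seq_ids))                   # the sequential scheme
print(label_ids_valid([int(a) for a in auth_seq_ids]))  # the author scheme read as numbers
```

The first check passes and the second fails, which is the point: the author scheme is legitimate as _atom_site.auth_seq_id but could not serve as _atom_site.label_seq_id.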

So, at the very least, the PDB's internal database and mmCIF and PDBML
files should be able to handle _both_ the simplified numbering the
annotator wishes to impose, and the more scientifically useful notation
an author might use to place their structure in context.  It should be
a "simple" matter of programming for the PDB to produce "PDB" entries done
either way.
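
To make the "either way" point concrete, here is a small sketch, assuming each residue is stored once with both identifiers. The rows, residue names, and function names are hypothetical; the 95P/96P/1/2 labels only echo the kind of domain numbering mentioned later in this thread and are not taken from any real entry.

```python
from typing import Dict, List, Tuple

# Hypothetical stored rows: (label_seq_id, auth_seq_id, residue name).
Row = Tuple[int, str, str]
ROWS: List[Row] = [
    (1, "95P", "GLY"),
    (2, "96P", "SER"),
    (3, "1", "ALA"),
    (4, "2", "PRO"),
]

def numbering(rows: List[Row], scheme: str) -> List[str]:
    """Return the residue identifiers to write out under the chosen scheme."""
    if scheme == "label":
        return [str(label) for label, _, _ in rows]
    if scheme == "auth":
        return [auth for _, auth, _ in rows]
    raise ValueError(f"unknown scheme: {scheme!r}")

def translation_table(rows: List[Row]) -> Dict[str, int]:
    """Map each author identifier to its sequential counterpart."""
    return {auth: label for label, auth, _ in rows}

print(numbering(ROWS, "label"))
print(numbering(ROWS, "auth"))
print(translation_table(ROWS))
```

Because both columns live in the same row, neither numbering has to be reconstructed by hand, and a translation table falls out of the stored data for free.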

One should also note that the entire system of insertion codes does not
make much sense without the broader contextual view of families of
structures.
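
For readers less familiar with insertion codes, the following sketch shows how a residue identifier in a PDB-format ATOM record is the pair (residue number, insertion code) within a chain, and how the uniqueness rule quoted further down this thread could be checked. The column positions follow the PDB ATOM record layout (chain ID in column 22, residue number in columns 23-26, insertion code in column 27); all function names and the sample residues are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

ResidueId = Tuple[str, int, str]  # (chain ID, residue number, insertion code)

def atom_line(serial: int, name: str, resname: str, chain: str,
              resseq: int, icode: str = " ") -> str:
    """Build a minimal fixed-column ATOM line for the demo (coordinates are dummies)."""
    return (f"ATOM  {serial:5d} {name:<4} {resname:>3} {chain}{resseq:4d}{icode}"
            f"   {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")

def residue_key(line: str) -> ResidueId:
    """Extract chain ID, residue number and insertion code from an ATOM line."""
    return (line[21], int(line[22:26]), line[26].strip())

def residue_ids(lines: List[str]) -> List[ResidueId]:
    """Collapse consecutive atoms of the same residue into one identifier."""
    ids: List[ResidueId] = []
    for line in lines:
        key = residue_key(line)
        if not ids or ids[-1] != key:
            ids.append(key)
    return ids

def duplicated(ids: List[ResidueId]) -> List[ResidueId]:
    """Identifiers used for more than one residue (should be empty)."""
    return [key for key, count in Counter(ids).items() if count > 1]

lines = [
    atom_line(1, "N", "GLY", "A", -1),       # negative number at the N-terminus
    atom_line(2, "CA", "GLY", "A", -1),
    atom_line(3, "N", "SER", "A", 52),
    atom_line(4, "N", "THR", "A", 52, "A"),  # insertion relative to a reference sequence
    atom_line(5, "N", "ALA", "A", 53),
]
ids = residue_ids(lines)
print(ids)
print(duplicated(ids))
```

Residue 52A is distinct from residue 52 only because of the insertion code, and that code carries meaning only relative to the reference sequence it was inserted into.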

Regards,
   Herbert


At 2:33 PM -0400 9/19/08, Frances C. Bernstein wrote:
>I was at the PDB from 1974 - 1998 and closely involved with
>processing entries 15 to ~9000.  We also designed the "PDB
>format".  My replies were based on what was done for those 24
>years and I cannot address what is currently being done at the PDB.
>
>I do not know if the current PDB staff follows this bulletin
>board and I can only suggest that you take this matter up
>with the current PDB management, the community, and the PDB
>advisory board.
>
>                              Frances
>
>=====================================================
>****                Bernstein + Sons
>*   *       Information Systems Consultants
>****    5 Brewster Lane, Bellport, NY 11713-2803
>*   * ***
>**** *            Frances C. Bernstein
>   *   ***      [log in to unmask]
>  ***     *
>   *   *** 1-631-286-1339    FAX: 1-631-286-1999
>=====================================================
>
>On Fri, 19 Sep 2008, Linda Brinen wrote:
>
>>I'm actually pleased to read your response and interpretation of 
>>what is allowable and why, Frances. However, it's in pretty stark
>>contrast to what I was told about 18 months ago when I struggled
>>(and eventually lost) to preserve a numbering scheme that had a
>>long-standing historical and literature precedent when submitting
>>a new structure to the PDB.
>>
>>This was a two-domain protein; the first domain - according to 
>>historical numbering - had a number plus a letter code to indicate 
>>the domain; the second domain, which started again with the number 
>>1 - had no letter code. We were told that that was not allowed. We 
>>wanted to preserve insertions and deletions as well, but were also 
>>strongly discouraged, if not flat out told we could not. While it's 
>>not usually prudent to quote offline e-mail exchanges, I'm going to 
>>snip pertinent pieces of the discussion (I'm leaving the original 
>>spelling errors and text bolding in place)  with no indication of 
>>the annotator who wrote these guidelines to our group.  Here's part 
>>of one of the many 'exchanges' that was had:
>>
>>"I understand your point and that certain close research 
>>communities have certain habits and traditions but the PDB serves 
>>to the whole community of structural biology, bioinformatics, to 
>>many educators, students... In all these cases, the simplest 
>>possible numbering of sequences, ideally numbering identical to the 
>>numbering used by the UNP sequence database, is far the most useful 
>>because easiest to understand.  I do not say this because it is in 
>>our manuals and help pages but because I have eight years of 
>>experience with annotation of all kinds of structures. I would 
>>therefore very much like to ask you to reconsider the way how you 
>>number your protein, your numbering schema is *interpretation* more 
>>than a mere labeling schema. Needles to say, no sequence numbering 
>>can satisfy this ambition...from my point of view, especially the 
>>jump from 96P back to 1 will cause a lot of confusion and 
>>misunderstanding....look at the problem from a standpoint of a 
>>general naturalist instead of an narrow protease community"
>>
>>
>>This left us with a mandated 'start from 1 and number sequentially' 
>>format that did exactly the opposite of what you, Frances, 
>>correctly mention as important in any numbering scheme: preserve 
>>relationships with other proteins.  We've had to resort to 
>>providing 'translation tables' that identify what people were 
>>expecting to see as numbers for active site residues which now have 
>>new and non-sensical numbering.   Is it the end of the world? Of 
>>course not. But neither is it necessarily the best scientific or 
>>logical presentation.
>>
>>At the risk of inciting a rather....animated...dialogue on this 
>>topic, what has your experience been with this kind of thing (i.e., 
>>were we just unlucky??) and do current practices make sense and 
>>serve the community??
>>
>>-Linda
>>
>>
>>Frances C. Bernstein wrote:
>>>All entries list atoms starting at the N-terminus (or 5') so
>>>connectivity goes in the order of the atoms in the file -
>>>obviously with the possibility of unconnected portions
>>>where the density is inadequate.
>>>
>>>The entire philosophy of allowing numbering other than 1 - N
>>>had to do with preserving relationships with other proteins.
>>>The most common use relates to having an initial sequence 1 - N
>>>and then a similar sequence from another species with insertions
>>>and/or gaps.  People wanted to be able to talk about the active
>>>site (which was preserved) using the same residue numbers.
>>>Negative numbers came up with additions at the N-terminus.
>>>Offhand, I don't recall why descending numbers were used but
>>>I believe that there is at least one such entry.
>>>
>>>                        Frances
>>>
>>>On Fri, 19 Sep 2008, Ian Tickle wrote:
>>>
>>>>
>>>>But what connectivity would be implied by descending numbers: the order
>>>>in the file or the order of the numbering?  I assume the former,
>>>>otherwise what would be the point of having descending numbering?  And I
>>>>wonder how many programs would baulk at it (or even at ascending
>>>>negative numbers?).
>>>>
>>>>-- Ian
>>>>
>>>>>-----Original Message-----
>>>>>From: [log in to unmask] [mailto:[log in to unmask]] On Behalf Of Frances C. Bernstein
>>>>>Sent: 19 September 2008 16:44
>>>>>To: Todd Geders
>>>>>Cc: [log in to unmask]
>>>>>Subject: Re: [ccp4bb] Non-sequential residue numbering?
>>>>>
>>>>>As long as each residue within a chain has a unique identifier
>>>>>(residue number plus insertion code), there is no restriction
>>>>>on numbering.  The numbers can be in ascending or descending
>>>>>order, non-sequential, and even negative.
>>>>>
>>>>>                         Frances
>>>>>
>>>>>
>>>>>On Fri, 19 Sep 2008, Todd Geders wrote:
>>>>>
>>>>>>Hello all,
>>>>>>
>>>>>>I have a structure from a non-natural fusion of the truncated C-terminus
>>>>>>of one protein with the truncated N-terminus of another.  For the
>>>>>>deposition, we want to keep the numbering as found in the separate
>>>>>>proteins.  It looks something like this:
>>>>>>
>>>>>>              1         12
>>>>>>              |          |
>>>>>>....HWVCKDIALLMCFFLEEMSEEP....
>>>>>>    |        |
>>>>>>   754      763
>>>>>>
>>>>>>At no point is there an overlap in numbering (i.e. the N-terminal
>>>>>>residue number is higher than the C-terminal residue number).
>>>>>>
>>>>>>Is this numbering scheme supported by the PDB standard?  Thus far, all
>>>>>>of the software seems to handle it (refmac, Coot, PyMOL, pdb_extract,
>>>>>>PDB precheck & validation, etc).
>>>>>>
>>>>>>Can anyone see a reason to not deposit with this non-sequential
>>>>>>residue numbering?
>>>>>>
>>>>>>~Todd
>>>>
>>>>
>>>>
>>>>
>>>>
>>
>>
>>--
>>Linda S. Brinen
>>Adjunct Assistant Professor
>>Dept of Cellular & Molecular Pharmacology and
>>The Sandler Center for Basic Research in Parasitic Diseases
>>Phone: 415-514-3426 FAX: 415-502-8193
>>E-mail: [log in to unmask]
>>QB3/Byers Hall 508C
>>1700 4th Street
>>University of California
>>San Francisco, CA 94158-2550
>>USPS:
>>UCSF MC 2550
>>Byers Hall Room 508
>>1700 4th Street
>>San Francisco, CA 94158


-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  [log in to unmask]
=====================================================
