
CCP4BB Archives


CCP4BB@JISCMAIL.AC.UK



Subject:

Re: what to do with disordered side chains

From:

Dale Tronrud <[log in to unmask]>

Reply-To:

Dale Tronrud <[log in to unmask]>

Date:

Tue, 5 Apr 2011 00:04:53 -0700


On 4/4/2011 2:15 PM, Jacob Keller wrote:
> I like your IMGATM proposal, but wouldn't it also potentially break
> some of the programs?

    That depends on the program.  Programs I write that read PDB files
silently ignore keywords that they don't recognize.  A model with
IMGATM (or whatever keyword you standardize on) records would be
interpreted as though those dummy atoms don't exist.  If a program
died because of them, or if the PDB consumer wanted to "see" the
dummy atoms, the keywords could be replaced with ATOM using a text
editor and a global substitute, and the user would be aware that
there is something different about those atoms.
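
    The reading behavior described above can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions: IMGATM
is the proposed keyword, not part of the current PDB format, and the
function names and the example coordinates are hypothetical.

```python
# Sketch only: IMGATM is a proposed record name, not an existing PDB keyword.
KNOWN_RECORDS = {"ATOM", "HETATM"}


def read_atoms(lines):
    """Keep ATOM/HETATM records and silently skip any other keyword."""
    atoms = []
    for line in lines:
        record = line[:6].strip()  # the record name occupies columns 1-6
        if record in KNOWN_RECORDS:
            atoms.append(line.rstrip("\n"))
    return atoms


def reveal_dummy_atoms(lines):
    """Global substitute: rewrite IMGATM records as ATOM records."""
    return [
        "ATOM  " + line[6:] if line[:6] == "IMGATM" else line
        for line in lines
    ]


# Hypothetical fragment: a lysine whose CG was not seen in density.
example = [
    "REMARK hypothetical fragment",
    "ATOM      1  CA  LYS A  10      10.000  10.000  10.000  1.00 20.00",
    "ATOM      2  CB  LYS A  10      11.000  10.500  10.200  1.00 25.00",
    "IMGATM    3  CG  LYS A  10      12.000  11.000  10.400  1.00 25.00",
]

real_only = read_atoms(example)                      # dummy atom ignored
with_dummy = read_atoms(reveal_dummy_atoms(example)) # dummy atom now visible
```

A reader written this way never sees the dummy atom unless the user
deliberately runs the substitution first.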

    I would hope programs would be modified to do sensible things
with the dummy atoms since they would have a clear indication that
the atoms are indeed dummy.  For a graphics program, maybe the bonds
involving dummy atoms could be drawn at half brightness.  They would
be visible but clearly more ghost-like than the majority
of atoms in the model.  A refinement program could strip them out,
perform the refinement, and rebuild them at the end, if needed,
using WASNIAHC.  I expect they would also be ignored completely in
MR and homology modeling/comparison programs.  In fact, pretty much
any use I would make of the PDB file would involve discarding all
the dummy atoms, but with this scheme I could at least know for
sure which atoms are fantasy and which were built based on density.

>Also--and this is a problem with deleting only
> sidechain atoms in general--it seems that many, myself included, might
> totally miss that an apparent "alanine" is really a trunco-lysine.
> What I like is that it does get around the problem of people
> over-interpreting bogus sidechains, but it falls short, perhaps, in
> misleading people about what residue is there. I, for one, would not
> feel that I had to click on all the alanines in a model to verify that
> they were not lysines, and would be surprised and puzzled for a while
> about why this ala said lys when I clicked on it. Wouldn't you be
> surprised? (Well, maybe not after this thread...)

    I am surprised any time I see all the atoms in a lysine on the surface.
"What could possibly be holding that thing in place?" is what jumps to my
mind.  When I see a side chain on the surface that ends at CB or CG I
just assume it is something long and waving in the breeze.  I guess it
all depends on what you are used to looking at.

    With dummy atoms that are clearly labeled as such then the graphics
programs can be programmed as I described above and we both would have
the visual cues that we desire.

    Another advantage of keeping the "dummy flag" separate from the occupancy
and B factor fields is that these are then free to be used in the way
they were intended.  Numerous times I have built side chains that are
visible to their end, but a second conformation ends at the CG.  I split
these side chains into A and B parts with a complete A and a partial B and
the group occupancies of A and B sum to 1.0.  Now if you tell me that
I have to build the entire B side chain and must flag the dummy atoms
with occ=0.0 we have a problem.  For the dummy atoms the occupancies don't
sum to 1.0 any more.  Logic tells me that the occupancy of the dummy atoms
should be the same as all the real B atoms.
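
    The occupancy bookkeeping above can be checked mechanically.  The
following Python sketch is an assumption-laden illustration, not an
existing tool: the helper names, the residue, and the occupancy values
(0.6 and 0.4) are hypothetical, and the column positions follow the
fixed-width ATOM record layout.

```python
from collections import defaultdict


def atom_line(serial, name, altloc, occ):
    """Build a fixed-width ATOM record for a hypothetical LYS A 10."""
    return (f"ATOM  {serial:5d}  {name:<3}{altloc}LYS A  10    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{occ:6.2f}{20.0:6.2f}")


def group_occupancies(lines):
    """Map (chain, residue number) -> {altloc: set of occupancies seen}."""
    groups = defaultdict(lambda: defaultdict(set))
    for line in lines:
        if line[:6] not in ("ATOM  ", "HETATM"):
            continue
        altloc = line[16]                      # column 17
        if altloc == " ":
            continue                           # atom is not split
        residue = (line[21], line[22:26].strip())
        groups[residue][altloc].add(float(line[54:60]))  # columns 55-60
    return groups


def is_consistent(conformers, tol=0.01):
    """One occupancy per conformer, and the conformers sum to 1.0."""
    if any(len(occs) != 1 for occs in conformers.values()):
        return False
    total = sum(next(iter(occs)) for occs in conformers.values())
    return abs(total - 1.0) <= tol


# Complete A conformation, B conformation that stops at CG.
partial_b = [
    atom_line(1, "CB", "A", 0.6), atom_line(2, "CB", "B", 0.4),
    atom_line(3, "CG", "A", 0.6), atom_line(4, "CG", "B", 0.4),
    atom_line(5, "CD", "A", 0.6),
]
# Same model with the missing B atom forced in at occ=0.0.
padded_b = partial_b + [atom_line(6, "CD", "B", 0.0)]

ok_partial = is_consistent(group_occupancies(partial_b)[("A", "10")])
ok_padded = is_consistent(group_occupancies(padded_b)[("A", "10")])
```

In this sketch the model with the truncated B conformation passes the
check, while the padded one fails because conformer B now carries two
different occupancies.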

    This particular case is a good example of why I don't like the idea
of building complete side chains in the absence of density.  If you are
going to build out my B conformation you have to recognize that the reason
I don't see density beyond the CG is that there is a B and C conformation
for the next CD atom (remember I already have an A conformation for CD
elsewhere).  To make a logically complete side chain I need to build
two dummy conformations for this residue and split my "real" CG, CB, and
CA B conformation atoms with no way to decide the relative occupancies of
the B and C conformations.  That's a lot of complexity for a blurry bit of
density.  Hell, I have every reason to expect that there is a D conformation
in there too - do I have to build that as well?

    If you expect such a shrub to be built for every surface lysine the
IMGATM keyword and the program WASNIAHC would allow it to be generated
and represented in an unambiguous and minimally confusing fashion.  I
wouldn't be happy having to add imaginary atoms to my models, but the
representation meets my criteria, and I think it meets yours too.

Dale Tronrud

>
> JPK
>
>
>
> On Mon, Apr 4, 2011 at 1:55 AM, Dale Tronrud<[log in to unmask]>  wrote:
>>    The definition of _atom_site.occupancy is
>>
>>   The fraction of the atom type present at this site.
>>   The sum of the occupancies of all the atom types at this site
>>   may not significantly exceed 1.0 unless it is a dummy site.
>>
>> When an atom has an occupancy equal to zero that means that the
>> atom is NEVER present at that site - and that is not what you
>> intend to say.  Setting the occupancy to zero does not mean that
>> a full atom is located somewhere in this area.  Quite the opposite.
>>
>>    (The reference to a dummy site is interesting and implies to
>> me that mmCIF already has the mechanism you wish for.)
>>
>>    Having some experience with refining low occupancy atoms and
>> working with dummy marker atoms I'm quite confident that you can
>> never define a B factor cutoff that would work.  No matter what
>> value you choose you will find some atoms in density that refine
>> to values greater than the cutoff, or the limit you choose is so
>> high that you will find marker atoms that refine to less than the
>> limit.  A B factor cutoff cannot work - no matter the value you
>> choose you will always be plagued with false positives or false
>> negatives.
>>
>>    If you really want to stuff this bit into one of these fields
>> you have to go all out.  Set the occupancy of a marker atom to -99.99.
>> This will unambiguously mark the atom as an imaginary one.  This
>> will, of course, break every program that reads PDB format files,
>> but that is what should happen in any case.  If you change the
>> definition of the columns in the file you must mandate that all
>> programs be upgraded to recognize the new definitions.  I don't
>> know how you can do that other than ensuring that the change will
>> cause programs to cough.  To try to slide it by with a magic value
>> that will be silently accepted by existing programs is to beg for
>> bugs and subtle side-effects.
>>
>>    Good luck getting the maintainers of the mmCIF standard to accept
>> a magic value in either of these fields.
>>
>>    How about this: We already have the keywords ATOM and HETATM
>> (and don't ask me why we have two).  How about we create a new
>> record in the PDB format, say IMGATM, that would have all the
>> fields of an ATOM record but would be recognized as whatever the
>> marker is for "dummy" atoms in the current mmCIF?  Existing programs
>> would completely ignore these atoms, as they should until they are
>> modified to do something reasonable with them.  Those of us who
>> have no use for them can either use a switch in the program to
>> ignore them or just grep them out of the file.  Someone could write
>> a program that would take a model with only ATOM and HETATM records
>> and fill out all the desired IMGATM records (Let's call that program
>> WASNIAHC, everyone would remember that!).
>>
>>    This solution is unambiguous.  It can be represented in current
>> mmCIF, I think.  The PDB could run WASNIAHC themselves after deposition
>> but before acceptance by the depositor so people like me would not
>> have to deal with them during refinement but would be able to see
>> them before our precious works of art are unleashed on the world.
>>
>>    Seems like a win-win solution to me.
>>
>> Dale Tronrud
>>
>>
>> On 4/3/2011 9:17 PM, Jacob Keller wrote:
>>>
>>> Well, what about getting the default settings on the major molecular
>>> viewers to hide atoms with either occ=0 or b>cutoff ("novice mode?")?
>>> While the b cutoff would still be tricky, I assume we could eventually
>>> come to consensus on some reasonable cutoff (2 sigma from the mean?),
>>> and then this approach would allow each free-spirited crystallographer
>>> to keep his own preferred method of dealing with these troublesome
>>> sidechains and nary a novice would be led astray....
>>>
>>> JPK
>>>
>>> On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett<[log in to unmask]>    wrote:
>>>>
>>>> Most non-structural users are familiar with the sequence of the proteins
>>>> they are studying, and most software does at least display residue identity
>>>> if you select an atom in a residue, so usually it is not necessary to do any
>>>> cross checking besides selecting an atom in the residue and seeing what its
>>>> residue name is.  The chance of somebody misinterpreting a truncated Lys as
>>>> Ala is, in my experience, much much lower than the chance they will trust
>>>> the xyz coordinates of atoms with zero occupancy or high B factors.
>>>>
>>>> What worries me the most is somebody designing a whole biological
>>>> experiment around an over-interpretation of details that are implied by xyz
>>>> coordinates of atoms, even if those atoms were not resolved in the maps.
>>>>   When this sort of error occurs it is a level of pain and wasted effort that
>>>> makes the "pain" associated with having to build back in missing side chains
>>>> look completely trivial.
>>>>
>>>> As long as the PDB file format is the way users get structural data,
>>>> there is really no good way to communicate "atom exists with no reliable
>>>> coordinates" to the user, given the diversity of software packages out there
>>>> for reading PDB files and the historical lack of any standard way of dealing
>>>> with this issue.  Even if the file format is hacked there is no way to force
>>>> all the existing software out there to understand the hack.  A file format
>>>> that isn't designed with this sort of feature from day one is not going to
>>>> be fixable as a practical matter after so much legacy code has accumulated.
>>>>
>>>> -Eric
>>>>
>>>>
>>>>
>>>> On Apr 3, 2011, at 2:20 PM, Jacob Keller wrote:
>>>>
>>>>> To the delete-the-atom-nik's: do you propose deleting the whole
>>>>> residue or just the side chain? I can understand deleting the whole
>>>>> residue, but deleting only the side chain seems to me to be placing a
>>>>> stumbling block also, and even possibly confusing for an experienced
>>>>> crystallographer: the .pdb says "lys" but it looks like an ala? Which
>>>>> is it? I could imagine a lot of frustration-hours arising from this
>>>>> practice, with people cross-checking sequences, looking in the methods
>>>>> sections for mutations...
>>>>>
>>>>> JPK
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>
