Interesting!
I wrote a tcsh shell script to do the same thing, but using the "reduced
cell" from the CCP4 program "tracer" instead of the given unit cell.
This "reduced cell" is generally the "bottom" P1 choice you get from
autoindexing, and using it as the comparison space takes care of all the
weirdnesses of R3 vs H3, permutations of axes, and possibly even oddball
space groups like A2, C21, F422, and others. I have placed a tarball of
my "check_for_cell.com" script (and some supporting files) here:
http://bl831.als.lbl.gov/~jamesh/pickup/reduced_cell_stuff.tgz
which, like James Murray's script, will require a little editing to get it
to run on your system (especially if you don't have tcsh!). Please note
that I provide this thing "as is", and it is not exactly "software". I
understand that Frank von Delft is working on converting this
reduced-cell idea into something more accessible, like a web page.
Frank? Any progress?
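
For anyone who wants to play with the idea without tcsh or tracer, here
is a minimal Python sketch of one way to get at a "reduced" description
of a lattice: brute-force search for the three shortest non-coplanar
lattice vectors. This is NOT what tracer does internally and not what my
script does; the function names and the "span" search range are made up
for illustration. It also assumes the input cell is already primitive,
so a centred cell (C, I, F, R/H) would have to be converted to a
primitive setting first, and a very skewed cell may need a larger span.

```python
import itertools
import math

def cell_to_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian cell vectors: a along x, b in the xy plane (one common convention)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    cx = c * cb
    cy = c * (ca - cb * cg) / sg
    cz = math.sqrt(max(c * c - cx * cx - cy * cy, 0.0))
    return [(a, 0.0, 0.0), (b * cg, b * sg, 0.0), (cx, cy, cz)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def shortest_basis_lengths(cell, span=3):
    """Lengths of the three shortest non-coplanar lattice vectors (hypothetical helper).

    Assumes `cell` is a primitive cell given as (a, b, c, alpha, beta, gamma).
    """
    va, vb, vc = cell_to_vectors(*cell)
    volume = abs(dot(cross(va, vb), vc))
    # Enumerate small integer combinations h*a + k*b + l*c, skipping the origin.
    candidates = []
    for h, k, l in itertools.product(range(-span, span + 1), repeat=3):
        if (h, k, l) != (0, 0, 0):
            candidates.append(tuple(h * va[i] + k * vb[i] + l * vc[i] for i in range(3)))
    candidates.sort(key=lambda v: dot(v, v))
    first = candidates[0]
    # Shortest vector that is not parallel to the first one.
    second = next(
        v for v in candidates
        if math.sqrt(dot(cross(first, v), cross(first, v)))
        > 1e-6 * math.sqrt(dot(first, first) * dot(v, v))
    )
    # Shortest vector that does not lie in the plane of the first two.
    third = next(v for v in candidates if abs(dot(cross(first, second), v)) > 1e-6 * volume)
    return [math.sqrt(dot(v, v)) for v in (first, second, third)]

# A skewed description of a 10 x 10 x 10 lattice (gamma = 45 degrees).
print(shortest_basis_lengths((10.0, 14.142136, 10.0, 90.0, 90.0, 45.0)))
```

The point is only that two very different-looking sets of six numbers
can describe the same lattice, which is why comparing reduced cells
works better than comparing the cells as deposited.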
The only annoying part of all this is calculating the reduced cell for
the whole PDB periodically. It would be nice if the PDB would do this
for us. It seems a shame to have a "unit cell" search feature at the RCSB
website that is not "aware" of lattice symmetry.
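
For the "whole PDB" part, the first step is just pulling the deposited
cell out of each entry. A minimal Python sketch, assuming the usual
fixed-column CRYST1 record of the PDB file format (a, b, c in columns
7-33, the three angles in columns 34-54, space group in columns 56-66).
The function names are hypothetical and the example record is made up,
not taken from a real entry.

```python
def parse_cryst1(line):
    """Return ((a, b, c, alpha, beta, gamma), space_group) from one CRYST1 record."""
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    a = float(line[6:15])
    b = float(line[15:24])
    c = float(line[24:33])
    alpha = float(line[33:40])
    beta = float(line[40:47])
    gamma = float(line[47:54])
    space_group = line[55:66].strip()
    return (a, b, c, alpha, beta, gamma), space_group

def first_cell(lines):
    """Scan the lines of a PDB file and return the first CRYST1 cell, or None."""
    for line in lines:
        if line.startswith("CRYST1"):
            return parse_cryst1(line)
    return None

# Made-up record, built with a format string so the columns line up.
example = "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f %-11s%4d" % (
    78.3, 78.3, 38.23, 90.0, 90.0, 90.0, "P 1", 1)
print(parse_cryst1(example))
```

Each cell would then be fed to whatever does the reduction (tracer, in
my case), and the results cached until the next update.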
It is interesting to note, however, that once you have reduced
everything to its "reduced cell" the space of possible unit cells is
pretty crowded. In fact, the median difference between a given PDB entry and
its "nearest reduced cell neighbor" (30% sequence identity cutoff) is
about 3 Angstroms (rms change in the real-space position of all three
cell vectors).
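
To make that distance measure concrete, here is a minimal Python sketch
of an rms shift between the cell vectors of two cells. It assumes both
cells are put into the same Cartesian convention (a along x, b in the xy
plane) and that the rms is taken over the three vectors; both choices
are assumptions for illustration, and the function names are made up.

```python
import math

def cell_to_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian cell vectors: a along x, b in the xy plane (one common convention)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    cx = c * cb
    cy = c * (ca - cb * cg) / sg
    cz = math.sqrt(max(c * c - cx * cx - cy * cy, 0.0))
    return [(a, 0.0, 0.0), (b * cg, b * sg, 0.0), (cx, cy, cz)]

def rms_cell_shift(cell1, cell2):
    """rms displacement (in Angstroms) between corresponding cell vectors."""
    total = 0.0
    for u, v in zip(cell_to_vectors(*cell1), cell_to_vectors(*cell2)):
        total += sum((x - y) ** 2 for x, y in zip(u, v))
    return math.sqrt(total / 3.0)

# Two cubic-looking cells that differ by 3 A along a only.
print(rms_cell_shift((10, 10, 10, 90, 90, 90), (13, 10, 10, 90, 90, 90)))
```

This only makes sense once both cells are in the same (reduced) setting;
applied to cells as deposited it would be fooled by axis permutations.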
-James Holton
MAD Scientist
Murray, James W wrote:
> Dear All,
>
>> Re the python script to compare a unit cell to those in the PDB.
>> Can you give instructions on how to run your script?
>
> Put the script and the gunzipped datafile somewhere in your PATH, and edit the script to point to the datafile.
> Run the script with the 6 cell dimensions on the command line. If your angles are all 90 degrees you can omit them.
>
> pdb_cell_scan.py 78.3 78.3 38.23 90 90 90
>
> The output is the top 10 closest pdb cells (on cell edge and angle), lower scores are closer.
>
>
> Test Cell 67.69 46.78 151.31 90.0 90.0 90.0
>
> PDB   score       a       b       c        alpha  beta   gamma
> ----------------------------------------------------------------
> 1tre  0.0         67.69   46.78   151.31   90.0   90.0   90.0
> 3cn7  103.779749  65.649  50.548  142.548  90.0   92.94  90.0
> 2d3w  124.666694  78.529  46.677  149.318  90.0   91.79  90.0
> 1gpl  179.2266    62.0    55.9    144.0    90.0   93.2   90.0
> 2d8o  218.984637  57.144  57.144  151.905  90.0   90.0   90.0
> 2d8p  219.224203  57.727  57.727  150.955  90.0   90.0   90.0
> 2vi2  219.704808  57.908  57.908  150.88   90.0   90.0   90.0
> 3e0a  219.957141  57.926  57.926  150.687  90.0   90.0   90.0
> 2oqn  220.041464  57.968  57.968  150.716  90.0   90.0   90.0
> 2g4y  220.3449    57.9    57.9    150.39   90.0   90.0   90.0
>
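
A note on the scores in this listing: they are consistent with a plain
sum of squared differences over the six cell parameters (Angstroms and
degrees mixed together, unweighted). I have not checked the script
itself, so treat that as an assumption. A minimal Python sketch with
made-up function names, using three rows from the listing:

```python
def cell_score(test, ref):
    """Sum of squared differences over (a, b, c, alpha, beta, gamma)."""
    return sum((t - r) ** 2 for t, r in zip(test, ref))

def closest_cells(test, cells, n=10):
    """Return the n entries of `cells` (id -> cell) with the lowest score."""
    scored = [(cell_score(test, cell), pdb_id) for pdb_id, cell in cells.items()]
    return sorted(scored)[:n]

test_cell = (67.69, 46.78, 151.31, 90.0, 90.0, 90.0)
cells = {
    "1tre": (67.69, 46.78, 151.31, 90.0, 90.0, 90.0),
    "3cn7": (65.649, 50.548, 142.548, 90.0, 92.94, 90.0),
    "1gpl": (62.0, 55.9, 144.0, 90.0, 93.2, 90.0),
}
for score, pdb_id in closest_cells(test_cell, cells):
    print(pdb_id, round(score, 6))
```

A score like this treats one degree the same as one Angstrom and knows
nothing about alternative settings, which is the gap the reduced-cell
comparison is meant to close.
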
>
>
> best wishes
>
> James
>
> --
> Dr. James W. Murray
> David Phillips Research Fellow
> Division on Molecular Biosciences
> Imperial College, LONDON
> Tel: +44 (0)20 759 48895
> ________________________________________
> From: Chris Ulens [[log in to unmask]]
> Sent: Friday, June 11, 2010 12:56 PM
> To: Murray, James W
> Subject: Re: [ccp4bb] Common protein crystallization contaminants
>
> Very useful, thanks!
> Can you give instructions on how to run your script?
>
> -Chris
>
> On Jun 11, 2010, at 1:03 PM, Murray, James W wrote:
>
>
> Dear All,
>
> Re: Common protein crystallization contaminants
>
> Thank you for all your responses. There is some literature on E. coli
> proteins, summarised below with other miscellaneous examples.
> Miscellaneous contaminants mentioned include aspartate carbamoyl
> transferase, triose phosphate isomerase from E. coli and ferritins
> from insect cells.
>
> I have written a short python script that will compare a unit cell
> with the cells in the PDB and return the 10 closest matches. (Script
> and gzipped datafile attached). This would have saved me hours of data
> collection and model-building on my last synchrotron trip. (NB.
> alternative cells and permutations of axes are not accounted for)
>
> best wishes
>
> James
>
> Summary of responses:
>
> http://www.ncbi.nlm.nih.gov/pubmed/16814929
> Biochim Biophys Acta. 2006 Sep;1760(9):1304-13.
> Structural analysis and classification of native proteins from E. coli
> commonly co-purified by immobilised metal affinity chromatography.
> Bolanos-Garcia VM, Davies OR.
>
> Ferric uptake regulator (Fur)
> Metal-binding lipocalin (YodA)
> Cu/Zn-superoxide dismutase (Cu/Zn-SODM)
> Acetylornithinase (ArgE)
> Glycogen synthase (GlgA)
> Carbonic anhydrase (YadF)
> Glucosamine-6-phosphate synthase (GlmS)
> cAMP-regulatory protein (CRP)
> Host factor-I protein (Hfq)
> Chloramphenicol-O-acetyl transferase (CAT)
> Peptidoylproline cis–trans isomerase (SlyD)
> Regulatory ribosomal protein (S15)
> Formyl transferase (YfbG)
> Glucose-6-phosphate 1-dehydrogenase (G6PD)
> GroEL/Hsp60
> Component 1 of the 2-oxoglutarate dehydrogenase complex (ODO1)
> Component E2 of the dihydrolipoamide succinyltransferase (ODO2)
>
> http://www.ncbi.nlm.nih.gov/pubmed/17554162
> Acta Crystallogr Sect F Struct Biol Cryst Commun. 2007 Jun 1;63(Pt 6):
> 457-61. Epub 2007 May 5.
> Purification, crystallization and structure determination of native
> GroEL from Escherichia coli lacking bound potassium ions.
> Kiser PD, Lodowski DT, Palczewski K.
>
> Carbonic anhydrase (1T75) http://www.rcsb.org/pdb/explore/explore.do?structureId=1T75
>
> Catabolite gene activator (cAMP receptor protein)
> http://www.ncbi.nlm.nih.gov/protein/P03020?report=genpept
>
> Polymyxin resistance protein PmrI
> http://www.ncbi.nlm.nih.gov/protein/6176575
>
> inorganic pyrophosphatase
> lac repressor
>
> More proteins are mentioned in this paper.
>
> http://www.ncbi.nlm.nih.gov/pubmed/19887109
> Protein Expr Purif. 2010 Apr;70(2):191-5. Epub 2009 Nov 1.
> Identification and characterization of native proteins of Escherichia
> coli BL-21 that display affinity towards Immobilized Metal Affinity
> Chromatography and Hydrophobic Interaction Chromatography Matrices.
> Tiwari N, Woods L, Haley R, Kight A, Goforth R, Clark K, Ataai M,
> Henry R, Beitle R.
>
>
> For membrane proteins purified from E. coli, AcrB can be a problem, as
> well as ferritins, Omp porins and succinate dehydrogenase.
>
> http://www.ncbi.nlm.nih.gov/pubmed/19162196
> J Struct Biol. 2009 Apr;166(1):107-11.
> AcrB et al.: Obstinate contaminants in a picogram scale. One more
> bottleneck in the membrane protein structure pipeline.
>
> http://www.ncbi.nlm.nih.gov/pubmed/18931428
> Acta Crystallogr Sect F Struct Biol Cryst Commun. 2008 Oct 1;64(Pt 10):
> 880-5.
> There is a baby in the bath water: AcrB contamination is a major
> problem in membrane-protein crystallization.
>
> http://www.ncbi.nlm.nih.gov/pubmed/19770503
> Acta Crystallogr D Biol Crystallogr. 2009 Oct;65(Pt 10):1062-73.
> Effects of impurities on membrane-protein crystallization in different
> systems.
> Kors CA, Wallace E, Davies DR, Li L, Laible PD, Nollert P.
>
> --
> Dr. James W. Murray
> David Phillips Research Fellow
> Division on Molecular Biosciences
> Imperial College, LONDON
> Tel: +44 (0)20 759 48895
> <pdb_cell_scan.py><pdb_cells.txt.gz>
>