Last year I collected a dataset at 6 A from a 2600 aa multi-domain protein.
Resolution range (A): 50 - 6 (6.7 - 6)
Wilson B: 370 A^2
Reflections: 34120 (12240)
Rmeas: 0.2 (4.4)
I/sigI: 9 (1)
CC1/2: 0.9 (0.5)
I have the following questions in my mind.
1. Does it make any sense to solve the structure at _this_ resolution? It is not a completely novel protein; there are known structures with about 54 % identity to it, and the fold is known to be the same.
2. Molecular replacement with Phaser using an EM model gives me a unique solution, and I can see a reasonable electron density map. I tried again with AMoRe, and the amore-build-output model gives the same solution. In both cases the solution is unique. There is no so-called translational symmetry.
3. If I do a Refmac restrained refinement, I get R/Rfree of around 30.1/35.5 %, but the stereochemistry is very poor (18 % outliers), even though I had to set a tight WEIGHT MATRIX (1e-7). Does it make any sense to do a restrained refinement at this resolution?
4. If I do only a rigid-body refinement with Refmac, R-factor/Rfree are both at around 41 %, and in many places the model does not fit the density. When I manually correct these regions and refine again, there is basically no change in R-factor/Rfree (in some cases it even gets worse).
>From: Bernhard Rupp (Hofkristallrat a.D.) <[log in to unmask]>
>Date: 27 April 2015 at 21:54
>Subject: Re: [ccp4bb] 3BDN, 16.5% Ramachandran Outliers!!!!!
>To: [log in to unmask]
>What we cannot tell sans supporting density is whether it is a more accurate model, although I have
>rarely seen an improvement in geometry giving worse density fit. Usually a mess remains a mess -
>there is (at this resolution) no free lunch. The key question is again – does the model justify
5. I also tried DEN, reference-model restraints, ProSMART (Refmac), etc., none of which gave any improvement: R/Rfree stays stuck at 41 %.
6. Since my group is an EM group, I wonder: when EM maps at 6 A are published, why are X-ray data at the same resolution not published? What happens to these datasets?
7. Can I _just_ do a molecular replacement, mutate the residues based on a sequence alignment (there are a large number of deletions, so the sequence register is different/unknown) and deposit the result as a model in the PDB? Should I include the side chains, or is that meaningless at this resolution? Why is the EM field allowed to deposit such coordinates with side chains?
8. As [log in to unmask] points out
>Particularly in Molecular Replacement structures, and here particularly in those with multi-segment/domain models, there are almost always parts that fit well and others that fit poorly - with simply not enough data at the given resolution to improve the poor parts sans additional phase information. Bias issues have been discussed and need not be iterated here.
My protein also has multiple catalytic domains, some of them better resolved and others terrible. What about bias here at 6 A resolution? It must be a very large problem indeed.
Apologies for the long email, and any suggestion will be gratefully received.
---------- Forwarded message ----------
From: Bernhard Rupp (Hofkristallrat a.D.) <[log in to unmask]>
Date: 27 April 2015 at 21:54
Subject: Re: [ccp4bb] 3BDN, 16.5% Ramachandran Outliers!!!!!
To: [log in to unmask]
I’d be very careful at judging low resolution structures. This is a tricky business requiring a lot more info than just the PDB validation report. The 3+ to 4 A resolution range is a particularly deceptive one: The crystallographer does not have much data given the model parameters (perhaps consulting his figure showing determinacy for coordinate refinement might help)
http://www.ruppweb.org/Garland/gallery/Ch12/pages/Biomolecular_Crystallography_Fig_12-11.htm

At this resolution one has about enough data to keep enthusiasm up but at the same time it is not quite yet bad enough to throw up the hands and admit that one is de facto modelling with a few X-ray restraints (i.e. data), requiring correspondingly suitable refinement protocols (and discipline, aka mental restraints in addition to stereochemical restraints). One is easily spoiled by looking at exceptional 2 A structures of huge complexes, but nature (I do not mean the journal but at the same time would not exclude it) is often cruel.

Particularly in Molecular Replacement structures, and here particularly in those with multi-segment/domain models, there are almost always parts that fit well and others that fit poorly - with simply not enough data at the given resolution to improve the poor parts sans additional phase information. Bias issues have been discussed and need not be iterated here.

Pavel is correct in pointing out that a model with better geometry is also a more plausible model. What we cannot tell sans supporting density is whether it is a more accurate model, although I have rarely seen an improvement in geometry giving worse density fit. Usually a mess remains a mess - there is (at this resolution) no free lunch. The key question is again – does the model justify the specific conclusions drawn from it? If a poor model is better than no model at all, be it, as long as this is recognized and not used as an excuse for careless work. Facile dictu, difficile factu.
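
To put rough numbers on the "not much data given the model parameters" point for the 6 A dataset in the question, here is a back-of-the-envelope sketch of the observation-to-parameter ratio. Only the reflection count (34120) and the chain length (2600 aa) come from the post. Everything else is an assumption made for illustration: that the reflection count refers to unique reflections, about 8 non-hydrogen atoms per residue, 4 parameters per atom (x, y, z, isotropic B), one copy in the asymmetric unit, and a hypothetical 10 rigid bodies for the rigid-body case.

```python
def atomic_obs_to_param_ratio(n_reflections, n_residues,
                              atoms_per_residue=8.0,  # assumed average, non-H atoms
                              params_per_atom=4,      # assumed: x, y, z, isotropic B
                              copies_in_asu=1):       # assumed: not stated in the post
    """Reflections per refined parameter for unrestrained individual-atom refinement."""
    n_params = n_residues * atoms_per_residue * params_per_atom * copies_in_asu
    return n_reflections / n_params


def rigid_body_obs_to_param_ratio(n_reflections, n_bodies):
    """Reflections per parameter when each rigid body has 3 rotations + 3 translations."""
    return n_reflections / (6 * n_bodies)


# Values from the post: 34120 reflections, 2600 residues.
atomic = atomic_obs_to_param_ratio(34120, 2600)
# Hypothetical split into 10 rigid domains (the post does not give a domain count).
rigid = rigid_body_obs_to_param_ratio(34120, n_bodies=10)

print(f"individual atoms: {atomic:.2f} reflections per parameter")
print(f"10 rigid bodies:  {rigid:.0f} reflections per parameter")
```

Under these assumptions the individual-atom case comes out well below one reflection per parameter, which is consistent with the remark above that one is de facto modelling with a few X-ray restraints, while a rigid-body parameterisation is heavily overdetermined. Stereochemical and reference-model restraints change the effective count, so this is an order-of-magnitude guide only.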