Hi Ed,
I didn't really open a can of worms... Sorry if I did!
> But there are many things affecting R-factors not
> being reported. Do I have to deposit the bulk solvent mask?
Not that many, and most of them are reported (at least those that may
affect the R-factor "significantly"). Different programs may define the
model structure factor differently, but they normally report the
corresponding parameters. As used in phenix.refine, it is:
Fmodel = scale_overall * exp(-h*U_overall*ht) * (Fcalc + k_sol *
exp(-B_sol*s^2/4) * Fmask)
- The anisotropic scale matrix (U_overall) is reported as
"REMARK 3 B11 :", ...;
- ATOM and ANISOU records are enough to calculate Fcalc (the choice of
scattering table, it1992, wk1995 or n_gaussian, will not make a visible
difference, and normally neither will the choice of algorithm, FFT vs
DIRECT, used to compute Fcalc);
- The bulk solvent k_sol and B_sol are reported: "REMARK 3 K_SOL" or
similar;
- To reproduce Fmask you need at least "grid_step", "solvent radius" and
"shrink truncation radius", all of which phenix.refine reports:
REMARK 3 BULK SOLVENT MODELLING.
REMARK 3 METHOD USED : FLAT BULK SOLVENT MODEL
REMARK 3 SOLVENT RADIUS : 1.11
REMARK 3 SHRINKAGE RADIUS : 0.90
REMARK 3 K_SOL : 0.347
REMARK 3 B_SOL : 29.392
So I think you have all (or most) of what you need to reproduce the
R-factor statistics IF THE PDB FILE IS COMPLETE.
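
To make the formula above concrete, here is a minimal sketch of it for a
single reflection. It is only an illustration, not the phenix.refine
code: the function name f_model and its argument layout are made up for
this example, s^2 is assumed to be 1/d^2, and U_overall is assumed to be
given in units that make h*U_overall*ht dimensionless.

```python
import math

def f_model(f_calc, f_mask, hkl, s_sq, scale_overall, u_overall, k_sol, b_sol):
    """Return Fmodel for one reflection (hypothetical helper).

    f_calc, f_mask : complex structure factors from the atoms and the mask
    hkl            : Miller indices (h, k, l)
    s_sq           : s^2, assumed here to be 1/d^2
    u_overall      : 3x3 symmetric matrix, assumed to be in units that
                     make h*U_overall*ht dimensionless
    """
    # Quadratic form h * U_overall * ht
    huh = sum(hkl[i] * u_overall[i][j] * hkl[j]
              for i in range(3) for j in range(3))
    aniso_scale = math.exp(-huh)
    # Bulk solvent contribution: k_sol * exp(-B_sol * s^2 / 4) * Fmask
    solvent = k_sol * math.exp(-b_sol * s_sq / 4.0) * f_mask
    return scale_overall * aniso_scale * (f_calc + solvent)

# Example with the K_SOL and B_SOL values from the REMARK 3 records above;
# every other number here is invented for illustration.
no_aniso = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
f = f_model(f_calc=120.0 + 35.0j, f_mask=-40.0 + 5.0j, hkl=(1, 2, 3),
            s_sq=0.04, scale_overall=1.0, u_overall=no_aniso,
            k_sol=0.347, b_sol=29.392)
print(abs(f))
```

Everything except Fcalc and Fmask is a plain number taken from the
REMARK 3 records, which is why reporting them costs so little.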
> I think
> the fundamental question is why do you need to be able to reproduce the
> exact R-factors. Perhaps depositing FC and PHIC along with FP and SIGFP
> will solve your problem?
>
Given how little information and effort it takes, I don't see why not.
It's just a simple formula that relies on a number of simple parameters
that are easy to report, and in fact they are reported in the PDB file
in most cases. If your model has only a few atoms, you could do it with
a pen and a calculator -:) It's a great validation tool, and the
simplest one: if you are unable to reproduce the R-factors, then
something is wrong with 1) the file, 2) the structure or 3) your
software. However, if you explicitly allow the R-factors to be not
("exactly") reproducible, then you immediately lose the way to address
"1)-3)".
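
As a rough sketch of that check: recompute the R-factor from the
observed amplitudes and the model structure factors, then compare it
with the reported value. The helper names and the 0.001 tolerance below
are my own choices for illustration, not anything a program or the PDB
prescribes, and the model structure factors are assumed to be already
on the scale of the observations.

```python
def r_factor(f_obs, f_model):
    """R = sum| |Fobs| - |Fmodel| | / sum|Fobs| over the given reflections."""
    numerator = sum(abs(abs(fo) - abs(fm)) for fo, fm in zip(f_obs, f_model))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def reproduces(reported_r, recomputed_r, tolerance=0.001):
    """True if the recomputed R-factor matches the reported one.

    The tolerance is a hypothetical choice for this example.
    """
    return abs(reported_r - recomputed_r) <= tolerance

# Invented numbers, for illustration only.
f_obs = [10.0, 20.0, 30.0]
f_mod = [9.0 + 0.0j, 22.0 + 0.0j, 30.0 + 0.0j]
r = r_factor(f_obs, f_mod)
print(r, reproduces(reported_r=0.05, recomputed_r=r))
```

If reproduces() comes back False, that is the cue to start looking at
the file, the structure and the software.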
> Don't get me wrong - I think it's important that deposited structures
> provide complete information about the model. But why are riding
> hydrogens so particularly important when reporting crystallization
> conditions is not mandatory? Or bulk solvent parameters? Or geometry
> restraints you used for the custom ligand (thus it's not in the standard
> libraries)?
>
Absolutely agree! So, let's make a tiny step forward and keep whatever
we can easily keep, and hope that in the future more information will be
preserved (such as a complete footprint of the restraints used and so on).
> I did a quick (and dirty) survey of the PDB and found that less than 2%
> of structures report hydrogens.
Well, years ago people used to cut low-resolution data at 5...8A until
it became clear that this is not such a good idea. Nowadays, I don't
think anyone would throw away low-resolution data. The rationale for
hydrogens is similar. Let's follow the progress and not look back
(telling myself) -:)
All the best!
Pavel.