Hi Ed,

I didn't really open a can of worms... Sorry if I did!

> But there are many things affecting R-factors not
> being reported.  Do I have to deposit the bulk solvent mask?  

Not that many, and most of them are reported (at least those that may 
affect the R-factor "significantly"). Different programs may define the 
model structure factor differently, but they normally report the 
corresponding parameters. As used in phenix.refine, it is:

Fmodel = scale_overall * exp(-h*U_overall*ht) *
         (Fcalc + k_sol * exp(-B_sol*s^2/4) * Fmask)

- the anisotropic scale matrix (U_overall) is reported as 
"REMARK   3    B11 :", ...;
- ATOM and ANISOU records are enough to calculate Fcalc (the choice of 
scattering table, it1992, wk1995 or n_gaussian, will not make a visible 
difference, and normally neither will the choice of algorithm, FFT vs 
DIRECT, used to compute Fcalc);
- the bulk solvent parameters k_sol and B_sol are reported as 
"REMARK   3   K_SOL" or similar;
- to reproduce Fmask you need at least "grid_step", "solvent radius" and 
"shrink truncation radius", all of which phenix.refine reports:

REMARK   3  BULK SOLVENT MODELLING.
REMARK   3   METHOD USED        : FLAT BULK SOLVENT MODEL
REMARK   3   SOLVENT RADIUS     : 1.11
REMARK   3   SHRINKAGE RADIUS   : 0.90
REMARK   3   K_SOL              : 0.347
REMARK   3   B_SOL              : 29.392
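
A minimal sketch of pulling these values back out of a PDB header, 
assuming the REMARK 3 lines look like the ones above ("REMARK   3", a 
label, a colon, a number). The function name and the output keys are 
hypothetical, not part of phenix.refine, and other programs may label 
these records differently:

```python
import re

# Matches lines such as "REMARK   3   K_SOL              : 0.347".
# Assumption: label and value are separated by a single colon.
REMARK3_RE = re.compile(
    r"^REMARK\s+3\s+(?P<key>[A-Z_ ]+?)\s*:\s*(?P<value>\S.*?)\s*$"
)

# REMARK 3 labels (as shown above) mapped to hypothetical output keys.
WANTED = {
    "SOLVENT RADIUS": "solvent_radius",
    "SHRINKAGE RADIUS": "shrinkage_radius",
    "K_SOL": "k_sol",
    "B_SOL": "b_sol",
}


def parse_bulk_solvent_remarks(pdb_lines):
    """Collect bulk solvent parameters from REMARK 3 lines of a PDB file."""
    params = {}
    for line in pdb_lines:
        match = REMARK3_RE.match(line)
        if match is None:
            continue
        key = match.group("key").strip()
        if key not in WANTED:
            continue
        try:
            params[WANTED[key]] = float(match.group("value"))
        except ValueError:
            # Non-numeric value (for example a placeholder): skip it.
            pass
    return params


header = [
    "REMARK   3  BULK SOLVENT MODELLING.",
    "REMARK   3   METHOD USED        : FLAT BULK SOLVENT MODEL",
    "REMARK   3   SOLVENT RADIUS     : 1.11",
    "REMARK   3   SHRINKAGE RADIUS   : 0.90",
    "REMARK   3   K_SOL              : 0.347",
    "REMARK   3   B_SOL              : 29.392",
]
bulk_solvent = parse_bulk_solvent_remarks(header)
```

If one of the four keys is missing from the result, the header is 
incomplete for the purpose of rebuilding Fmask and the bulk solvent term.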

So I think you have all (or most) of what you need to reproduce the 
R-factor statistics IF THE PDB FILE IS COMPLETE.
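
To make the formula concrete, here is a rough sketch of evaluating Fmodel 
for one reflection and computing a conventional R-factor, 
sum(|Fobs - |Fmodel||) / sum(Fobs). It assumes Fcalc and Fmask are already 
available as complex numbers, that U_overall is a 3x3 matrix expressed so 
that h*U_overall*ht is dimensionless, and that s^2 = 1/d^2. The function 
names and the toy numbers are hypothetical and this is not the 
phenix.refine implementation:

```python
import math


def f_model(h, f_calc, f_mask, s_sq, scale_overall, u_overall, k_sol, b_sol):
    """Fmodel for one reflection, following the formula above.

    h          Miller indices (h, k, l)
    f_calc     complex structure factor from the atomic model
    f_mask     complex structure factor from the bulk solvent mask
    s_sq       s^2, assumed here to be 1/d^2
    u_overall  3x3 matrix, assumed to make h*U*ht dimensionless
    """
    # Quadratic form h * U_overall * h^T.
    huh = sum(h[i] * u_overall[i][j] * h[j] for i in range(3) for j in range(3))
    anisotropic_scale = math.exp(-huh)
    solvent_scale = k_sol * math.exp(-b_sol * s_sq / 4.0)
    return scale_overall * anisotropic_scale * (f_calc + solvent_scale * f_mask)


def r_factor(f_obs, f_models):
    """R = sum(|Fobs - |Fmodel||) / sum(Fobs) over the given reflections."""
    numerator = sum(abs(fo - abs(fm)) for fo, fm in zip(f_obs, f_models))
    return numerator / sum(f_obs)


# Toy values only, chosen to be easy to check by hand.
ZERO_U = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
fm = f_model(
    h=(1, 0, 0), f_calc=3 + 4j, f_mask=1 + 0j, s_sq=0.01,
    scale_overall=2.0, u_overall=ZERO_U, k_sol=0.0, b_sol=29.392,
)
r = r_factor([10.0, 5.0], [fm, 4 + 0j])
```

With k_sol set to 0 and U_overall set to zero the expression reduces to 
scale_overall * Fcalc, which is a convenient sanity check before adding 
the solvent and anisotropic terms back in.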

> I think
> the fundamental question is why do you need to be able to reproduce the
> exact R-factors.  Perhaps depositing FC and PHIC along with FP and SIGFP
> will solve your problem?
>   

Given how little information and effort it takes, I don't see why not. 
It's just a simple formula that relies on a number of simple parameters 
that are easy to report, and in fact they are reported in the PDB file in 
most cases. If your model has only a few atoms, you could do it with a pen 
and a calculator -:)  It's a great and very simple validation tool: if 
you are unable to reproduce the R-factors, then something is wrong with 
1) the file, 2) the structure or 3) your software. However, if you 
explicitly allow the R-factors to be not ("exactly") reproducible, then 
you immediately lose the way to address "1)-3)".

> Don't get me wrong - I think it's important that deposited structures
> provide complete information about the model.  But why are riding
> hydrogens so particularly important when reporting crystallization
> conditions is not mandatory?  Or bulk solvent parameters?  Or geometry
> restraints you used for the custom ligand (thus it's not in the standard
> libraries)?
>   

Absolutely agree! So let's take a tiny step forward and keep whatever 
we can easily keep, and hope that in the future more information will be 
preserved (such as a complete footprint of the restraints used, and so on).

> I did a quick (and dirty) survey of the PDB and found that less than 2%
> of structures report hydrogens.  

Well, years back people used to cut low-resolution data at 5...8 A until 
it became clear that this is not such a good idea. Nowadays, I don't think 
anyone would throw away low-resolution data. The rationale with hydrogens 
is similar. Let's follow the progress and not look back (telling myself) -:)

All the best!
Pavel.