Hi,
 
Any time you do a thought experiment and make a fake data set, the
"true" phases and "true" amplitudes become, by definition, the ones you
put into the simulation.  Is there potential for circular reasoning?
Of course!  But you can do controls:

This is so true! This is what I've been doing for development and testing all the time since I started working on Phenix! Fully controlled thought/numerical experiments done this way are super-helpful (but obviously have limitations!).
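Just to make the pattern concrete, here is a toy Python sketch (my own illustration, nothing Phenix-specific; the array size and the 5% error level are made up): whatever you put into the simulation is the "truth", and any agreement statistic against it tells you the best a downstream procedure could possibly do.

import numpy as np

# Toy controlled experiment: the "true" amplitudes and phases are true
# simply because we chose them; everything else is measured against them.
rng = np.random.default_rng(0)
n = 1000
true_amplitudes = rng.uniform(1.0, 100.0, size=n)
true_phases = rng.uniform(0.0, 2.0 * np.pi, size=n)  # known, but never "observed"

# Fake "observations": amplitudes with 5% Gaussian error (arbitrary choice).
obs_amplitudes = true_amplitudes * (1.0 + rng.normal(0.0, 0.05, size=n))

# Control: an R-factor-like agreement with the known truth is the best any
# downstream procedure could possibly achieve on this fake data set.
r = np.sum(np.abs(obs_amplitudes - true_amplitudes)) / np.sum(obs_amplitudes)
print("R against the known truth: %.3f" % r)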
 
   If you start with an ordinary single-conformer coordinate model and
flat bulk solvent from refmac to make your Ftrue, then what you will
find is that even after adding all plausible experimental errors to the
data the final Rwork/Rfree invariably drop to small-molecule levels of
3-4%.  This is true even if you prune the structure back, shake it, and
rebuild it in various ways.  The difference features always guide you
back to Rwork/Rfree = 3/4%.  However, if you refine with phenix.refine,
you will find Rwork/Rfree stall at around 10-11%.  This is because
Ftrue came from refmac, and refmac and phenix.refine have somewhat
different bulk-solvent models.  If Ftrue comes from phenix and you
refine with refmac, you get similar "high" R values.  High for a small
molecule, anyway.  And, of course, if you get Ftrue from phenix and
refine with phenix, you also get final Rwork/Rfree = 3/4%.  If you do
more things that automated building doesn't do, like multi-headed side
chains, or get the bulk solvent from an MD simulation, then you can get
"realistic" Rwork/Rfree in the 20%s.  All of this is the main
conclusion of this paper: https://dx.doi.org/10.1111/febs.12922

This is true even within Phenix alone, if you switch between different scaling/bulk-solvent models or play with automation levels (such as ignoring reflection outliers, etc.).
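In case someone wants to try this kind of test, here is a rough cctbx-flavored sketch of the "make Ftrue, add errors, shake, compare" loop. It is only an assumed outline: the file name, resolution, error level and shake amplitude are placeholders, bulk solvent is left out entirely, and a real experiment would of course run phenix.refine or refmac instead of the one-line R at the end.

import random
import iotbx.pdb
from scitbx.array_family import flex

# "True" structure: an ordinary single-conformer model ("model.pdb" is a placeholder).
xrs_true = iotbx.pdb.input(file_name="model.pdb").xray_structure_simple()

# Ftrue: calculated amplitudes, no bulk solvent here for simplicity
# (the choice of bulk-solvent model is exactly what makes the final R program-dependent).
f_true = abs(xrs_true.structure_factors(d_min=2.0).f_calc())

# Fake Fobs: add a plausible 5% relative "experimental" error (arbitrary choice).
noise = flex.double([random.gauss(0.0, 0.05) for _ in range(f_true.size())])
f_obs = f_true.customized_copy(data=f_true.data() * (1.0 + noise))
f_obs.set_observation_type_xray_amplitude()

# Shake the model to mimic a starting point that still needs refinement.
xrs_start = xrs_true.deep_copy_scatterers()
xrs_start.shake_sites_in_place(rms_difference=0.3)

# Crude scaled R between the fake Fobs and the shaken model; the real test
# would refine and then look at the final Rwork/Rfree.
f_calc = abs(xrs_start.structure_factors(d_min=2.0).f_calc())
k = flex.sum(f_obs.data() * f_calc.data()) / flex.sum(f_calc.data() * f_calc.data())
r = flex.sum(flex.abs(f_obs.data() - k * f_calc.data())) / flex.sum(f_obs.data())
print("R(start, shaken model) = %.4f" % r)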
 
Pavel

