Hi Folks,

Many thanks for all of your comments. In keeping with the spirit of the BB,
I have digested the responses below. Interestingly, I suspect that the
spread of responses to this question reflects the very wide range of
resolution limits of the data people work with!

Best wishes Graeme

===================================

Proposal 1:

10% reflections, max 2000

Proposal 2: from wiki:

http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Test_set

including Randy Read's "recipe":

So here's the recipe I would use, for what it's worth:
  <10000 reflections:        set aside 10%
   10000-20000 reflections:  set aside 1000 reflections
   20000-40000 reflections:  set aside 5%
  >40000 reflections:        set aside 2000 reflections
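
For anyone who wants to try the recipe on their own numbers, here is a
minimal Python sketch of it. The function name free_set_size is made up for
illustration and this is not how xia2 or any other program does it. The
thresholds are taken straight from the table above; at the boundaries
(10000, 20000, 40000) the neighbouring rules give the same answer, so it
does not matter which side they are assigned to.

```python
def free_set_size(n_reflections: int) -> int:
    """Number of reflections to set aside for the free set.

    Follows the recipe quoted above. The function name and the use of
    integer division (rounding down) are assumptions for illustration.
    """
    if n_reflections < 0:
        raise ValueError("n_reflections must be non-negative")
    if n_reflections < 10000:
        return n_reflections // 10   # set aside 10%
    if n_reflections <= 20000:
        return 1000                  # fixed 1000 reflections
    if n_reflections <= 40000:
        return n_reflections // 20   # set aside 5%
    return 2000                      # fixed 2000 reflections


if __name__ == "__main__":
    for n in (5000, 15000, 30000, 800000):
        k = free_set_size(n)
        print(f"{n:>7} reflections: set aside {k} ({100.0 * k / n:.2f}%)")
```

Note that the result never exceeds 2000, and the fraction set aside shrinks
steadily for very large data sets.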

Proposal 3:

5% maximum 2-5k

Proposal 4:

3% minimum 1000

Proposal 5:

5-10% of reflections, minimum 1000

Proposal 6:

More than 50 reflections per "bin" in order to get reliable ML parameter
estimation, ideally around 150 / bin.
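
To turn the per-bin figure into a total, multiply by the number of
resolution bins. A rough Python sketch follows. The bin count of 20 is purely
an assumption for illustration (it was not given in the thread and depends
on the program and data), and the function name is hypothetical.

```python
def free_reflections_for_bins(n_bins: int, per_bin: int) -> int:
    """Total free reflections needed to have per_bin of them in each bin.

    Assumes the free reflections are spread evenly across the bins.
    """
    if n_bins <= 0 or per_bin < 0:
        raise ValueError("n_bins must be positive and per_bin non-negative")
    return n_bins * per_bin


if __name__ == "__main__":
    n_bins = 20  # assumed bin count, for illustration only
    for per_bin in (50, 150):
        total = free_reflections_for_bins(n_bins, per_bin)
        print(f"{per_bin} per bin x {n_bins} bins = {total} free reflections")
```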

Proposal 7:

With lots of reflections (e.g. 800K unique), select around 1%: 5% would be
40K, which is rather a lot. Referees question the use of more than 5K
reflections as a test set.

Comment 1 in response to this:

Surely the absolute # of test reflections is not relevant; the percentage is.

============================

Approximate consensus (i.e. what I will look at doing in xia2): probably
follow the Randy Read recipe from the ccp4wiki, as this seems to satisfy
most of the criteria raised by everyone else.



On Tue, Jun 2, 2015 at 11:26 AM Graeme Winter wrote:

> Hi Folks
>
> Had a vague comment handed my way that "xia2 assigns too many free
> reflections" - I have a feeling that by default it makes a free set of 5%
> which was OK back in the day (like I/sig(I) = 2 was OK) but maybe seems
> excessive now.
>
> This was particularly in the case of high resolution data where you have a
> lot of reflections, so 5% could be several thousand which would be more
> than you need to just check Rfree seems OK.
>
> Since I really don't know what is the right # reflections to assign to a
> free set thought I would ask here - what do you think? Essentially I need
> to assign a minimum %age or minimum # - the lower of the two presumably?
>
> Any comments welcome!
>
> Thanks & best wishes Graeme
>