Hi Folks,
We have two problems here, which are orthogonal and should probably
not be muddled. The first is the archival / making available of
diffraction images. Although this is (in theory) currently possible, it
has not been enforced, so practice is variable. I have however found that when
someone has an interesting data set, and I ask for it, they can
usually dig it out.
The second problem is the single format to rule them all. At the
moment we store data in whatever format the detector was minded to use
in recording. This is actually fine, as the data reduction programs we
all use will work with them, provided that the images are corrected.
The point about the associated documentation is far more important.
However, an analysis performed with just the information provided in
the paper, assuming that the contents of the image headers are
somewhere near correct, helps to independently verify the results. If
they are not correct, the values used should be included.
These are both valuable discussions, but should (IMnsHO) be carried
out separately so as to avoid the one causing problems for the other.
To a large extent, simply recovering the images from a DAT and finding
somewhere to put them appears to be the biggest problem - I found that
having an FTP incoming available was often the one thing which made
this possible. Therefore, having a central repository would be
excellent. I have a couple of comments about this too...
At the moment you can pretty much fit the raw data stored in the PDB
on some DVDs or a firewire disk or something - not the derived
tables, which I expect are huge, but the source files are fairly
small. This means that backing them up is tractable, and some quality
of service can be provided.
When you back up your data to say a firewire disk, and it fails, you
can just take the hit and not worry about it. If a service takes
responsibility for the data it must be curated, backed up ideally to
multiple locations, be available, provide sufficient space / bandwidth
etc. Much more expensive, much more complicated. You then also get the
problem previously mentioned of ensuring that these images are from
*this* pdb not the mutant you are working on, which has much the same
cell and symmetry. Now that's hard!
Just because something is hard does not mean that it should not be
done, but this can't be done ad hoc - if it is going to be done and be
useful, it would have to be done properly. I'd be delighted to see it
happen, mind.
Cheers,
Graeme
2009/3/16 Eleanor Dodson:
> The deposition of images would be possible providing some consistent
> imagecif format was agreed.
> This would of course be of great use to developers for certain pathological
> cases, but not, I suspect, of much value to the user community - I download
> structure factors all the time for test purposes but I probably would not
> bother to go through the data processing, and unless there were extensive
> notes associated with each set of images I suspect it would be hard to
> reproduce sensible results.
>
> The research council policy in the UK is that original data is meant to be
> archived for publicly funded projects. Maybe someone should test the reality
> of this by asking the PI for the data sets?
> Eleanor
>
>
> Garib Murshudov wrote:
>>
>> Dear Gerard and all MX crystallographers
>>
>> As I see it, there are two problems.
>> 1) Minor problem: Sanity, semantic and other checks for currently
>> available data. It should not be difficult to do. Things like I/sigma, some
>> statistical analysis expected vs "observed" statistical behaviour should
>> sort out many of these problems (Eleanor mentioned some and they can be
>> used). I do not think that depositors should be blamed for mistakes. They
>> are doing their best to produce and deposit. There should be a proper
>> mechanism to reduce the number of mistakes.
>> You should agree that the situation is now much better than a few years ago.
>>
>> 2) A fundamental problem: What are observed data? I agree with you
>> (Gerard) that images are the only true observations. All others (intensities,
>> amplitudes etc) have undergone some processing using some assumptions and
>> they cannot be considered as true observations. Data processing is
>> an irreversible process. I hope your effort will be supported by the community. I
>> personally get excited by the idea that images may be available. There are
>> exciting possibilities. For example modular crystals, OD, twinning in general,
>> space group uncertainty cannot be truly modeled without images (it does not
>> mean refinement against images). Radiation damage is another example where
>> after processing and merging information is lost and cannot be recovered
>> fully. You can extend the list where images would be very helpful.
>>
>> I do not know any reason (apart from a technical one - size of files) why
>> images should not be deposited and archived. I think this problem is very
>> important.
>>
>> regards
>> Garib
>>
>>
>> On 12 Mar 2009, at 14:03, Gerard Bricogne wrote:
>>
>>> Dear Eleanor,
>>>
>>> That is a useful suggestion, but in the case of 3ftt it would not have
>>> helped: the amplitudes would have looked as healthy as can be (they were
>>> calculated!), and it was the associated Sigmas that had absurd values, being
>>> in fact phases in degrees. A sanity check on some (recalculated) I/sig(I)
>>> statistics could have detected that something was fishy.
>>>
>>> Looking forward to the archiving of the REAL data ... i.e. the images.
>>> Using any other form of "data" is like having to eat out of someone else's
>>> dirty plate!
>>>
>>>
>>> With best wishes,
>>>
>>> Gerard.
>>>
>>> --
>>> On Thu, Mar 12, 2009 at 09:22:26AM +0000, Eleanor Dodson wrote:
>>>>
>>>> It would be possible for the deposition sites to run a few simple tests to
>>>> at least find cases where intensities are labelled as amplitudes or vice
>>>> versa - the truncate plots of moments and cumulative intensities at least
>>>> would show something was wrong.
>>>>
>>>> Eleanor
>>>>
>>>
>>>
>>> --
>>>
>>> ===============================================================
>>> * *
>>> * Gerard Bricogne [log in to unmask] *
>>> * *
>>> * Global Phasing Ltd. *
>>> * Sheraton House, Castle Park Tel: +44-(0)1223-353033 *
>>> * Cambridge CB3 0AX, UK Fax: +44-(0)1223-366889 *
>>> * *
>>> ===============================================================
>>>
>>
>>
>
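
For what it is worth, the kind of simple test Eleanor describes in the quoted
thread (using moments to spot intensities labelled as amplitudes or vice versa)
can be sketched roughly as below. This is only an illustration, not what
truncate or any deposition site actually does. It assumes acentric reflections
obeying Wilson statistics, for which <I^2>/<I>^2 is 2 for intensities and
<F^2>/<F>^2 is 4/pi (about 1.27) for amplitudes. The function names are made
up for the example, and a real check would need to work in resolution shells
and treat centric reflections, twinning and anisotropy separately.

```python
import math
import random


def second_moment_ratio(values):
    """Return <x^2> / <x>^2 for a list of positive measurements."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    return mean_sq / (mean * mean)


# Expected ratios for acentric reflections under Wilson statistics.
EXPECTED_INTENSITY = 2.0            # <I^2>/<I>^2
EXPECTED_AMPLITUDE = 4.0 / math.pi  # <F^2>/<F>^2


def guess_column_type(values):
    """Crude guess: is this column closer to intensities or amplitudes?

    Hypothetical helper for illustration only. It picks whichever expected
    ratio is nearer, so strongly twinned intensities (ratio near 1.5) would
    be misclassified as amplitudes.
    """
    ratio = second_moment_ratio(values)
    if abs(ratio - EXPECTED_INTENSITY) < abs(ratio - EXPECTED_AMPLITUDE):
        return "intensity", ratio
    return "amplitude", ratio


# Synthetic demonstration: acentric Wilson intensities are exponentially
# distributed, and amplitudes are their square roots.
rng = random.Random(0)
demo_intensities = [rng.expovariate(1.0) for _ in range(20000)]
demo_amplitudes = [math.sqrt(i) for i in demo_intensities]

print(guess_column_type(demo_intensities))
print(guess_column_type(demo_amplitudes))
```

A column whose label disagrees with the guess would be flagged for a human to
look at, rather than rejected automatically.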