Hmm - I think I miscalculated, by a factor of 100 even!... need more
coffee. In any case, I still think it would be doable. Best - MM
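
For anyone who wants to redo the estimate, here is a rough sketch of the corrected arithmetic in Python. It uses only the figures quoted below (20 GB per structure, 5,000 structures, less than $400 per 1 TB drive). The decimal terabyte (1 TB = 1,000 GB) and the function name are assumptions made for illustration, and the cost covers bare drives only, with no redundancy, servers, or staff.

```python
# Back-of-the-envelope archive estimate using the figures quoted in this thread.
GB_PER_STRUCTURE = 20   # average raw-image volume per structure (from the thread)
N_STRUCTURES = 5_000    # number of structures considered (from the thread)
GB_PER_TB = 1_000       # assumption: decimal terabytes
USD_PER_TB = 400        # upper bound on the price of a 1 TB drive (from the thread)


def archive_estimate(gb_per_structure=GB_PER_STRUCTURE,
                     n_structures=N_STRUCTURES,
                     usd_per_tb=USD_PER_TB):
    """Return (total size in TB, drive cost in USD) for the archive."""
    total_tb = gb_per_structure * n_structures / GB_PER_TB
    cost_usd = total_tb * usd_per_tb
    return total_tb, cost_usd


total_tb, cost_usd = archive_estimate()
print(f"{total_tb:.0f} TB, about ${cost_usd:,.0f} in drives")
# prints "100 TB, about $40,000 in drives"
```

So the total is about 100 TB rather than 1 TB, and at the quoted drive price that comes to roughly $40,000 in disks, still within the $50,000 mentioned below.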
On Aug 16, 2007, at 9:30 AM, Mischa Machius wrote:
> I don't think archiving images would be that expensive. For one, I
> have found that most formats can be compressed quite substantially
> using simple, standard procedures like bzip2. If optimized, raw
> images won't take up that much space. Also, initially, only those
> images that have been used to obtain phases and to refine finally
> deposited structures could be archived. If the average structure
> takes up 20GB of space, 5,000 structures would be 1TB, which fits
> on a single hard drive for less than $400. If the community thinks
> this is a worthwhile endeavor, money should be available from
> granting agencies to establish a central repository (e.g., at the
> RCSB). Imagine what could be done with as little as $50,000. For
> large detectors, binning could be used, but given current hard
> drive prices and future developments, that won't be necessary. Best
> - MM
>
>
> On Aug 16, 2007, at 9:13 AM, Phil Evans wrote:
>
>> What do you count as raw data? Rawest are the images - everything
>> beyond that is modelling - but archiving images is _expensive_!
>> Unmerged intensities are probably more manageable.
>>
>> Phil
>>
>>
>> On 16 Aug 2007, at 15:05, Ashley Buckle wrote:
>>
>>> Dear Randy
>>>
>>> These are very valid points, and I'm so glad you've taken the
>>> important step of initiating this. For now I'd like to respond to
>>> one of them, as it concerns something I and colleagues in
>>> Australia are doing:
>>>>
>>>> The more information that is available, the easier it will be to
>>>> detect fabrication (because it is harder to make up more
>>>> information convincingly). For instance, if the diffraction data
>>>> are deposited, we can check for consistency with the known
>>>> properties of real macromolecular crystals, e.g. that they
>>>> contain disordered solvent and not vacuum. As Tassos Perrakis
>>>> has discovered, there are characteristic ways in which the
>>>> standard deviations depend on the intensities and the
>>>> resolution. If unmerged data are deposited, there will probably
>>>> be evidence of radiation damage, weak effects from intrinsic
>>>> anomalous scatterers, etc. Raw images are probably even harder
>>>> to simulate convincingly.
>>>
>>> After the recent Science retractions we realised that it's about
>>> time raw data was made available. So, we have set about creating
>>> the necessary IT and software to do this for our diffraction
>>> data, and are encouraging Australian colleagues to do the same.
>>> We are about a week away from launching a web-accessible
>>> repository for our recently published (e.g. deposited in PDB) data,
>>> and this should coincide with an upcoming publication describing
>>> a new structure from our labs. The aim is that publication occurs
>>> simultaneously with release in PDB as well as raw diffraction
>>> data on our website. We hope to house as much of our data as
>>> possible, as well as data from other Australian labs, but
>>> obviously the potential dataset will be huge, so we are trying to
>>> develop, and make available freely to the community, software
>>> tools that allow others to easily set up their own repositories.
>>> After brief discussion with PDB the plan is that PDB include
>>> links from coordinates/SF's to the raw data using a simple handle
>>> that can be incorporated into a URL. We would hope that we can
>>> convince the journals that raw data must be made available at the
>>> time of publication, in the same way as coordinates and structure
>>> factors. Of course, we realise that there will be many hurdles
>>> along the way but we are convinced that simply making the raw
>>> data available ASAP is a 'good thing'.
>>>
>>> We are happy to share more details of our IT plans with the
>>> CCP4BB, such that they can be improved, and look forward to
>>> hearing feedback.
>>>
>>> cheers
>
>
> ----------------------------------------------------------------------
> Mischa Machius, PhD
> Associate Professor
> UT Southwestern Medical Center at Dallas
> 5323 Harry Hines Blvd.; ND10.214A
> Dallas, TX 75390-8816; U.S.A.
> Tel: +1 214 645 6381
> Fax: +1 214 645 6353