Dear Julien,
On Thu, Mar 19, 2020 at 08:47:20AM +0000, Julien Cappèle wrote:
> Though I agree with you Clemens that raw images are amazing to work
> with as you can use any software you are confortable with, we cannot
> forget that depositing several TB of data for each lab would be bad
> for ecological reason.
Of course, there are ecological (carbon footprint) considerations -
and there are lots of papers and studies about that. I haven't looked
at any numbers, but maybe some points:
* A lot of data is already stored (e.g. at synchrotrons) and would
"only" need to be made "visible" via a DOI (caveat: I realise
that there are huge technical issues with that)
* How does that energy consumption compare with the energy used to
perform the experiment in the first place?
* If by having that data available we can improve software and the
way experiments are done: wouldn't that potentially save energy in
the long run (avoiding poor or unnecessary experiments in the first
place)?
* We are looking at a move to increase the number of raw image data
depositions for deposited PDB structures - not at a requirement to
deposit raw images for every PDB structure or even for every
dataset ever collected.
At the moment there are about 4500 image datasets available for
about 100000 PDB X-Ray structures, i.e. ~5%.
> And because detectors are always improving (thank you all!), size of
> data will increase exponentially.
True ... and some types of experiment can benefit from those larger,
faster and more numerous types of datasets - if done correctly.
> Could it be possible for a new/already existing software to store
> reflections (area, intensity from center to border, position x/y on
> the image, and information of the image) in a lightweight and text
> only file ? Possibly a new format to be used for integration ?
See my other reply: this all assumes that the initial processing step
caught all spots (and nothing else) on the 2D image correctly.
There have been all kinds of initiatives about raw data deposition (in
no particular order):
https://www.iucr.org/resources/data/dddwg
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5331468/
https://www.sciencedaily.com/releases/2016/11/161108130045.htm
https://journals.iucr.org/d/issues/2016/11/00/yt5099/
https://onlinelibrary.wiley.com/iucr/doi/10.1107/S0909049513020724
http://scripts.iucr.org/cgi-bin/paper?S0907444908015540
https://scripts.iucr.org/cgi-bin/paper?dz5309
https://bl831.als.lbl.gov/~jamesh/lossy_compression/
So we've been there before. Let's see if we can't do at least
something for the clearly important structures and work right now -
and worry about some long-term impact later (having maybe learned
something along the way). Just because we could be doing something now
doesn't mean we will have to keep doing this in 1-N years' time,
right ;-)
Cheers
Clemens