Dear all
You may recall that I posted a query last year regarding image compression for EMPIAR and I want to thank all who responded (Ali Punjani, Radostin Danev, Robert McLeod, Dimitry Tegunov, Craig Yoshioka, David Mastronarde and others) and provided valuable tips. We have now analysed a number of different compression packages and some results are attached to this email.
Just to restate the problem:
We are looking at ways of compressing the raw image data served from the EMPIAR archive. In general, we care more about how much the files are compressed than about how long compression takes (within reason). For ease of use, users should be able to download and install the decompression software at no cost, and decompression times should be reasonable (< 15 sec per GB, i.e. roughly 1 TB in 4 hours).
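As a rough illustration of the decompression target above, here is a minimal Python sketch using the standard-library lzma module (which reads and writes the xz format). The helper names hours_for and measure_xz_sec_per_gb are hypothetical, and the synthetic buffer is only a stand-in for real micrograph data, so any timing it reports says nothing about actual EMPIAR files.

```python
import lzma
import time

TARGET_SEC_PER_GB = 15.0  # decompression threshold quoted in this message


def hours_for(size_gb: float, sec_per_gb: float = TARGET_SEC_PER_GB) -> float:
    """Hours needed to decompress size_gb gigabytes at a given rate."""
    return size_gb * sec_per_gb / 3600.0


def measure_xz_sec_per_gb(raw: bytes, preset: int = 6) -> float:
    """Compress raw with xz, then time decompression, in seconds per GB.

    The rate is normalised by the uncompressed size (1 GB = 1e9 bytes).
    """
    compressed = lzma.compress(raw, format=lzma.FORMAT_XZ, preset=preset)
    start = time.perf_counter()
    restored = lzma.decompress(compressed)
    elapsed = time.perf_counter() - start
    assert restored == raw  # xz is lossless, so the round trip must match
    return elapsed / (len(raw) / 1e9)


if __name__ == "__main__":
    # 1 TB taken as 1000 GB: 1000 * 15 s = 15000 s, a little over 4 hours.
    print(f"1 TB at the target rate: {hours_for(1000):.2f} h")

    # Assumption: a small repetitive buffer, NOT representative image data.
    sample = bytes(range(256)) * 4096  # about 1 MB
    rate = measure_xz_sec_per_gb(sample)
    print(f"measured: {rate:.2f} s/GB (target < {TARGET_SEC_PER_GB} s/GB)")
```

To get a meaningful figure, the same function would need to be run on a real image file read from disk, on hardware comparable to what users have.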
Results of tests:
In general, we found that the 7zip and xz packages gave the best overall compression, with a slight advantage for xz. xz is often installed by default under Linux/macOS, whereas 7zip can be freely downloaded and installed. We would like to gauge opinion on the following questions:
1) If we were to routinely compress files, this would mean faster transfers, but also some time spent decompressing files and perhaps some hassle in finding and installing the software and finding the extra disk space needed for decompression. Do the advantages of faster transfers outweigh these problems for you as a user of EMPIAR data?
2) Do you have any objections to the use of the xz compression package?
Many thanks and best wishes
Ardan Patwardhan
Coordinator - Cellular Structure & 3D Bioimaging
EMDB & EMPIAR
European Bioinformatics Institute (EMBL-EBI) European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
Tel: +44 1223 492649