Dear CCP4 Citizenry:
I’m worried about medium- to long-term data storage and integrity. At the moment, our lab uses mostly HFS+ formatted filesystems on our disks, which is the OS X default. HFS+ has always struck me as somewhat fragile, and resource forks are at best a (seemingly needless) headache, at least as far as crystallography datasets go. (True, you can use HFS+ compression to losslessly shrink your images by a factor of 2, or shrink your ccp4 installation, but these are fairly minor advantages.)
I read the CCP4 wiki page http://strucbio.biologie.uni-konstanz.de/ccp4wiki/index.php/Filesystems that summarizes some of the other options. From what I have read, there and elsewhere, it seems like zfs and btrfs might be significantly better alternatives to HFS+, but I really would like to get a sense of what others have experienced with these, or other, equally or more robust options. I don’t feel like I know enough to critically evaluate the information.
Anyone know what the NSA uses?
I recently created a de novo backup of some personal data on an external HFS+ drive (photos, movies, music, etc.). I was very unpleasantly surprised to find that several files had been silently corrupted. (In the case of a movie file, for example, the file would play but could not be copied. In another case, a music file would not copy, yet its md5sum and sha1 checksums were identical to those of an uncorrupted redundant backup I had. I’m still puzzled by this, but it suggests the resource fork might be the source of the corruption, and, more worrisome still, that conventional checksums aren’t detecting some silently corrupted data, so I am not even sure if zfs self-healing would be the answer.)
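For what it is worth, the kind of comparison I describe above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names (file_digest, build_manifest, compare_manifests) are made up for illustration, SHA-256 is my choice rather than anything special, and, like md5sum, it reads only the ordinary file contents (the data fork), so it would not see damage confined to a resource fork or to filesystem metadata.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash the ordinary contents (data fork) of one file, reading in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its digest.

    A file that cannot be read is recorded as None instead of aborting,
    since an unreadable file is itself a symptom worth reporting.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            try:
                manifest[rel] = file_digest(full, algorithm)
            except OSError:
                manifest[rel] = None
    return manifest


def compare_manifests(original, backup):
    """Return (missing, extra, suspect) lists of relative paths.

    missing: present in original but not in backup
    extra:   present in backup but not in original
    suspect: present in both, but unreadable on either side or digests differ
    """
    missing = sorted(set(original) - set(backup))
    extra = sorted(set(backup) - set(original))
    suspect = sorted(
        k for k in set(original) & set(backup)
        if original[k] is None or backup[k] is None or original[k] != backup[k]
    )
    return missing, extra, suspect
```

One would call build_manifest() on the original directory and on the backup, then pass both results to compare_manifests(). An empty "suspect" list only says the data forks agree, which is exactly the situation I found myself in with the music file.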
Since we as a community are now encouraging primary X-ray diffraction images to be stored, I can only imagine the problem could be ubiquitous, and a discussion might be worth having. (I apologize if this has been addressed previously; I did search the archive.)
All the best,
Bill
William G. Scott
Director, Program in Biochemistry and Molecular Biology
Professor, Department of Chemistry and Biochemistry
and The Center for the Molecular Biology of RNA
University of California at Santa Cruz
Santa Cruz, California 95064
USA