This survey and subsequent results may be of interest to list members.
-----Original Message-----
From: [log in to unmask] [mailto:[log in to unmask]]
Sent: 24 August 2011 15:12
Subject: Architectures and data storage of LTP systems
Dear Sir or Madam,
As a part of our work on the development of a national data storage
infrastructure in the Czech Republic, we are working on an extensive
worldwide comparison of commonly used data storage architectures and the
corresponding technical background of Long Term Preservation (LTP)
systems. This study is being conducted under the auspices of CESNET, Center Cerit-SC,
Institute of Computer Science at Masaryk University, and the Moravian
Library in Brno.
We would appreciate it if you would kindly forward this e-mail to the staff
at your institution responsible for the technical foundations of your
long-term data preservation systems.
Could you please answer as many of the following questions as possible?
The results of this "dialog survey" will be part of a report that will be
publicly available around the end of this year. We will be pleased to send
you an electronic or printed copy when the report is finished.
The information we would like to gather includes but is not limited to:
1. What kind of systems do you use to ensure long-term data preservation?
2. Does your institution use any LTP system (Rosetta, Tessella,
Archivematica)? If not, are you planning to deploy some form of LTP system
in the future?
3. If yes, what technical solution lies behind it (developed in-house,
iRODS, etc.)?
4. Do you use your LTP system directly for serving the user copies to the
public, or is there an intermediate system for accessing the user copies,
with your LTP storing only master copies? If the latter option corresponds to
your situation, how often do you synchronize the content of your LTP with
the system for exposing the digital objects to the public? Is the
performance (throughput/access time) of the LTP system a key quality in
your infrastructure?
5. Do you have your LTP system certified as a trusted repository
(TRAC, NESTOR) or do you plan to seek certification?
6. What kinds of HW technologies do you use for storing the master copies
(disks, tapes, hybrid solutions, etc.)?
7. Would you prefer one geographic location where the actual data is stored
or a more geographically distributed approach, keeping in mind the risk of
the site being physically destroyed, e.g., by a natural disaster?
8. What are the main pros and cons of your LTP infrastructure (we are
interested in the HW infrastructure rather than the functional requirements
of the LTP system)?
9. Is your LTP system OAIS (ISO 14721:2003) compliant? How important is
this for your institution? How would you categorize this feature
("nice to have", "should have", "must have")?
10. Do you have a document that maps your system to OAIS? Do you have any
services/processes beyond OAIS? Are any important OAIS functions/processes
missing from your system, and if so, why?
11. Did you carry out a similar study before you decided to use the LTP system?
If so, would it be possible to get its results?
12. Do you have any documents describing your solution at the technical
and/or architectural level?
13. What is the approximate number of objects already stored? And what are
the expected final (maximum) numbers?
14. How are the objects (data and metadata) structured? In other words,
how is a periodical/monograph/map represented in the LTP? What does your API
for various data types look like? What types of identifiers are used?
15. What extent of in-house customization was needed? Was the system
delivered as an "out of the box" vendor solution, did the contract include
local customization, or were you the major architects and developers?
16. Have you tried any form of distributed data storage? If so, how
consistently is the application layer of the LTP system separated from the
underlying distributed data layer?
17. Is it legally permitted to keep your data saved outside the
country/institution? If not, would you like to use some form of on-premise
repository, or a distributed repository spread over a few institutions, for
long-term storage instead of cloud storage services like Amazon S3 or
Google Storage?
18. Is there any centralized instance (registry of digitization) in your
country for monitoring digitization and for subsequent or preventive
deduplication of the digitized data?
19. Does your institution participate in exchanging metadata with other
institutions through OAI-PMH or other protocols?
The results of our survey will be summarised in a publicly available
report.
Many thanks for your valuable time, and please feel free to reply to this
email with any further questions you have.
Yours faithfully,
Jiri Kremser