I have a question that I hope the list can help me with.
I'm looking for arguments about whether, and when, digital material
should be normalised. I'm thinking about the long-term management of
textual material in proprietary formats such as MS Word. I see three basic
approaches, on which I'm seeking the list's comments and thoughts.
The first approach normalises textual material at the point of ingestion,
converting all incoming material to a neutral format such as XML
immediately. This would create an open-format manifestation with the aim
of sustainable long-term management.
The second approach would be one of 'wait and see': if a particular
format isn't immediately 'at risk' of obsolescence, there is no need to
touch it until some form of migration becomes necessary at some future
point.
The third approach preserves the bitstream as acquired and delivers it in
an unmodified form upon request, i.e. MS Word in – MS Word out.
The first approach requires tools, resources and investment immediately.
The second requires these same resources, and possibly more, in the
future. The future requirements for the third approach are perhaps unknown,
aside from the need for adequate technical metadata.
I'm interested in ideas about the sustainability of these approaches, the
relative costs of each, and the perceived risks of moving material to an
open format sooner rather than later. I'd be very interested in examples
of projects which have taken any of these approaches.
Any examples, comments or thoughts would be very much appreciated.
Thanks, dnt
Dave Thompson, Digital Curator, Wellcome Library, UK