I've got a question I wonder if the list can help me with.

I'm looking for arguments about whether, and if so when, digital material 
should be normalised. I'm thinking about the long term management of 
textual material in proprietary formats such as MS Word. I see three basic 
approaches on which I'm seeking the list's comments and thoughts.

The first approach normalises textual material at the point of ingestion, 
converting all incoming material to a neutral format such as XML 
immediately. This would create an open format manifestation with the aim 
of long term sustainable management.

The second approach would be one of 'wait and see': if a particular 
format isn't immediately 'at risk' of obsolescence, why touch it until 
some form of migration becomes necessary at some future point?

The third approach preserves the bitstream as acquired and delivers it in 
an unmodified form upon request, i.e. MS Word in, MS Word out.

The first approach requires tools, resources and investment immediately. 
The second requires these same resources, and possibly more, in the 
future. The future requirements for the third approach are perhaps unknown 
aside from that of adequate technical metadata.

I'm interested in ideas about the sustainability of these approaches, the 
costs of one approach over the others and the perceived risks of moving 
material to an open format sooner rather than later. I'd be very 
interested in examples of projects which have taken any of these approaches.

Any examples, comments or thoughts would be very much appreciated.

Thanks, dnt

Dave Thompson, Digital Curator, Wellcome Library, UK
