Alex Morrison writes:
>Our experience is that even museums that think they are not storing
>information in re-usable forms can find surprising potential re-use in
>existing publications. If the texts have passed through a word-processor,
>a page layout programme, or even if the printers have used a press
>controlled by digital input then the source material will generally be
>recoverable and reusable.
In my experience the museum often loses control over the machine-
readable version of the publication a long time before it is finalised.
It is quite normal for the draft text to go from word processor format
(within the museum) to a DTP package (at the typesetter's), and for the
bulk of the editorial corrections to be made to the DTP copy.
This poses two problems:
- the typesetter has no motivation to look after the source copy once
their job is done. It may well be deleted or just lost soon after
publication;
- even if the DTP copy is available, it is _much_ harder to extract
information from than the corresponding word-processor document. (There
is a nice, free OmniMark script that does a good job of converting
arbitrary RTF documents to well-formed XML.)
Richard.
PS For those of you thinking about going into SGML/XML, it might be of
interest that OmniMark Inc. just announced that their powerful
conversion software will henceforth be free. They plan to make their
money selling the new Integrated Development Environment for OmniMark -
and that is only $995.
Richard Light
SGML/XML and Museum Information Consultancy