Well, I was about to chip in with some comment about the importance of
diacritics, not only for transcribing but also, sometimes, for searching. But
in any case it is easy to search or index on a diacritics-free view: we have
some very simple XSLT that does this (index, not search), and I believe there
are Unicode implementation rules that make this very easy.
(Default sort order for Unicode, at least using Saxon or Cocoon, does not take
this into account, however.)
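
For illustration only, here is a minimal Python sketch of indexing on a
diacritics-free view. It is not our XSLT: the function names
`strip_diacritics` and `index_key` are hypothetical, and it assumes that every
diacritic to be ignored is a Unicode combining mark (general category Mn) once
the text has been decomposed.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Return a diacritics-free view of text (hypothetical helper)."""
    # Decompose so each accent, breathing or iota subscript is its own code point.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (general category Mn) and keep the base letters.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", stripped)

def index_key(word: str) -> str:
    """Key for a diacritics- and case-insensitive index or sort."""
    # casefold() also folds final sigma to medial sigma.
    return strip_diacritics(word).casefold()

words = ["ἄνθρωπος", "λόγος", "ἀγαθός", "ὁδός"]
# Sort on the bare view; the stored words keep their diacritics.
sorted_words = sorted(words, key=index_key)
```

The encoded text is never altered: the bare form exists only as a key, which
is the point of indexing on a view rather than on the data.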
I would certainly see no need ever to *encode* the Greek without diacritics;
putting them back would not only be very difficult, it would be impossible in
those cases where diacritics disambiguate meaning (breathings, obviously, but
also accents, often).
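
To make the point concrete, here is a small Python sketch (the helper name
`bare` and the word list are my own illustration) showing distinct words
collapsing onto a single unaccented string, after which nothing in the data
says which one was meant.

```python
import unicodedata
from collections import defaultdict

def bare(text):
    """Strip combining marks after canonical decomposition (hypothetical helper)."""
    nfd = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in nfd if not unicodedata.combining(ch))

forms = [
    "ἤ",     # "or"
    "ἦ",     # "truly"
    "ἡ",     # feminine article
    "ὄρος",  # "mountain"
    "ὅρος",  # "boundary"
    "τίς",   # "who?"
    "τις",   # "someone"
]

# Group the original forms under their diacritics-free spelling.
collisions = defaultdict(list)
for form in forms:
    collisions[bare(form)].append(form)

for key, originals in collisions.items():
    print(key, "<-", ", ".join(originals))
```

Seven forms go in and only three keys come out, so the mapping cannot be
reversed without outside knowledge of the context.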
This was, I suppose, part of the argument for encoding in Unicode Normalisation
Form D as standard, with decomposed diacritics rather than precomposed as in
NFC: NFD, like Beta Code, is much easier to convert to unaccented Greek
characters for searching. (But it is much harder to display etc., hence the
W3C recommendation of NFC for XML.) Again, something like Hugh's transcoder
can convert between the two on the fly, so a fix that requires one rather than
the other will still be workable.
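
As a rough sketch of why the choice of form is not critical (this uses
Python's standard `unicodedata` module and is not Hugh's transcoder; the
variable names are only illustrative):

```python
import unicodedata

word = "ἄνθρωπος"

# NFC: the alpha with smooth breathing and acute is one precomposed code point.
nfc = unicodedata.normalize("NFC", word)
# NFD: the same alpha becomes base letter + breathing + accent, three code points.
nfd = unicodedata.normalize("NFD", word)

print(len(nfc), len(nfd))

# Converting between the two forms is lossless in both directions.
round_trip = unicodedata.normalize("NFC", nfd)
print(round_trip == nfc)

# From NFD, the unaccented view is a simple filter on combining marks.
unaccented = "".join(ch for ch in nfd if not unicodedata.combining(ch))
print(unaccented)
```

Since the conversion is cheap and reversible, text can be stored in one form
and normalised to the other at whichever step needs it.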
Apologies for the long, rambling email...
G
--
=======================================
Gabriel BODARD
Inscriptions of Aphrodisias
Centre for Computing in the Humanities
King's College London
Kay House
7, Arundel Street
London WC2R 3DX
Email: [log in to unmask]
Tel: +44 (0)20 78 48 13 88
Fax: +44 (0)20 78 48 29 80
=======================================