I haven't tried any recent software. I digitized Rudolf Münsterberg's
_Die Beamtennamen auf den griechischen Münzen_ (which is mostly proper
names, so spell checking doesn't help) and had good recognition rates.
I used ABBYY FineReader 6.0, but any trainable OCR program can work with
my technique. I told the system to perform OCR without using the
built-in alphabets. A lot of training is needed for the first dozen
pages because NONE of the letters are recognized at first. Whenever a
letter was recognized, I told FineReader that it was a new ligature and
entered the Beta Code characters for polytonic Greek. The system OCRed the
text directly into Beta Code, which I was then able to convert to Unicode
easily myself.
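
For illustration, a Beta Code to Unicode conversion of this kind can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the script used for the Münsterberg text: the names beta_to_unicode, LETTERS and DIACRITICS are made up for the example, and it covers only the basic letters, breathings, accents, diaeresis, iota subscript and the `*` capital marker. It leaves every other character unchanged and treats a sigma as final whenever no letter follows it.

```python
import unicodedata

# Basic Beta Code letters (digamma, koppa, etc. are left out of this sketch).
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}

# Beta Code diacritics mapped to Unicode combining marks.
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_unicode(beta):
    """Convert a Beta Code string to precomposed (NFC) polytonic Greek."""
    beta = beta.lower()   # Beta Code marks capitals with '*', not with case
    out = []
    pending = ""          # diacritics typed between '*' and the capital letter
    capital = False
    for pos, ch in enumerate(beta):
        if ch == "*":
            capital = True
        elif ch in DIACRITICS:
            if capital:
                pending += DIACRITICS[ch]    # hold until the letter arrives
            else:
                out.append(DIACRITICS[ch])   # attaches to the previous letter
        elif ch in LETTERS:
            greek = LETTERS[ch]
            nxt = beta[pos + 1:pos + 2]
            # Assumption: a sigma not followed by another letter is final.
            if ch == "s" and not capital and nxt not in LETTERS and nxt != "*":
                greek = "ς"
            if capital:
                greek = greek.upper()
            out.append(greek + pending)
            pending, capital = "", False
        else:
            out.append(ch)                   # spaces, digits, punctuation
    # NFC merges base letters and combining marks into precomposed characters.
    return unicodedata.normalize("NFC", "".join(out))
```

For example, beta_to_unicode("*dhmh/trios") gives Δημήτριος. Building the output from combining marks and letting NFC normalization compose them avoids a hand-written table of every accented letter.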
-Ed
Sabine Thuillier wrote:
> We are digitizing some 500 pages of the /Diccionario Griego-Español/.
> This task is quite difficult because the letters are small (7.5) and our
> text is a mix of Greek, Spanish and all kinds of abbreviations, numbers,
> etc. After trying several different software packages, we chose
> FineReader 9. Anagnostis seemed quite unsuitable for our work because, as
> far as I can remember, it was impossible to configure the program to
> recognize Ancient Greek and a modern language at the same time. After a
> long and tedious training of FineReader, we managed to obtain quite
> satisfactory recognition (we may have reached some 85% effectiveness
> for the Greek), but manual revision remained necessary.
>
> Sabine Thuillier