Dear all,

We are currently working on a digitised version of the Thesaurus Glossarum Emendatarum, and the OCR output from Tesseract / ABBYY FineReader has some issues because the text mixes Latin and Ancient Greek words. We are considering various options, such as training the OCR tool, but would like to know whether anyone has faced similar problems and has suggestions for improving the OCR output.

Regards,

Suzanne Mpouli
Laboratoire d'Histoire des Théories linguistiques
Université Paris Diderot Paris 7