medieval-religion: Scholarly discussions of medieval religion and culture
John Wickstrom wrote:
>
> My understanding is that if documents are 'photographed' (which is
> what I think Optical Recognition software tends to do), then the
> texts cannot be searched. This seems to be the usual method for
> reproducing large PDF documents such as the Patrologia and Acta
> Sanctorum.
OCR (optical character recognition) software does more than 'photograph'
the page: it recognises the characters in the page image, which lets you
make a searchable version of that image. You can go further, of course:
as PDF is a typesetting format, you can recreate the fonts and original
appearance of the document in completely searchable form. This is what was
done with the Bloomsday (16 June 1904) edition of the Dublin Evening
Telegraph:
http://www.harenet.pwp.blueyonder.co.uk/splitpea/LastPink.pdf
The original idea had been to reprint the paper, but the originals were in
too poor a condition.
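
For anyone who wants to check which kind of PDF they have, here is a rough
sketch in Python. It rests on an assumption: a page that is only a
'photograph' normally declares no fonts, while a PDF carrying searchable
text (including an OCR text layer behind the image) normally does. The
function names are made up for illustration, and the test can miss fonts in
newer PDFs that compress their internal object tables, so treat the answer
as a hint rather than a verdict.

```python
def looks_searchable(pdf_bytes: bytes) -> bool:
    """Crude guess at whether a PDF carries real text.

    Assumption: image-only scans declare no font resources, so the
    absence of the name '/Font' suggests a purely 'photographed' file.
    PDFs that store their objects in compressed streams can hide the
    name, so False means "probably not searchable", not "certainly".
    """
    return b"/Font" in pdf_bytes


def file_looks_searchable(path: str) -> bool:
    """Hypothetical helper: apply the same check to a file on disk."""
    with open(path, "rb") as handle:
        return looks_searchable(handle.read())
```

Calling file_looks_searchable on a downloaded PDF gives a quick first
impression before trying to search inside it.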
John Briggs