medieval-religion: Scholarly discussions of medieval religion and culture

John Wickstrom wrote:
>
> My understanding is that if documents are 'photographed' (which is
> what I think Optical Recognition software tends to do), then the
> texts cannot be searched. This seems to be the usual method for
> reproducing large PDF documents such as the Patrologia and Acta
> Sanctorum.

OCR (optical character recognition) software does more than 'photograph'
the page: it recognises the characters in the page image, which lets you
make a searchable version of that image. You can go further, of course, as
PDF is a typesetting format, and recreate the fonts and original appearance
of the document in completely searchable form. This is what was done with
the Bloomsday (16 June 1904) edition of the Dublin Evening Telegraph:

http://www.harenet.pwp.blueyonder.co.uk/splitpea/LastPink.pdf

The original idea had been to reprint the paper, but the surviving copies
were in too poor a condition.

John Briggs
