medieval-religion: Scholarly discussions of medieval religion and culture
From: John Briggs <[log in to unmask]>
> Christopher Crockett wrote:
>> note that these PDFs are in (what i call) the "new" PDF format --very
>> clean scans which are fully searchable/pasteable --i.e., OCRed PDF.
> If they are indeed "pasteable", then what you are seeing is not the original
> scan, but a re-creation (i.e. a re-typesetting from the OCRed scan) --along the
> lines of the Dublin Evening Telegraph to which I gave a reference
> recently. Which explains why they are "very clean", of course.
sorry, John, but i simply do NOT believe that the process here --or in the
Chartres Inventaires which i've mentioned twice before (see below)-- is one of
OCRing the original text (a difficult enough job, given the accuracy of the
evident results) and then "re-typesetting" these *whole* books to make them
appear just as the original did.
this idea is simply Absurd.
thousands and *thousands* of font enhancements would have to be made (by hand,
one at a time), and for *what* purpose??
nope, that idea is a Non-Starter, fella.
if one actually takes the trouble to look at (as opposed to just Pedantitating
Upon) an example of a book done by this "new" process (whateverthahell that
process might be), it is clear that
1) we are looking at the scan of the **original** book --spots on the pages
and manuscript notes are clearly visible: were these created via Photoshop and
inserted into the "re-typeset" book at the appropriate places??
and
2) "Re-typesetting" has *not* been done (Viday Supra, re enhancements)
anybody here believes *that* latter scenario, i got a nice Bridge in the Hills
of Southern Indianer you need to buy, immediately (and cheap).
here's the link to the volume "Série G, t. 1. Archives ecclésiastiques,
évêchés, chapitres, séminaires. [29.3 Mb]"
http://www.archinoe.net/cg28/ir_visu_instrument.php?id=195
download it and take a look.
then let me know your ball-park estimate of the number of hours it would take
to "re-typeset" it (never mind about recreating the ms additions, we'll just
blow those off).
my *guess* is that we're looking at some Process (probably developed by Adobe)
which scans books, OCRs them (with remarkable accuracy --not Perfect, but
pretty damned good) and creates PDFs of them, with the OCRed text embedded in
the PDF file, making it searchable and pasteable.
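a minimal sketch of what i'm guessing at, in Python. every name in it (OcrWord, ScannedPage, page_0001.png, the pixel coordinates) is made up for illustration, and it is only an assumption about how such a file could be organized, not a description of any actual Adobe product: the scanned image is kept exactly as it is, and the OCRed words ride along in a hidden layer that is used only for searching and pasting.

```python
from dataclasses import dataclass, field


@dataclass
class OcrWord:
    """One word as read by the OCR step (hypothetical structure)."""
    text: str  # what the OCR engine thinks the word is
    x: int     # left edge of the word on the scanned image, in pixels
    y: int     # top edge
    w: int     # width
    h: int     # height


@dataclass
class ScannedPage:
    """A page image plus a hidden text layer; the image itself is never altered."""
    image_file: str                            # the original scan, spots and ms notes included
    words: list = field(default_factory=list)  # hidden OCR layer, in reading order

    def plain_text(self):
        # What you would get by selecting the page and pasting it elsewhere.
        return " ".join(word.text for word in self.words)

    def search(self, term):
        # Case-insensitive search; returns the boxes a viewer would highlight.
        term = term.lower()
        return [(word.x, word.y, word.w, word.h)
                for word in self.words
                if term in word.text.lower()]


# Invented sample data, loosely modelled on the title of the volume linked above.
page = ScannedPage(
    image_file="page_0001.png",
    words=[
        OcrWord("Archives", 120, 80, 210, 40),
        OcrWord("ecclésiastiques", 340, 80, 380, 40),
        OcrWord("évêchés", 120, 130, 190, 40),
    ],
)

print(page.plain_text())
print(page.search("archives"))
```

if something like this is what is going on, it would account for both observations at once: the spots and manuscript notes survive because the picture of the page is untouched, and the text is searchable and pasteable because the viewer reads the hidden layer rather than the picture.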
i believe that a prototype of this Process is visible on books.google, where
you have the option of "View Plain Text," e.g.,
http://books.google.com/books?id=w-83AAAAMAAJ&pg=PA419&dq=antoine+lancelot&lr=&as_brr=0
what you get when you do that is an OCRed text, but one full of errors (the
number of errors depending upon the typography of the original book and the
clarity of the scan).
the "new" Process is much enhanced --astonishingly enhanced, actually.
do you know anything about this, Rochelle?
c