medieval-religion: Scholarly discussions of medieval religion and culture
From: rochelle altman <[log in to unmask]>
> At 04:26 PM 4/30/2008, Chris C wrote:
>>>(And it only works on typescript, not hand lettering.)
>>actually, a good OCR program *can* be "trained" to read manuscript,
>>providing only that the "lettering" is *consistent* [which most middlevil
>>scribal products are].
> Yes, you can train an OCR -- with Latin and Greek bookhands. But, no,
> forget most handwriting.
rite.
the operative word in my statement was definitely "consistent"
>Back in 1992-93 I wrote a program to OCR Insular... called it "Filbert"
>because everyone thought I was a nut to even try. Never got it beyond 75%
>accuracy and this handwriting problem has never been solved... though some
>members of IGS are still working on it. All you have to do is squish letters
>together or draw a line through the writing and it's unreadable by OCR no
>matter how much training.
yes, OCR working on the very principle of "consistency"
it's a No-Brainer and, as the noted Sage of the Sixties, R. Crumb, famously
said, "It's all just Lines on Paper, folks."
any deviation from the pattern which the softwhere has been written/trained to
recognize as a given letter and the dumb program has no option but to just
offer a dumb guess.
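
[a toy illustration of that "dumb guess", in Python. This is only a sketch under loud assumptions: the 5x5 bitmaps, the three-letter alphabet and the names TEMPLATES, distance and classify are all made up for the example, and no real OCR package is claimed to work this simply. It compares a glyph to each stored pattern and returns the nearest one, with no way of saying "i don't know".]

```python
# Toy template-matching "OCR". Everything here is hypothetical:
# each known letter is a fixed 5x5 bitmap, '#' = ink, '.' = paper.
TEMPLATES = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def distance(a, b):
    """Count the cells in which two equally sized bitmaps differ."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def classify(glyph):
    """Return (nearest letter, its distance). There is no 'unknown' answer:
    the nearest stored pattern wins, however far away it is."""
    best = min(TEMPLATES, key=lambda k: distance(glyph, TEMPLATES[k]))
    return best, distance(glyph, TEMPLATES[best])


clean_T = TEMPLATES["T"]
# The same T with a line drawn through the middle row.
struck_T = ["#####", "..#..", "#####", "..#..", "..#.."]
# A solid blot of ink that matches no letter at all.
blot = ["#####"] * 5

for name, glyph in [("clean T", clean_T), ("struck T", struck_T), ("blot", blot)]:
    letter, dist = classify(glyph)
    print(f"{name}: guessed {letter!r}, {dist} cells off")
```

[the clean T matches its pattern exactly, the struck-through T is already some cells off, and the blot still comes back as *some* letter, which is the whole problem.]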
OCR has come a long, long way since i first started using it on a dedicated,
$40,000 Kurzweil machine in the '80s, but i'm not sure whether the basic
algorithm has changed all that much --it's just gotten more sophisticated.
> To be honest, I don't think the problem will ever be solved.
not using essentially dumb, mechanistic algorithms, surely.
but, even fairly dumb human beinks can do it (or be trained to do it), so it
follows that if we could design programs which mimic those aspects of minimal
intelligence which enable us to decipher all these infinitely variable
squiggly Lines on Paper, then we'd be Home Free.
> Pdf is NOT downward compatible. The one major plus of pdf is it protects the
> text from added or changed wording that the author did not put there.
mmmm.... yes, the Original can't be changed, but notes can be added.
unless the document is "secured" to disable this feature.
> (Thank you MS for making so much downwards incompatible. Even their
> new Vista is not fully compatible with their own XP Word.)
well, one of the main things PDF has going for it, far as i can see, is the
fact that it was not developed by MS.
>>the training feature is there to enable the softwhere to recognize unusual
>>fonts or oddities like the [consistently] broken type which one sees
>>occasionally in older books, but basically any consistent "little dots on
>>paper" can be recognized, providing the user is possessed of near-Jobean
>>patience.
> Hmm, why not the patience of Griselda? :-)
because some people might not know who she is and confuse her with, say,
http://www.google.com/search?hl=en&q=Griselda
which would not be appropriate for a Family List.
c
**********************************************************************
To join the list, send the message: join medieval-religion YOUR NAME
to: [log in to unmask]
To send a message to the list, address it to:
[log in to unmask]
To leave the list, send the message: leave medieval-religion
to: [log in to unmask]
In order to report problems or to contact the list's owners, write to:
[log in to unmask]
For further information, visit our web site:
http://www.jiscmail.ac.uk/lists/medieval-religion.html