At 08:18 PM 3/8/2006, you wrote:
>Is there an easier way? Is there any program that I
>can simply load pdf files into and start analyzing
>them. I'm undertaking a project that looks at
>political magazines over the last 30 years and most
>everything before 1990 consists of scanned pdf files.
As you seem to know, there are two different kinds of PDF files.
Those scanned directly from the pages of magazines will be images.
There are programs that can extract the text from PDF text files
(those that are converted from a word processing file), but there is
no program that can directly extract the text from PDF image files.
There are, however, optical character recognition (OCR) programs that
can "read" PDF image files just as they would a scanned page of
printed text. The accuracy is still not 100%, but it is pretty good.
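
If you have a large pile of files, it may help to sort them into the two kinds before deciding which ones need OCR. What follows is only a rough sketch in Python, not a dependable tool. It assumes that a text PDF mentions a font somewhere in its raw bytes and that a scanned PDF mentions only image objects. The function names (guess_pdf_kind, guess_file_kind) are made up for illustration. The guess will be wrong for PDFs that store their object dictionaries in compressed form, and for scanned files that already carry a hidden OCR text layer.

```python
import re

# Byte patterns that usually appear uncompressed in older PDF files.
FONT_RE = re.compile(rb"/Font\b")               # font resource or font object
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")  # embedded raster image


def guess_pdf_kind(data: bytes) -> str:
    """Guess 'text', 'image', or 'unknown' from raw PDF bytes (heuristic only)."""
    if not data.startswith(b"%PDF-"):
        return "unknown"  # does not look like a PDF at all
    if FONT_RE.search(data):
        return "text"     # a font is referenced, so text can probably be extracted
    if IMAGE_RE.search(data):
        return "image"    # only images found, so the file probably needs OCR
    return "unknown"


def guess_file_kind(path: str) -> str:
    """Read a file from disk and apply the same heuristic."""
    with open(path, "rb") as handle:
        return guess_pdf_kind(handle.read())
```

Calling guess_file_kind on each file gives a first-pass sorting. Anything reported as "image" or "unknown" is a candidate for an OCR program, and a few files from each group should be checked by hand.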
I have an early optical scanner program that never worked very well
for anything other than a simple page of text, but the company that
made it still keeps sending emails to me with upgrade offers. They
are now called, I think, Nuance, or something like that.
Good luck.
Elliot Richmond, Ph.D.
Adjunct Professor of Astronomy
Austin Community College
Austin, Texas