I've been following the discussions on scanning methodologies with great
interest. There was a conference sponsored by the National Preservation
Office and the Research Libraries Group on this very topic. (Papers are
available at http://www.rlg.org/preserv/joint/) During this conference I
heard the phrase "fit for purpose" for the first time (from Jane Williams),
and I thought that it nicely encapsulated volumes of advice about how to
scan historic materials. Jane's adage reminded me that there is
absolutely nothing wrong with scanning at settings lower than the best that
technology can produce provided that one is very sure that the full range
of long-term requirements will be met with the specification (for bit
depth, resolution, etc.) of choice. As other contributors have pointed
out, scanning for the web (i.e. to present legible images on screen)
requires a lower level of quality than scanning for high-quality print or
for OCR.
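
To make the trade-off between specifications concrete, here is a rough
sketch of how uncompressed file size grows with resolution and bit depth.
The function name, the 8.5 x 11 inch page, and the three example
specifications are my own assumptions for illustration, not figures from
any project; real TIFFs will usually be smaller once compressed.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed raster size in bytes for one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Assume each row is padded up to a whole number of bytes.
    bytes_per_row = (width_px * bits_per_pixel + 7) // 8
    return bytes_per_row * height_px


# Hypothetical specifications for an 8.5 x 11 inch page (assumed size).
specs = [
    ("72 dpi, 24-bit (screen display)", 72, 24),
    ("300 dpi, 24-bit (print, OCR)", 300, 24),
    ("600 dpi, 1-bit (bitonal text)", 600, 1),
]

for label, dpi, bits in specs:
    size_mb = uncompressed_bytes(8.5, 11, dpi, bits) / 1_000_000
    print(f"{label}: about {size_mb:.1f} MB uncompressed")
```

Running the same function over the page count of a whole collection gives
a first estimate of storage needs for each candidate specification.
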
Before moving away from this topic, I wanted to add the consideration of
document handling, as that decision can (and will) dictate the options that
will actually be available to a manager defining project requirements.
Since this thread originated with a question about book page scanners, it
is worth pointing out that scanning requirements that are fit for purpose
-- e.g., 600 dpi 1-bit TIFFs to meet requirements ranging from on-screen
images to OCR -- are sometimes in direct conflict with handling
requirements for source materials. What it boils down to is that if you
must retain bindings when scanning books -- disbinding and reassembling is
another option -- the options for resolution, image enhancements, and other
production "requirements" will decrease and your costs will increase.
Experience at a number of institutions has shown that if,
particularly for your first or second project, you select materials that
can be turned from books into "book pages," it will be much easier to
produce high-quality TIFFs (where quality is fit for purpose) at a
reasonable price.
Steve Chapman
At 09:10 AM 1/10/00 -0700, you wrote:
>
>
>I would agree with what both Mark and Bill have mentioned in terms of
>archival images and future usefulness. However, the primary question (in my
>mind) is what is useful now. Scanning at higher than screen resolution now
>improves the return on investment by extending the useful life of the
>files: a "simple jpeg" now (bear in mind that file format and resolution
>have no correspondence; one can have a 300 dpi JPEG for print or a 72 dpi
>TIFF for display) may not be sufficient in 18 months' time, whereas it is
>conceivable that a scan at a somewhat higher resolution may have utility
>for 3-4 years.
>
>Another consideration is the utility of the files in contexts other than
>display. OCR, for instance, can utilize the higher resolution to return
>better results -- my experiences with OCR and 72 dpi for display images have
>been rather poor whereas the higher (300 dpi) resolution images OCR much
>better.
>
>A final consideration is the actual effort of scanning. While larger file
>sizes do require additional storage space and raise handling issues, much
>of the actual effort involved in scanning is the initial image acquisition.
>Using the highest usable resolution available in a non-lossy format allows
>going back and converting the image for other purposes as things
>evolve. A corollary to this is that many of these projects may be one-time
>events from a funding standpoint and that using the equipment and resources
>to their maximum advantage now is merely being judicious in the use of
>resources.
>
>Tim
>
>--------
>Tim Au Yeung
>Manager of Digitization Initiatives
>Information Resources (Press)
>University of Calgary
>voice: 403.220.8975
>email: ytau (at) ucalgary.ca
>
>
>----- Original Message -----
>From: Mark Conrad <[log in to unmask]>
>To: <[log in to unmask]>; <[log in to unmask]>
>Sent: Monday, January 10, 2000 7:19 AM
>Subject: Re: Greetings, and question on Scanned Images
>
>
>>
>>
>> Just to amplify a little bit on what Bill has said... The problem is more
>basic than scanning at higher resolutions at a later date. There is no such
>thing as an archival digital image. If images are to be kept for more than a
>few years they will have to be migrated to new hardware and software as the
>technology rapidly turns over or some other way of providing access to the
>then-obsolete formats will have to be secured. The technology to read a
>particular TIFF file may be around a little longer than most because of the
>widespread use of TIFF, but it is highly unlikely that the technology to
>read TIFF 4 or 6 images will be commonly available in 5-10 years.
>>
>> One possible way around this problem is to film the materials at the same
>time that you scan them. Assuming the film is shot to produce an image with
>a qi of 8, processed to archival specifications and stored under proper
>environmental conditions, it will be around for 100 or more years. The film
>could then be re-scanned at a later date to take advantage of the current
>digital technology.
>>
>> Long-term preservation of and access to software-dependent objects is the
>topic of a great deal of research right now. See for example:
>>
>> www.sdsc.edu/nara
>>
>> www.interpares.org
>>
>> info.wgbh.org/upf/
>>
>> Just my two cents from an archival point of view. These views are my own,
>not necessarily my employer's.
>>
>> Mark Conrad
>> Director for Technology Initiatives
>> National Historical Publications and Records Commission (NHPRC)
>> National Archives and Records Administration
>> Room 111
>> 700 Pennsylvania Avenue, NW,
>> Washington, DC 20408-0001
>> phone: 202-501-5600 ext. 233
>> fax: 202-501-5601
>> e-mail: [log in to unmask]
>>
>>
>> >>> Bill Barrow <[log in to unmask]> 01/10/00 03:08AM >>>
>>
>> My concern here is that we are making "archival" TIF files at, say, 300
>> dpi, when significantly more information than that is available on the
>> original photographic print. I expect that all these digital archival
>> files will need to be re-scanned someday soon, when hundred-gig files
>> aren't a big deal to create, transport, store and use. I can wait, but are
>> the folks investing big sums of money creating these archival files today
>> really aware that their work is only temporary? Are we encouraging
>> expensive scanning projects when, for now, perhaps only a simple jpeg is
>> necessary?
>>
>>
>> >>At the Huntington Archive, where we do almost exclusively images rather
>> >>than text documents, we have found the file size a particularly difficult
>> >>problem, which we are still discussing. Archival storage of images has,
>> >>for us, been at the maximum "usable" resolution, and not the maximum
>> >>possible resolution. With scanners that go up to 14,000 dpi and true 5,600
>> >>dpi, hundred-GB files are possible, but who wants all that data from a
>> >>single sheet of paper or film?
>>
>> >If you have a large object and want high-quality reproduction - especially
>> >of details - you have to go beyond 300 dpi sometimes - and certainly file
>> >sizes grow with object size...
>>
>> Bill Barrow
>>
>>
>>
>> * * * * * * * * * * * * * * * * * * * * *
>> WILLIAM C. BARROW
>> 13537 Cedar Road
>> University Heights, OH 44118
>> (216) 397-8327
>> [log in to unmask]
>> http://www.csuohio.edu/CUT/wcb2.htm
>> - - - - - - - - - - - - - - - - - - - - -
>> Special Collections Librarian
>> Cleveland State University Library
>> 1860 East 22nd St., #201
>> Cleveland, OH 44114-4435
>> (216) 687-6998 - office
>> (216) 687-9380 - fax
>> - - - - - - - - - - - - - - - - - - - - -
>> THE CLEVELAND DIGITAL LIBRARY
>> http://web.ulib.csuohio.edu/SpecColl/cdl/
>> * * * * * * * * * * * * * * * * * * * * *
>>
>>
>>
>
Stephen Chapman
Preservation Librarian for Digital Initiatives
Preservation Center, Harvard University Library
Holyoke Center 821
Cambridge, MA 02138
Phone... 617-495-8596
Fax....... 617-496-8344
E-mail.. [log in to unmask]