At the BFI we're looking for recommendations for suppliers of paper document scanning and OCR services, for one quite small and one larger project. Can anyone recommend a provider they have used?
The smaller: to scan all pages of an A4 book (for which the BFI holds the rights); the pages are consistently formatted and well printed. Deliverables would probably be a TIFF and a PDF per page, with filenames following a pattern (ideally based on the OCR output of the page content). We would also like OCR applied, ideally using page zoning or format-condition rules, so that it generates structured data output to some extent.
The larger: to scan a substantial set of documents, organised by folder, delivering probably a TIFF and a PDF per page, with filenames probably following a containing-folder / page-sequence model (e.g. Foldername_document1.pdf, Foldername_document2.pdf, etc.). No OCR output is required.
If you have any positive experience of suppliers, in particular with any structured metadata creation from page formatting / structure, please let us know!
All the best
Stephen, BFI
****************************************************************
website: http://museumscomputergroup.org.uk/
Twitter: http://www.twitter.com/ukmcg
Facebook: http://www.facebook.com/museumscomputergroup
[un]subscribe: http://museumscomputergroup.org.uk/email-list/
****************************************************************