Thanks Ed. That looks interesting, I might come along if I can find some spare holiday.
I see that the IMPACT web site mentions "A full web-based collaborative correction system: this web-based platform, suitable for massive volunteer participation, validates and corrects OCR results. In this way, it enables the general public to help with large scale digitisation efforts" as one of the tools it is developing. Do you know when this is likely to be available, and where from?
Trevor Reynolds
Collections Registrar, English Heritage
tel: +44 (0) 1904 601905. 37 Tanner Row, York, YO1 6WP
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of Ed I Bremner
Sent: 10 August 2011 10:24
To: REYNOLDS, Trevor
Subject: Re: Software for digitising magazines & IMPACT Conference
Dear All,
MCG members interested in the cutting edge of OCR and the digitisation of historic text (including magazines) may well want to consider coming to the IMPACT Conference at the British Library on 24-25 October 2011.
This event will showcase the results from the IMPACT project and launch the IMPACT Centre of Competence.
IMPACT is a European project that has been developing new tools to improve the mass digitisation and OCR of historic text -
See: http://www.impact-project.eu/
Details of the conference are below, with a full programme at:
http://www.impact-project.eu/news/ic2011/conference-programme/
*********************************************************
With this email we would like to invite you to the final conference of the IMPACT project, "Digitisation & OCR: Better, faster, cheaper. Solutions of the IMPACT Centre of Competence and future challenges" that will take place on 24-25 October 2011 at the British Library in London. At this conference IMPACT will present the final project results, along with related research in the field of OCR and language technology.
This event will also mark the official launch of the IMPACT Centre of Competence. This Centre is focused on making digitisation of historical printed text in Europe better, faster, cheaper by sharing expertise and providing access to tools for all parts of the digitisation workflow, as well as tools, services and facilities for further advancement of the State of the Art in this field.
The programme for the conference is now online on the conference webpage, highlights include:
- Khalil Rouhana (European Commission - Director for digital content and cognitive systems in DG Information Society and Media): "The EC Digital Agenda and official launch of the IMPACT Centre of Competence"
- Michael Fuchs (ABBYY Europe): "ABBYY FineReader: IMPACT improvements"
- Paul Fogel (California Digital Library): "Experiences in mass digitisation: examining OCR quality"
- Clemens Neudecker (National Library of the Netherlands): "The IMPACT Framework and what you can do with it"
- Asaf Tzadok (IBM Haifa Research Lab): "IBM Adaptive OCR engine and CONCERT Cooperative Correction"
- Majlis Bremer-Laamanen (National Library of Finland): "Crowdsourcing for OCR correction: Experiences with Digitalkoot"
- Katrien Depuydt (INL) and Klaus Schulz (University of Munich): "Language work in IMPACT"
- Stephen Krauwer (CLARIN coordinator, University of Utrecht): "Related language work in CLARIN"
- Parallel sessions on State of the art research tools for document analysis and OCR, IMPACT language tools & resources and Digitisation tips (Meet the expert).
More programme updates will be announced through http://www.impact-project.eu/news/ic2011/conference-programme/ and Twitter (hashtag: #impactconf2011). Registration is now possible at the regular fee of 120 GBP. To register, please go to this BL ticket website and click October. More information is also available from the attached flyer.
*********************************************************************
Best Wishes
Ed Bremner - IMPACT Project
UKOLN
[log in to unmask]
SKYPE: ed.bremner
******************************
Ed I Bremner
Consultant and Trainer in Digital Media
BremWeb Imaging
www.bremweb.co.uk
[log in to unmask]
07973 335509
******************************
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of Adam Waterton
Sent: 10 August 2011 09:41
To: [log in to unmask]
Subject: Re: Software for digitising magazines
Hi Trevor,
We recently undertook a project to digitise and create machine-readable versions of a series of Royal Academy of Arts exhibition catalogues (1870-1913). We tried a few OCR packages and also found that ABBYY FineReader (http://finereader.abbyy.com/) gave good results. However, the resulting text files were still very inaccurate and required an enormous amount of manual tidying up to make them accurate enough for consistent searching. Also, ABBYY is not cheap and the costs will mount up if you need a separate ABBYY licence for each of your volunteers.
The results of our digitisation project can be seen here:
http://www.racollection.org.uk/ixbin/indexplus?_IXACTION_=file&_IXFILE_=templates/pages/exhibition_list.html
Regards,
Adam.
Adam Waterton
Head of Library Services
Royal Academy of Arts
Burlington House
Piccadilly
London
W1V 0DS
T: 020 7300 5740 | F: 020 7300 5765 | E: [log in to unmask]
The Royal Academy of Arts Collection Online: www.racollection.org.uk
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of Howell, Alan
Sent: 09 August 2011 10:09
To: [log in to unmask]
Subject: Re: Software for digitising magazines
Hi Trevor
I have used ABBYY FineReader for some projects at home and found it to be very effective at this sort of thing.
Kind regards
Alan Howell
Guernsey Museums & Galleries
SSDDI +44 (0) 1481 709736
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of REYNOLDS, Trevor
Sent: 06 August 2011 09:49
To: [log in to unmask]
Subject: Software for digitising magazines
Dear all
A volunteer-run charity I'm involved with wants to digitise the back issues of its periodicals.
What they want to end up with is PDF/A format documents with a scanned image of each page with searchable text underneath the image. Many of the early issues have poor quality text and any OCRed text will probably need heavy editing.
Can you recommend software which will enable this to be done? They are intending to split the work between a number of volunteers who will be working at home on their own computers so low cost, easy to use solutions would be welcome!
Trevor Reynolds
Collections Registrar, English Heritage
37 Tanner Row, York, YO1 6WP tel: 01904 601905
Portico: your gateway to information on sites in the National Heritage Collection; have a look and tell us what you think.
http://www.english-heritage.org.uk/professional/archives-and-collections/portico/
****************************************************************
website: http://museumscomputergroup.org.uk/
Twitter: http://www.twitter.com/ukmcg
Facebook: http://www.facebook.com/museumscomputergroup
[un]subscribe: http://museumscomputergroup.org.uk/email-list/
****************************************************************
The Royal Academy of Arts is a registered charity under Registered Charity Number 1125383 and is also registered as a company limited by guarantee in England and Wales under Company Number 6298947. Registered office:
Burlington House, Piccadilly, London, W1J 0BD.