Ditto, though we just have to see what the quality is like. Extracting
text from scans via OCR may produce output that's useful for feeding
into a search engine but pretty much unreadable to humans. Even that
should be freed up, I agree, but we can't assume that just because the
"digitised text" exists, there are millions of articles in raw text
ready to be reused as-is.
Tidying them up sounds like a job for crowd-sourcing. Richard mentions
Project Gutenberg, and doubtless Frankie will have other tips on
crowd-sourcing. A big job whichever way you look at it!
Cheers, Jeremy
Jeremy Ottevanger
Web Developer, Museum Systems Team
Museum of London
46 Eagle Wharf Road
London. N1 7ED
Tel: 020 7410 2207
Fax: 020 7600 1058
Email: [log in to unmask]
www.museumoflondon.org.uk
Spectacular new £20 million Galleries of Modern London opening at Museum of London in spring 2010.
Find out more at www.museumoflondon.org.uk
Before printing, please think about the environment
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of
Andy Powell
Sent: 18 June 2009 12:27
To: [log in to unmask]
Subject: Re: [MCG] BL Newspapers and open content
Well, since you asked... :-)
I very strongly agree with Richard that opening up the underlying
content/data should always be seen as a high priority and have made
similar points to the BL previously, e.g.
http://efoundations.typepad.com/efoundations/2008/03/hiding-magna-ca.html
Andy
________________________________
Andy Powell
Research Programme Director
Eduserv
[log in to unmask]
01225 474319 / 07989 476710
www.eduserv.org.uk
efoundations.typepad.com
twitter.com/andypowe11
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of
Alastair Dunning
Sent: 18 June 2009 12:22
To: [log in to unmask]
Subject: BL Newspapers and open content
...
If this is a general feeling amongst the MCG that this open data is a
key part of making such content accessible, I'm happy to take these
comments back to the BL's project board for newspapers. And as paying
customers (another interesting issue) it's the kind of thing you might
want to let the BL know about directly.
...
****************************************************************
For mcg information visit the mcg website at
http://www.museumscomputergroup.org.uk.
To manage your subscription to this email list visit
http://www.museumscomputergroup.org.uk/email.shtml
****************************************************************