Paul Hollands wrote:
> Hi Andy,
>
> The use of hyphens in the ISBN and ISSN examples may cause some
> confusion. My understanding is that ISBNs/ISSNs are agnostic about the
> hyphens and they are included only for reasons of legibility when
> printed on book spines etc. I don't believe MARC cataloguers are
> required to use them (could be wrong). Maybe a note about the ISBNs and
> ISSNs being valid with or without them might be useful?? Or am I totally
> mistaken?
I don't think(?) you are mistaken - I've been doing some investigation
of this, as we have a few hundred ISBNs in our database that are encoded
without hyphenation.
My first thought was that we could correct this with a regex or SQL
substring statement, but the problem is that the middle two parts of
the ISBN (n - XXXX - XXXX - n/X), publisher and book, are of variable
length, so this cannot be automated in this way. This link explains
the format clearly:
http://www.isbn.org/standards/home/isbn/international/html/usm4.htm
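To show what goes wrong, here is a minimal sketch in Python. The
function name naive_hyphenate and the fixed 1-3-5-1 split are my own
assumptions for illustration, not anything taken from the page above.
The two ISBNs are well-known examples whose publisher parts differ in
length ("306" versus "19").

```python
import re


def naive_hyphenate(isbn10: str) -> str:
    """Hyphenate an unhyphenated ISBN-10 at fixed offsets (1-3-5-1).

    Hypothetical helper: it assumes a 3-digit publisher and a 5-digit
    book number, which is only true for some ISBNs.
    """
    match = re.fullmatch(r"(\d)(\d{3})(\d{5})([\dXx])", isbn10)
    if match is None:
        raise ValueError(f"not a 10-character ISBN: {isbn10!r}")
    return "-".join(match.groups())


# Publisher part really is 3 digits here, so the result is right.
print(naive_hyphenate("0306406152"))  # 0-306-40615-2

# Publisher part is 2 digits here (0-19-852663-6), so the fixed
# split puts the hyphen in the wrong place.
print(naive_hyphenate("0198526636"))  # 0-198-52663-6
```

Nothing in the pattern can tell the two cases apart, which is why a
plain substring approach does not work.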
The ISBNs ended up in our database in the first place without
hyphenation because some were retrieved from Amazon's SOAP service, and
Amazon returns ISBNs sans hyphenation.
My second thought was some form of lookup service to library
databases, but MARC records do not appear to contain hyphenation (as
Paul suggests), so this is not a goer either (I've checked a few library
catalogues and there is no hyphenation).
My third thought was the checksum information. This is explained clearly
here:
http://www.morovia.com/education/utility/upc-ean.asp (scroll to bottom)
However, whilst I am not of a mathematical bent, there doesn't seem to
be any way to reverse the process, because if you look at the weighting
table it makes no particular distinction regarding the publisher/book
split.
Maybe somebody is cleverer than me here?
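For what it's worth, here is a rough sketch of the check digit
calculation in Python, assuming the usual ISBN-10 scheme (weights 10
down to 2 on the first nine digits, modulo 11, with 10 written as X).
The function name isbn10_check_digit is just made up for the example.
It drops the hyphens before doing any arithmetic, so the same digits
give the same check digit wherever the hyphens sit.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Return the ISBN-10 check digit for the first nine digits.

    Hyphens and spaces are ignored, so the position of the
    publisher/book split has no effect on the result.
    """
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    # Weights run 10, 9, ..., 2 across the nine digits.
    total = sum(w * d for w, d in zip(range(10, 1, -1), digits))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


print(isbn10_check_digit("0-306-40615"))  # 2
print(isbn10_check_digit("030640615"))    # 2 (no hyphens)
print(isbn10_check_digit("03-0640-615"))  # 2 (hyphens in the wrong place)
print(isbn10_check_digit("0-8044-2957"))  # X
```

So the checksum can confirm that the digits are valid, but it carries
no trace of where the hyphens were.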
I'm not seeing any way to recover the information (i.e.
hyphenation/spaces) in a programmatic way, nor even anywhere on the web
where I can look up our books and get hyphenated information returned
(anybody know of anywhere?)
A trip to the library and taking the books off the shelves is the only
solution I see at the moment.
Cheers
Nik
>
> Cheers.
>
> Andy Powell wrote:
>
>> I don't think I've formally announced the document at
>>
>> http://www.ukoln.ac.uk/metadata/dcmi-ieee/identifiers/
>>
>> though it did get briefly mentioned in a previous thread about
>> identifiers
>> on this list. Anyway, I've just been going over the document to tidy up
>> one or two loose ends so it seems appropriate to mention it again :-)
>>
>> It provides guidelines for encoding a number of commonly used
>> identifiers
>> in DC metadata and IEEE LOM records and recommends always encoding
>> identifiers as URIs.
>>
>> As always, comments, corrections, etc. are very welcome.
>>
>> Andy
>> --
>> Distributed Systems, UKOLN, University of Bath, Bath, BA2 7AY, UK
>> http://www.ukoln.ac.uk/ukoln/staff/a.powell/ +44 1225 383933
>> Resource Discovery Network http://www.rdn.ac.uk/
>> ECDL 2004, Bath, UK - 12-17 Sept 2004 - http://www.ecdl2004.org/
>>
>>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Paul Hollands <[log in to unmask]>
> LTSN-01 Information and Web Support Officer
> University of Newcastle, 16/17 Framlington Place
> Newcastle upon Tyne, NE2 4AB
> 0191 222 5888
> <http://www.ltsn-01.ac.uk/>