Hi Everyone,
The treatment of diacriticals is also a big problem in bibliographies, where you don't really have the option of replacing them with Anglicised terms - but you can include a translation, possibly in [square brackets]. Programs such as EndNote will store the diacriticals, but will also return the words if you search without them: e.g. Törnquist, S.L. will be returned if you search for Törnquist or Tornquist. I would recommend configuring your search to work in this manner.
However, technically Törnquist is transliterated to Toernquist - and this wouldn't be picked up.
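To make the Törnquist / Tornquist / Toernquist point concrete, here is a minimal Python sketch of how a search could generate all three forms as lookup keys. It is only an illustration, not how EndNote or any particular product works: the function names and the DIGRAPHS table are my own assumptions, and the table only covers the German-style ae/oe/ue convention.

```python
import unicodedata

# Hypothetical table for the "Toernquist"-style transliteration.
# It only covers the German-style digraph convention (an assumption).
DIGRAPHS = {"ä": "ae", "ö": "oe", "ü": "ue",
            "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}


def fold_diacritics(text):
    """Strip combining marks: 'Törnquist' becomes 'Tornquist'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate(text):
    """Apply the digraph table: 'Törnquist' becomes 'Toernquist'."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(DIGRAPHS.get(ch, ch) for ch in composed)


def search_keys(name):
    """All lower-cased forms under which a name should be findable."""
    return {
        unicodedata.normalize("NFC", name).casefold(),
        fold_diacritics(name).casefold(),
        transliterate(name).casefold(),
    }


def matches(query, stored_name):
    """True if any form of the query equals any form of the stored name."""
    return bool(search_keys(query) & search_keys(stored_name))
```

With this, matches("Toernquist", "Törnquist") and matches("Tornquist", "Törnquist") both succeed, while the stored name keeps its original spelling. Note that stripping combining marks does nothing for letters such as ø or æ, which have no accent to remove, so those would need their own table entries.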
It then gets even more complicated - Norwegian, for example, uses ø instead of ö and æ instead of ä, and treats these as extra letters following z. It also has a third additional letter, å. Just when you've got used to finding these at the end of the dictionary, you could find a Danish double Aa, which is the same as å and also comes at the end of the dictionary.
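The ordering described above can be sketched with a custom sort key. This is a rough Python illustration under stated assumptions: the alphabet string and the sort_key function are hypothetical, and treating every "aa" as å is a deliberate oversimplification (not every double a in a real name is an å). A production system would more likely rely on proper locale collation.

```python
# Assumed alphabet order: a-z followed by the three extra letters æ, ø, å.
ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"


def sort_key(word):
    """Sort key placing æ, ø and å (and the digraph 'aa') after z."""
    # Simplification: every 'aa' is treated as the letter å.
    normalised = word.casefold().replace("aa", "å")
    key = []
    for ch in normalised:
        rank = ALPHABET.find(ch)
        if rank < 0:
            # Characters outside the alphabet (spaces, digits, other
            # accents) are simply placed after å in this sketch.
            rank = len(ALPHABET)
        key.append((rank, ch))
    return key


names = ["Ørsted", "Zebra", "Aalborg", "Ærø", "Apple", "Århus"]
ordered = sorted(names, key=sort_key)
```

Here ordered comes out as Apple, Zebra, Ærø, Ørsted, Aalborg, Århus, so Aalborg files with å at the end rather than at the very start of the list.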
My advice: Enter the object name exactly as it is spelt in its original language, add an English translation and transliteration in [square brackets], and configure your search engine to ignore diacriticals (if you can).
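As a rough sketch of that advice in Python: the record below keeps the original spelling, shows the translation and transliteration in square brackets, and is searched with diacriticals ignored. To address James's worry about losing precision, exact (accented) matches are listed first. The class, field names and sample records are all hypothetical, not taken from any real collections system.

```python
import unicodedata
from dataclasses import dataclass


def fold(text):
    """Lower-case the text and strip combining marks ('café' -> 'cafe')."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


@dataclass
class ObjectRecord:
    original_name: str          # exactly as spelt in the original language
    translation: str = ""       # optional English translation
    transliteration: str = ""   # optional transliteration

    def display_name(self):
        """Original name with the extras in [square brackets]."""
        extras = [x for x in (self.translation, self.transliteration) if x]
        if not extras:
            return self.original_name
        return f"{self.original_name} [{'; '.join(extras)}]"

    def search_terms(self):
        fields = (self.original_name, self.translation, self.transliteration)
        return [t for t in fields if t]


def search(records, query):
    """Return exact matches first, then matches that ignore diacriticals."""
    q_exact = unicodedata.normalize("NFC", query).casefold()
    q_fold = fold(query)
    exact, folded = [], []
    for record in records:
        terms = record.search_terms()
        if any(q_exact in unicodedata.normalize("NFC", t).casefold()
               for t in terms):
            exact.append(record)
        elif any(q_fold in fold(t) for t in terms):
            folded.append(record)
    return exact + folded


# Hypothetical sample records, for illustration only.
records = [ObjectRecord("cafe menu"), ObjectRecord("Café de Paris poster")]
results = search(records, "café")
```

A search for café returns both records with the accented one first, and a search for cafe returns both with the unaccented one first, so nothing is hidden but the precise form still ranks higher.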
Good luck!
Mike
Dr Mike Howe
Chief Curator
Head of the National Geological Repository
Phone: 0115 9363105 Email: [log in to unmask]
Web: http://www.bgs.ac.uk/staff/profiles/3858.html
WSB UGN - British Geological Survey
Keyworth, Nottingham, NG12 5GG
-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of Robin Patel
Sent: 26 July 2016 12:28
To: [log in to unmask]
Subject: Re: [MCG] Accented and special characters in collections search
Hi James,
My knowledge is somewhat simple in this field, but would it not be easier to search for all object names that use special characters and replace them with Anglicised terms? Perhaps the 'correct' term could be stored under a 'related' terms field? This is similar to using equivalent names for objects in different languages, e.g. the Gaelic name for an object.
Am I correct in assuming that, from a usability point of view, it's highly unlikely that people would search using special characters? Knowing how to input special characters when typing is a challenge in itself!
Robin
--
Robin Patel
Ergadia Museums & Heritage
t: 01786 860 691
m: 07815 312 562
[log in to unmask]
https://ergadiaheritage.com/
On 26 July 2016 at 10:07, James Morley <[log in to unmask]> wrote:
> Hi all
>
> We were pondering an issue last night with accented and special
> characters in collections search, and wondered if anyone had examples of best practice?
>
> Currently at IWM we treat them uniquely, so a search for cafe gives
> you 361 results, and a search for café 200 results. There's only an
> overlap of about ten results which have both variants, so about 550
> combined. Even more pronounced is aéroplanes (1 result) and aeroplanes (4900 results).
>
> We're thinking of indexing against both accented and non-accented
> forms, to ensure something with café also gets indexed for cafe - in
> other words merging the results. My one concern then is that the user
> loses granularity and there could be specific examples where quite a
> precise term gets lost in something more generic (though I can't think
> of a specific example right now). From a technology point of view
> it's all based on Solr, so a thought was to somehow push up relevancy
> ranking for the accented/special character matches.
>
> It's interesting to look at search stats and see that people are quite
> extensively using accents and special characters, especially for
> people and place names (and a few for aeroplanes, who must have been
> quite disappointed!). Also, because of the different collections areas
> and historic cataloguing, we seem to have a mix of accurate and 'Anglicised'
> names in our collections data!
>
> Cheers
>
> James
>
>
> James Morley
> Data Developer
>
> Imperial War Museums
> Lambeth Road
> London SE1 6HZ
>
> [log in to unmask]
> 07713 360563
> iwm.org.uk
> @jamesinealing
>
>
> ****************************************************************
> website: http://museumscomputergroup.org.uk/
> Twitter: http://www.twitter.com/ukmcg
> Facebook: http://www.facebook.com/museumscomputergroup
> [un]subscribe: http://museumscomputergroup.org.uk/email-list/
> ****************************************************************
>
________________________________
This message (and any attachments) is for the recipient only. NERC is subject to the Freedom of Information Act 2000 and the contents of this email and any reply you make may be disclosed by NERC unless it is exempt from release under the Act. Any material supplied to NERC may be stored in an electronic records management system.
________________________________