Hi all
We were pondering an issue last night with accented and special characters in collections search, and wondered if anyone had examples of best practice?
Currently at IWM we treat accented and unaccented characters as distinct, so a search for cafe gives you 361 results and a search for café gives 200. Only about ten results contain both variants, so that's about 550 combined. Even more pronounced is aéroplanes (1 result) versus aeroplanes (4900 results).
We're thinking of indexing against both accented and non-accented forms, so that something with café also gets indexed for cafe. In other words, merging the results. My one concern is that the user then loses granularity, and there could be cases where quite a precise term gets lost in something more generic (though I can't think of a specific example right now). From a technology point of view it's all based on Solr, so one thought was to boost the relevancy ranking of results that match the accented or special characters exactly as typed.
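To make the idea concrete, here is a rough sketch in plain Python of "match on the folded form, but rank exact matches first". It is only an illustration of the logic, not our Solr setup: the function names and sample records are made up, and it matches on whole whitespace-separated words only. In Solr itself I'd assume the equivalent would be one field analysed with something like ASCIIFoldingFilterFactory alongside an unfolded copy that gets a higher boost, but that is an assumption rather than anything we have tested.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Lower-case the text and strip combining marks (e.g. 'Café' -> 'cafe')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def exact_form(text: str) -> str:
    """Lower-case the text but keep accents, normalised so 'é' always compares equal."""
    return unicodedata.normalize("NFC", text).lower()


def search(query: str, records: list[str]) -> list[str]:
    """Return records matching the folded query, exact-accent matches first.

    Hypothetical helper: 'records' stands in for catalogue titles.
    """
    folded_query = fold_accents(query)
    exact_query = exact_form(query)
    hits = []
    for record in records:
        words = record.split()
        if folded_query not in [fold_accents(w) for w in words]:
            continue  # no match even after folding
        # Score 2 if the record contains the query exactly as typed, else 1.
        score = 2 if exact_query in [exact_form(w) for w in words] else 1
        hits.append((score, record))
    # sorted() is stable, so records with equal scores keep their original order.
    return [record for score, record in sorted(hits, key=lambda h: -h[0])]


# Made-up sample data for illustration only.
sample_records = [
    "Soldiers outside a cafe",
    "Café in Paris",
    "Aéroplanes over France",
]
```

With this, search("café", sample_records) and search("cafe", sample_records) both return the two cafe records, but each puts the record that matches the typed spelling first, so the merged results don't entirely lose the distinction.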
It's interesting to look at search stats and see that people are quite extensively using accents and special characters, especially for people and place names (and a few for aeroplanes, who must have been quite disappointed!). Also, because of the different collections areas and historic cataloguing, we seem to have a mix of accurate and 'Anglicised' names in our collections data!
Cheers
James
James Morley
Data Developer
Imperial War Museums
Lambeth Road
London SE1 6HZ
07713 360563
iwm.org.uk
@jamesinealing