JISCMail - MCG Archives

Subject: Re: How about a Museums-only search engine?
From: "Birchall, Danny" <[log in to unmask]>
Reply-To: Museums Computer Group <[log in to unmask]>
Date: Tue, 6 Sep 2011 09:15:54 +0100


>> The end-user doesn't need any of those museum computers to be talking to each other, as long as there's a search engine button or widget that can be launched from any of the sites that gives access to pooled results for all of them. 

But why on earth would I, as an end-user, want results that only come from museums, rather than results for what I'm actually looking for from museums, blogs, photo pools, and other places on the web? The only people I know who are generically interested in what 'museums' have to offer are people who work for museums.




-----Original Message-----
From: Museums Computer Group [mailto:[log in to unmask]] On Behalf Of Eric Baird
Sent: 05 September 2011 19:33
To: [log in to unmask]
Subject: [MCG] How about a Museums-only search engine?

I've been pondering the current push for standardised terminology in Museum classification so that different Museums' IT systems can be interfaced, and I'm wondering how much of this is really thought through ... Our vision for the "future" of Museum IT seems to be based on the needs of ecommerce systems, or on how information technology was taught in the 1970s.

Museums don't usually need to be able to bulk-exchange data with each other, because they're not wholesaling or retailing each other's exhibits, or buying and selling in bulk from each other and expecting their inventory to be automatically updated. They're not Amazon or iTunes. The reason for making information available online is usually for the benefit of end-users, and those end-users are primarily interested in finding out about exhibits and finding other similar information. They don't need to add or remove exhibits from the system, or transfer entries to a different museum's site. If they're not moving entries between systems, then deep data-compatibility isn't really a requirement. If your museum is going to be taken over or merged with another, then it's handy if the organisation that takes over your collection can integrate your database with theirs, but otherwise, it's a bit difficult to see the immediate payoff.


If what we want is search and discovery, then structured XML databases become less critical, and the specialist dedicated tools for the job are ... search engines. Good search engines look for patterns in the data and find their own sets of associated keywords and cross-references without needing webpage authors to standardise on a specific keyphrase. 
If you search for "747 aeroplane", Google will report Wikipedia's page on the Boeing 747 as the highest ranking result, even though one of the two selected words, "aeroplane", doesn't actually exist anywhere in the page. Google knows from context that different pages that include "747" seem to use "aeroplane", "aircraft" and "airplane" in the same way, and it makes the association that, in this type of search, the words are probably interchangeable. Google also doesn't need those keywords to be explicitly structured in the source text (although it probably helps).
Google also has access to semantic structures via sites like Freebase that can tell it that "747" is a type of "airplane" / "aircraft" / "aeroplane", which is a type of "transport", and which is associated with a "manufacturer" called "Boeing", so it can draw on these logical associations and use them to guess at the meaning of museum webpages without needing those pages to include their own semantic tagging.
Explicit semantic tagging probably /helps/, but one of the points of teaching Google about semantics separately was that it could then use that knowledge to analyse /any/ webpage, instead of requiring thousands of individual webpage authors to go off and take special training courses in standardised terminology, to be able to write pages that Google can understand. That seems to be the equivalent of what we're asking museum staff to do, with the added downside that once they put all the effort into structuring their data to allow it to be more compatible with some hypothetical inter-museum system ... they find that no such system seems to exist. I'm not even sure that anyone's even planning on producing one, or setting up the organisation to run one, or sorting out what the rules would be if one existed.
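
To make the "747 aeroplane" point concrete, here is a minimal Python sketch of matching a query against a page that never uses the query's exact word. The synonym table is hand-written and purely hypothetical; a real search engine would infer such associations from usage rather than from a fixed list, and the function names are invented for illustration.

```python
# Hypothetical, hand-written synonym groups. A real engine would learn
# these associations from how words are used across many pages.
SYNONYM_GROUPS = [
    {"aeroplane", "airplane", "aircraft"},
]


def expand_terms(query):
    """Return one set of acceptable words for each term in the query."""
    expanded = []
    for term in query.lower().split():
        # Use the synonym group containing the term, else just the term itself.
        group = next((g for g in SYNONYM_GROUPS if term in g), {term})
        expanded.append(set(group))
    return expanded


def matches(query, page_text):
    """True if every query term, or one of its synonyms, appears in the page."""
    words = set(page_text.lower().split())
    return all(group & words for group in expand_terms(query))


page = "The Boeing 747 is a wide-body airliner and cargo aircraft"
print(matches("747 aeroplane", page))   # the page never says "aeroplane"
```

The page author did nothing special here: the knowledge that the three words are interchangeable lives in one place, on the search side.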

So, if our supposed goal is to let people cross-reference and search for similar items across the Museum network, perhaps what we should have been concentrating on is a cross-site "Museum search" project. The end-user doesn't need any of those museum computers to be talking to each other, as long as there's a search engine button or widget that can be launched from any of the sites that gives access to pooled results for all of them.
As far as I can tell, the reason we haven't done this is that the Museum IT community has been focusing on XML as the exclusive answer to everything: XML is nice and technical, it lets them impose order, and it requires IT people to understand it, so it makes Museums more dependent on IT people (which IT people probably feel is a Good Thing). XML-based initiatives generate IT jobs, and IT training jobs, and IT support jobs, and if you can lobby the standards committees and get your XML-based system or scheme made compulsory as a condition for certification, then museums have to keep paying you, indefinitely. They're locked in, even if your complex system doesn't actually do anything especially useful.

So perhaps the problem with a search engine initiative is that it might work /too/ well. It might be too quick, easy, cheap, effective and popular. If the search engine functionality is being implemented at a single point, you don't need specially-trained IT staff duplicated at every single museum entering data in a special way that the IT system requires. Sure, if you /want/ to use explicit XML tagging, that might give you a boost in the search engine rankings because the engine will have a higher confidence in its analysis of a page, but if you just write a simple webpage about an exhibit, and tell the dedicated Museum search engine that it exists, then there's a good chance that the engine will be able to do a good speculative cross-reference without needing a single line of custom code.


---------------

One way of implementing this would be to have an "Exhibits" widget that a webmaster could embed on any page that's about a single Museum exhibit, which would then register that page with the search engine when it's loaded, and give the user "Search for similar items on this Museum" and "Search for similar items in other museums" options. Maybe also an "I like this exhibit" button, a button to look for the current ranked favourites in the site, and a star rating based on how well that exhibit is ranked on the site.
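
The server side of such a widget could be sketched roughly as follows, in Python. Everything here is an assumption made for illustration: the `MuseumIndex` class, its method names, and the example.org-style URLs are hypothetical, and a real service would also need crawling, persistence and spam protection.

```python
from collections import defaultdict
from urllib.parse import urlparse


class MuseumIndex:
    """A single shared index that exhibit pages register themselves with."""

    def __init__(self):
        self.pages = {}                  # url -> set of words on the page
        self.likes = defaultdict(int)    # url -> "I like this exhibit" count

    def register(self, url, text):
        """Called (hypothetically) by the widget when an exhibit page loads."""
        self.pages[url] = set(text.lower().split())

    def like(self, url):
        self.likes[url] += 1

    def search(self, query, current_url, scope="other"):
        """scope="this" searches the current site; scope="other" every other site."""
        site = urlparse(current_url).netloc
        terms = set(query.lower().split())
        hits = []
        for url, words in self.pages.items():
            same_site = urlparse(url).netloc == site
            # Skip the page being viewed, and pages outside the requested scope.
            if url == current_url or same_site != (scope == "this"):
                continue
            if terms <= words:           # every query word appears on the page
                hits.append(url)
        # Most-liked pages first, then alphabetical for a stable order.
        return sorted(hits, key=lambda u: (-self.likes[u], u))


index = MuseumIndex()
index.register("http://museum-a.example/bear1", "Steiff teddy bear 1905")
index.register("http://museum-a.example/bear2", "Steiff teddy bear 1920")
index.register("http://museum-b.example/toys/7", "early Steiff teddy bear")
index.register("http://museum-b.example/toys/9", "tinplate clockwork train")

here = "http://museum-a.example/bear1"
print(index.search("steiff bear", here, scope="this"))
print(index.search("steiff bear", here, scope="other"))
```

The two `search` calls correspond to the two widget options: one stays within the museum whose page is being viewed, the other pools results from every other registered site.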


From the Museum's point of view, this wouldn't seem to have to be any more difficult than embedding an existing social media widget, and for many museums it might work well enough with existing content to make more ambitious semantic tagging projects unnecessary. If someone's looking at a page with "Steiff" and "teddy bear" in a heading, a dedicated "Museum search" for pages with similar content probably doesn't need those keywords to be semantically tagged to be able to find other Steiff bear exhibits. Additional structure would be nice, but usually unnecessary.
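
As a rough sketch of how "pages with similar content" could be found from untagged text alone, the Python below scores pages by the proportion of keywords they share (Jaccard overlap). This is only one simple technique chosen for illustration, not a claim about how any existing search engine works; the stopword list, the threshold and the page paths are all made up.

```python
import re

# A tiny, hypothetical stopword list; real systems use much longer ones.
STOPWORDS = {"a", "an", "and", "by", "in", "of", "the", "with"}


def keywords(text):
    """Lowercased words in the text, minus very common ones."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def similarity(a, b):
    """Jaccard overlap: shared keywords divided by all keywords."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def similar_pages(current_text, others, threshold=0.3):
    """URLs of pages whose keyword overlap meets the threshold, best first."""
    scored = [(similarity(current_text, text), url) for url, text in others.items()]
    return [url for score, url in sorted(scored, reverse=True) if score >= threshold]


current = "Steiff teddy bear with growler"
others = {
    "toys/7": "Early Steiff teddy bear",
    "trains/2": "Tinplate clockwork train",
}
print(similar_pages(current, others))
```

Nothing in the page headings had to be tagged for the other Steiff bear to be found; the overlap in ordinary words is enough in this small case.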

If you want to get more fancy, you could have a Class="Exhibit" identifier that could be put into the enclosing div or table code, to say that only the contents of that particular panel are relevant, so that the search engine doesn't try to index all the surrounding navigation bars etc. If you wanted multiple exhibits on a page, they could have their own widgets and isolating panels. But that could be a later development if people wanted it.
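
On the indexing side, picking out only the marked panel could look something like this minimal Python sketch, which uses the standard library's `html.parser`. The `ExhibitText` class, the `exhibit_panels` helper and the sample page are hypothetical; the sketch assumes reasonably well-formed HTML and does not handle one Exhibit panel nested inside another.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ExhibitText(HTMLParser):
    """Collect only the text inside elements whose class includes "Exhibit"."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside the current Exhibit panel
        self.panels = []     # one text string per Exhibit panel found

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # HTMLParser lowercases attribute names, so Class="Exhibit" works too.
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
        elif "Exhibit" in classes:
            self.depth = 1
            self.panels.append("")

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.panels[-1] += data + " "


def exhibit_panels(html):
    """Return the whitespace-normalised text of each Exhibit panel in the page."""
    parser = ExhibitText()
    parser.feed(html)
    return [" ".join(panel.split()) for panel in parser.panels]


page = (
    '<div id="nav">Home | Visit</div>'
    '<div Class="Exhibit"><h2>Steiff teddy bear</h2><p>Made in 1905.</p></div>'
    '<div>Footer links</div>'
)
print(exhibit_panels(page))
```

Because each marked panel is collected separately, a page with several exhibits would yield one entry per panel, while the navigation bar and footer are ignored.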

I do like the idea of having everything XML-tagged on principle, and I think it's a good goal to aim for. But if we're serious about wanting to let users do cross-museum searches, XML seems to be the foundation work for a very sophisticated house that nobody's intending to build. If we honestly do want cross-museum searches, we can have them without a lot of work, but the limiting factor is people, not technology.


OTOH, if we actually don't care too much about the ability to do cross-museum searches, feel that maybe they won't be all that useful, and aren't too bothered if the feature never appears, then that's okay ... as long as we're honest with ourselves about it.
Eric

****************************************************************
       website:  http://museumscomputergroup.org.uk/
       Twitter:  http://www.twitter.com/ukmcg
      Facebook:  http://www.facebook.com/museumscomputergroup
 [un]subscribe:  http://museumscomputergroup.org.uk/email-list/
****************************************************************


