I've been pondering the current push for standardised terminology in
Museum classification so that different Museums' IT systems can be
interfaced, and I'm wondering how much of this is really thought-through
... Our vision for the "future" of Museum IT seems to be based on the
needs of ecommerce systems, or on how information technology was taught
in the 1970s.
Museums don't usually need to be able to bulk-exchange data with each
other, because they're not wholesaling or retailing each other's
exhibits, or buying and selling in bulk from each other and expecting
their inventory to be automatically updated. They're not Amazon or
iTunes. The reason for making information available online is usually
for the benefit of end-users, and those end-users are primarily
interested in finding out about exhibits and finding other similar
information. They don't need to add or remove exhibits from the system,
or transfer entries to a different museum's site. If they're not moving
entries between systems, then deep data-compatibility isn't really a
requirement. If your museum is going to be taken over by, or merged
with, another, then it's handy if the organisation that takes over your
collection can integrate your database with theirs, but otherwise it's
difficult to see the immediate payoff.
If what we want is search and discovery, then structured XML databases
become less critical, and the specialist dedicated tools for the job are
... search engines. Good search engines look for patterns in the data
and find their own sets of associated keywords and cross-references
without needing webpage authors to standardise on a specific keyphrase.
If you search for "747 aeroplane", Google will report Wikipedia's page
on the Boeing 747 as the highest ranking result, even though one of the
two selected words, "aeroplane", doesn't actually appear anywhere in the
page. Google knows from context that different pages that include "747"
seem to use "aeroplane", "aircraft" and "airplane" in the same way, and
it makes the association that, in this type of search, the words are
probably interchangeable. Google also doesn't need those keywords to be
explicitly structured in the source text (although it probably helps).
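As a rough illustration of that kind of association (a toy, and not a description of how Google actually does it), here is a minimal Python sketch. It treats two words as probably interchangeable when they keep turning up next to the same neighbouring words. The sample pages, the function names and the window size are all invented for the example.

```python
from collections import defaultdict

# Toy corpus, invented for illustration only.
PAGES = [
    "the boeing 747 aeroplane has four engines",
    "the boeing 747 aircraft has four engines",
    "the boeing 747 airplane has a long range",
    "the steiff teddy bear has a button in its ear",
]

def context_sets(pages, window=2):
    """Map each word to the set of words seen within `window` positions of it."""
    contexts = defaultdict(set)
    for page in pages:
        words = page.split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            # Neighbours on both sides, excluding the word itself.
            contexts[word].update(words[lo:i] + words[i + 1:hi])
    return contexts

def similarity(contexts, a, b):
    """Jaccard overlap of the two words' context sets (0.0 to 1.0)."""
    ca, cb = contexts[a], contexts[b]
    if not ca or not cb:
        return 0.0
    return len(ca & cb) / len(ca | cb)

contexts = context_sets(PAGES)
for other in ("aircraft", "airplane", "bear"):
    print("aeroplane vs", other, round(similarity(contexts, "aeroplane", other), 2))
```

On this toy corpus "aeroplane" scores much closer to "aircraft" and "airplane" than to "bear", without anyone having told the program that the three words are synonyms.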
Google also has access to semantic structures via sites like Freebase
that can tell it that "747" is a type of "airplane" / "aircraft" /
"aeroplane", which is a type of "transport", and which is associated
with a "manufacturer" called "Boeing", so it can draw on these logical
associations and use them to guess at the meaning of museum webpages
without needing those pages to include their own semantic tagging.
Explicit semantic tagging probably /helps/, but one of the points of
teaching Google about semantics separately was that it could then use
that knowledge to analyse /any/ webpage, instead of requiring thousands
of individual webpage authors to go off and take special training
courses in standardised terminology, to be able to write pages that
Google can understand. That seems to be the equivalent of what we're
asking museum staff to do, with the added downside that once they put
all the effort into structuring their data to allow it to be more
compatible with some hypothetical inter-museum system ... they find that
no such system seems to exist. I'm not sure that anyone's even
planning on producing one, or setting up the organisation to run one, or
sorting out what the rules would be if one existed.
So, if our supposed goal is to let people cross-reference and search for
similar items across the Museum network, perhaps what we should have
been concentrating on is a cross-site "Museum search" project. The
end-user doesn't need any of those museum computers to be talking to
each other, as long as there's a search engine button or widget that can
be launched from any of the sites that gives access to pooled results
for all of them.
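To make the "pooled results" idea concrete, here is a minimal Python sketch of a single shared index that any participating site could report pages to and query. It is only a sketch under stated assumptions: the class name `PooledIndex`, its methods, the museum names and the example.org-style URLs are all hypothetical, and a real service would also need ranking, page updates and removals.

```python
import re
from collections import defaultdict

class PooledIndex:
    """One inverted index shared by every participating museum site."""

    def __init__(self):
        self._postings = defaultdict(set)   # word -> set of page URLs
        self._sites = {}                    # page URL -> museum name

    def register_page(self, museum, url, text):
        """Record a page, e.g. when a site reports a new exhibit page."""
        self._sites[url] = museum
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            self._postings[word].add(url)

    def search(self, query, exclude_museum=None):
        """Return sorted (museum, url) pairs for pages containing every query word."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        urls = set.intersection(*(self._postings.get(w, set()) for w in words))
        return sorted((self._sites[u], u) for u in urls
                      if self._sites[u] != exclude_museum)

# Hypothetical usage: two museums, one pooled index.
index = PooledIndex()
index.register_page("Museum A", "https://a.example/bear", "Steiff teddy bear, c. 1905")
index.register_page("Museum B", "https://b.example/toys/12", "Early Steiff bear with button")
index.register_page("Museum B", "https://b.example/planes/747", "Boeing 747 model")
print(index.search("steiff bear"))                             # pooled results
print(index.search("steiff bear", exclude_museum="Museum A"))  # "other museums" only
```

The point of the design is that none of the museum systems talk to each other; each one only tells the shared index that a page exists.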
As far as I can tell, the reason we haven't done this is that the
Museum IT community has been focusing on XML as the exclusive answer to
everything: XML is nice and technical, it lets them impose
order, and it requires IT people to understand it, so it makes Museums
more dependent on IT people (which IT people probably feel is a Good
Thing). XML-based initiatives generate IT jobs, and IT training jobs,
and IT support jobs, and if you can lobby the standards committees and
get your XML-based system or scheme made compulsory as a condition for
certification, then museums have to keep paying you, indefinitely.
They're locked in, even if your complex system doesn't actually do
anything especially useful.
So perhaps the problem with a search engine initiative is that it might
work /too/ well. It might be too quick, easy, cheap, effective and
popular. If the search engine functionality is being implemented at a
single point, you don't need specially-trained IT staff duplicated at
every single museum entering data in a special way that the IT system
requires. Sure, if you /want/ to use explicit XML tagging, that might
give you a boost in the search engine rankings because the engine will
have a higher confidence in its analysis of a page, but if you just
write a simple webpage about an exhibit, and tell the dedicated Museum
search engine that it exists, then there's a good chance that the engine
will be able to do a good speculative cross-reference without needing a
single line of custom code.
---------------
One way of implementing this would be to have an "Exhibits" widget that
a webmaster could embed on any page that's about a single Museum
exhibit, which would then register that page with the search engine when
it's loaded, and give the user "Search for similar items in this Museum"
and "Search for similar items in other museums" options. Maybe also an
"I like this exhibit" button, a button to look for the current ranked
favourites in the site, and a star rating based on how well that exhibit
is ranked on the site.
From the Museum's point of view, this wouldn't seem to have to be any
more difficult than embedding an existing social media widget, and for
many museums it might work well enough with existing content to make
more ambitious semantic tagging projects unnecessary. If someone's
looking at a page with "Steiff" and "teddy bear" in a heading, a
dedicated "Museum search" for pages with similar content probably
doesn't need those keywords to be semantically tagged to be able to find
other Steiff bear exhibits. Additional structure would be nice, but
usually unnecessary.
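The Steiff example can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real search engine: the function names, the stopword list, the page texts and the short URLs are all made up, and the scoring (shared words count for more when they are rare across the pool) is just one plausible choice.

```python
import math
import re
from collections import Counter

# A tiny, hypothetical stopword list.
STOPWORDS = {"a", "an", "the", "of", "in", "with", "and", "from"}

def keywords(text):
    """Lowercased words in the text, minus stopwords. No tagging required."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def similar_pages(current, others):
    """Rank `others` (url -> text) by rare-word overlap with the `current` text."""
    docs = {url: keywords(text) for url, text in others.items()}
    df = Counter(w for kws in docs.values() for w in kws)  # pages containing each word
    n = len(docs)
    here = keywords(current)
    scored = []
    for url, kws in docs.items():
        # Words shared with the current page; rarer words add more to the score.
        score = sum(math.log(1 + n / df[w]) for w in here & kws)
        if score > 0:
            scored.append((score, url))
    return [url for score, url in sorted(scored, key=lambda s: (-s[0], s[1]))]

pool = {
    "b/bear": "Steiff teddy bear with ear button",
    "c/doll": "Porcelain doll with glass eyes",
    "d/bear": "Stuffed brown bear from Canada",
}
print(similar_pages("Steiff teddy bear, 1905", pool))
```

The other Steiff bear comes out ahead of the stuffed brown bear, and the doll page is dropped, purely from the plain words in each page's text.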
If you want to get more fancy, you could have a Class="Exhibit"
identifier that could be put into the enclosing div or table code, to
say that only the contents of that particular panel are relevant, so
that the search engine doesn't try to index all the surrounding
navigation bars etc. If you wanted multiple exhibits on a page, they
could have their own widgets and isolating panels. But that could be a
later development if people wanted it.
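As a sketch of how an indexer might honour that kind of marker, the following Python uses the standard library's `html.parser` to keep only the text inside elements whose class is "Exhibit" and to ignore everything else on the page. The class and function names are invented for the example, and it assumes reasonably well-formed markup in which every non-void tag inside a panel is closed.

```python
from html.parser import HTMLParser

class ExhibitText(HTMLParser):
    """Collect the text inside elements whose class list includes "Exhibit"."""

    # Tags that never have a closing tag, so they must not affect nesting depth.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.panels = []    # one text string per exhibit panel
        self._depth = 0     # nesting depth inside the current panel, 0 = outside
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested element inside a panel
        elif "Exhibit" in classes:
            self._depth = 1             # entering a new panel
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in self.VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:            # leaving the panel: store its text
            self.panels.append(" ".join(" ".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)

def exhibit_panels(html):
    """Return the text of each class="Exhibit" panel found in the HTML."""
    parser = ExhibitText()
    parser.feed(html)
    parser.close()
    return parser.panels

page = ('<div id="nav">Home | Visit</div>'
        '<div class="Exhibit"><h1>Steiff teddy bear</h1>'
        '<p>Made in 1905.<br>Mohair.</p></div>'
        '<div>Footer</div>')
print(exhibit_panels(page))
```

A page with several exhibits would simply yield several entries in the returned list, one per marked panel.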
I do like the idea of having everything XML-tagged on principle, and I
think it's a good goal to aim for. But if we're serious about wanting to
let users do cross-museum searches, XML seems to be the foundation work
for a very sophisticated house that nobody's intending to build. If we
honestly do want cross-museum searches, we can have them without a lot
of work, but the limiting factor is people, not technology.
OTOH, if we actually don't care too much about the ability to do
cross-museum searches, feel that maybe they won't be all that useful,
and aren't too bothered if the feature never appears, then that's okay
... as long as we're honest with ourselves about it.
Eric
****************************************************************
website: http://museumscomputergroup.org.uk/
Twitter: http://www.twitter.com/ukmcg
Facebook: http://www.facebook.com/museumscomputergroup
[un]subscribe: http://museumscomputergroup.org.uk/email-list/
****************************************************************