Hi,
I was encouraged by the release of the Tate artist and artwork data to
see what I could do with it from a Linked Data perspective. You can see
the results at [1] (assuming that my home machine can cope). I don't
claim that the site is pretty, but it does have some useful
functionality, e.g. "born on this day", map, timeline, outward links.
The main purpose of this post is to start a conversation about what
counts as useful information when it comes to data sharing.
The actual downloads provided by Tate (CSV files) were perfectly usable;
it didn't take me long to convert them to XML and load them into My
Favourite Database. However, I found that the information was a bit
"thin" - e.g. places of birth and death are "place name, country" and
dates are year-only. I used a "web termlist" to consult Geonames, and
particularly for US places there could be as many as 100 places to
choose from. So I was often reduced to running a web search for the
artist in question, in order to know which place to pick. In a parallel
exercise, I queried dbpedia for the artist's name, and picked potential
matches by matching on birth and death dates. This hasn't been an exact
science, and there may well still be links to entirely spurious dbpedia
entries in the data. However, where it works I have usually benefitted
by getting full day-month-year dates from the dbpedia data, which I could then merge
into my main artist file.
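A minimal sketch of that date-matching step, in Python, might look like the following. It is an illustration only, not the code behind the site: the Candidate class, its field names and the plausible_matches function are all hypothetical, and it assumes the dbpedia candidates for a name have already been fetched.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Candidate:
    """A possible dbpedia match for an artist (hypothetical structure)."""
    uri: str
    born: Optional[date]
    died: Optional[date]


def plausible_matches(birth_year: Optional[int],
                      death_year: Optional[int],
                      candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates whose known dates agree with the year-only source data.

    A candidate is kept only if at least one date can be compared and
    every comparable date falls in the expected year.
    """
    matches = []
    for cand in candidates:
        pairs = [(birth_year, cand.born), (death_year, cand.died)]
        # Only compare where both the source year and the candidate date exist.
        known = [(year, full) for year, full in pairs
                 if year is not None and full is not None]
        if known and all(full.year == year for year, full in known):
            matches.append(cand)
    return matches
```

More than one candidate can survive this filter, and a wrong one can slip through when two people share a name and a birth year, which is why the result still needs checking by eye.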
Having exact dates allows me to add the "on this day ..." feature.
Linking places to Geonames identifiers gives me access to lat/long
coordinates, which in turn supports the map view. (Don't try the
map/timeline view until you have run a search. It /will/ attempt to
display "pins" for all 3,527 artists, but it will never get there. :-) )
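To show why exact dates matter here, this is roughly what an "on this day" lookup needs. It is a sketch under assumptions: the function name and the (name, birth date) pairs are invented for illustration, and artists with year-only data are represented by None.

```python
from datetime import date
from typing import Iterable, List, Optional, Tuple


def born_on_this_day(artists: Iterable[Tuple[str, Optional[date]]],
                     today: date) -> List[str]:
    """Return names of artists whose full birth date falls on today's day and month.

    Artists whose birth date is None (year-only in the source) are skipped,
    since there is no day or month to compare.
    """
    return [name for name, born in artists
            if born is not None
            and (born.month, born.day) == (today.month, today.day)]
```

Anyone with only a birth year simply never appears, so the feature is only as good as the coverage of the merged dates.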
In the single-record view (which you also have to wait for), I can use
the Tate's built-in linking to reliably retrieve artworks by that
artist, but when I attempt to bring in works from Culture Grid I have to
use a speculative search based on the artist's name. The results will
clearly be variable. Ah, if only there were some Unified List of
Artists' Names, used in everyone's data, which one could call upon to
improve cross-collection linking ...
The single-record view also exploits the dbpedia links, where they
exist. As well as the summary, it provides a list of artists who have
influenced or been influenced by the artist in question. These are just
boring links in my site, but I could equally have looked up these
records and brought back some details about the people in question. I
also attempt a SPARQL query on the dbpedia data, and list some people
born in the same place as the artist. Unfortunately this uses all place
keywords, including country, and so the results aren't particularly
enlightening. However, it does point up what could be achieved with
better, more domain-specific, data to query (e.g. in ResearchSpace?).
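One way the "born in the same place" lookup could be tightened is to query on a single resolved place rather than on keywords. The sketch below only builds the query text; it assumes the birthplace has already been mapped to one dbpedia resource URI (which the current data does not guarantee), and the function name is hypothetical. It uses the dbpedia ontology property dbo:birthPlace.

```python
def same_birthplace_query(place_uri: str, limit: int = 10) -> str:
    """Build a SPARQL query for people born in one specific place.

    place_uri is assumed to be a single, already-resolved dbpedia
    resource URI, not a free-text place name.
    """
    # Refuse anything that could break out of the <...> IRI in the query.
    if (not place_uri.startswith(("http://", "https://"))
            or any(ch in place_uri for ch in '<>" \n\t')):
        raise ValueError("place_uri must be a plain http(s) URI")
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?person WHERE {\n"
        f"  ?person dbo:birthPlace <{place_uri}> .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )
```

The resulting string could then be sent to a SPARQL endpoint. Matching on a place URI rather than on keywords keeps the country from dragging in everyone born anywhere in it.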
Which brings me to the main point of the exercise, which was to assign
dereferenceable Linked Data identifiers to each artist, so they can be
referenced unambiguously. A typical example is [2], which does all the
right Cool URIs [3] things, and delivers HTML, XML and RDF if asked nicely.
If you follow the top link from the detail page, you will arrive at the
RDF variant [4] of that Linked Data.
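"Asked nicely" here means content negotiation: the client states what it wants in the HTTP Accept header. A simplified sketch of how a server might choose between the three formats is below. The media types offered and the function name are assumptions for illustration, not a description of how this site is actually implemented, and the parsing ignores partial wildcards such as text/*.

```python
from typing import Sequence


def pick_media_type(
    accept: str,
    offered: Sequence[str] = ("text/html", "application/xml", "application/rdf+xml"),
) -> str:
    """Choose a representation from an HTTP Accept header (simplified).

    Falls back to the first offered type when nothing acceptable is named.
    """
    prefs = []
    for part in accept.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip().lower()
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type:
            prefs.append((quality, media_type))
    # Highest q first; the sort is stable, so header order breaks ties.
    prefs.sort(key=lambda pref: -pref[0])
    for quality, media_type in prefs:
        if quality <= 0:
            continue
        if media_type in offered:
            return media_type
        if media_type == "*/*":
            return offered[0]
    return offered[0]
```

A browser sending text/html would get the web page, while a Linked Data client sending application/rdf+xml would get the RDF, both from the same identifier.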
In conclusion, I think we need URLs to express information about our
domain: cultural heritage. For some of these we can use generic services
like Geonames, but for many aspects we will need to mint and /share/ URL
frameworks which are specific to our requirements as a community.
Richard
[1] http://light.demon.co.uk/wordpress/?page_id=699
[2] http://light.demon.co.uk/Person/tate-artists/id/1678
[3] http://www.w3.org/TR/cooluris/
[4] http://light.demon.co.uk/Person/tate-artists/id/rdf/1678
--
*Richard Light*