The call for participation in ImageCLEF has just come out.
ImageCLEF this year will create a test collection that allows the
diversity of search results to be measured. We believe this to be the
first publicly available test collection in nearly 10 years that allows
such measurement. Retrieval in ImageCLEF is via text captions, though
image analysis can be performed as well. Both monolingual and
cross-lingual search tasks will be offered.
If you are interested in submitting runs to this novel test collection,
read on and/or go to the ImageCLEF web site and register.
http://www.imageclef.org/?q=ImageCLEF2008
The photo retrieval task of ImageCLEF 2008 will take a different approach
to evaluation by studying image clustering. A good search engine ensures
that duplicate or near-duplicate documents retrieved in response to a
query are hidden from the user. Providing this functionality is
particularly important when a user types in a query that is poorly
specified or ambiguous, a common type of query in image search. Given
such a query, a search engine that retrieves a diverse yet relevant set
of images is more likely to satisfy its users.
Promoting diversity is a good idea because different people often type
in the same query but wish to see different results. If a search engine
knows nothing about the user entering the query, a good strategy is to
produce results that are both diverse and relevant; in effect, the
engine spreads its bets on what the user might want to retrieve.
Perhaps surprisingly, almost no test collection exists that examines this
important aspect of search. ImageCLEF will be the first evaluation
campaign to look at this problem in a decade. To make participation in
the task as easy as possible, we will use an existing ImageCLEF
collection, reuse its topics, and keep both the topic and run formats
the same as in previous years. (In future years we plan to extend the
task to have systems return image clusters and even to explore cluster
labelling.)
From a subset of existing topics on the IAPR TC-12 collection, relevant
images will be manually clustered, and the relevance judgements will be
augmented to indicate which cluster each image belongs to. Participants
will run the topic subset on their image search system and produce a
ranking whose top 10 holds relevant images from as many of the clusters
as possible. A version of the collection will be made available that
allows participants to explore cross-language aspects of image
clustering; in this version, members of the clusters will be captioned
in different languages.
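As an illustration of what cluster-augmented relevance judgements could look like, the sketch below reads a TREC-style judgements file with an extra cluster column. The file layout, topic ids, image ids, and cluster names are all assumptions for illustration, not the official ImageCLEF format:

```python
# Hypothetical cluster-augmented relevance judgements: one line per
# judged image with topic-id, image-id, relevance (0/1), cluster-id.
# The exact ImageCLEF file format is an assumption, not the real one.
SAMPLE_QRELS = """\
2 image_0123 1 brazil
2 image_0456 1 chile
2 image_0789 0 -
3 image_0042 1 lion
"""

def parse_qrels(text):
    """Map topic-id -> {image-id: cluster-id} for relevant images only."""
    judgements = {}
    for line in text.splitlines():
        topic, image, rel, cluster = line.split()
        if int(rel) == 1:
            judgements.setdefault(topic, {})[image] = cluster
    return judgements

judgements = parse_qrels(SAMPLE_QRELS)
```

With judgements in this shape, a system's top-10 ranking for a topic can be scored directly against the cluster assignments.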
Relevance assessors will be instructed to look for simple image clusters
based on the form of a topic. For example, if a topic asks for images of
beaches in Brazil, clusters will be formed by location; if a topic asks
for photos of animals, clusters will be formed by the type of animal.
Evaluation will be based on precision at 10 and on a measure of cluster
recall, which counts how many distinct clusters are represented among
the retrieved images.
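The two measures can be sketched as follows. This is a minimal illustration, assuming the relevance judgements map each relevant image to its cluster; the function names, image ids, and cluster names are hypothetical:

```python
def precision_at_10(ranking, relevant):
    """Fraction of the top 10 retrieved images that are relevant."""
    return sum(1 for img in ranking[:10] if img in relevant) / 10

def cluster_recall_at_10(ranking, image_to_cluster):
    """Fraction of a topic's clusters represented in the top 10.

    image_to_cluster maps each relevant image to its cluster id;
    non-relevant images simply do not appear in the map.
    """
    all_clusters = set(image_to_cluster.values())
    found = {image_to_cluster[img]
             for img in ranking[:10] if img in image_to_cluster}
    return len(found) / len(all_clusters)

# Hypothetical topic: six relevant images spread over three clusters.
clusters = {"a1": "rio", "a2": "rio", "b1": "bahia",
            "b2": "bahia", "c1": "natal", "c2": "natal"}
ranking = ["a1", "a2", "b1", "x1", "x2", "x3", "x4", "x5", "x6", "x7"]
print(precision_at_10(ranking, clusters))       # 0.3
print(cluster_recall_at_10(ranking, clusters))  # 2 of 3 clusters found
```

Note how the example ranking scores only 0.3 on precision yet covers two of the three clusters; a diversity-aware system would aim to raise both numbers at once.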
Note that it is quite possible to submit runs from a "standard"
non-clustering image search system, though we would expect clustering
systems to outperform the standard ones.
Participants will need to sign an end-user license agreement (EULA)
prior to obtaining the database.
Mark Sanderson
Reader in Information Retrieval
Room 225, Dept. of Information Studies
University of Sheffield, Regent Court
Portobello St, Sheffield, S1 4DP, UK
Tel: +44 (0) 114 22 22648, Fax: +44 (0) 114 27 80300
mailto:[log in to unmask],
http://dis.shef.ac.uk/mark/
Good judgement comes from experience, experience comes from bad
judgement