The call for participation in ImageCLEF has just come out.
ImageCLEF this year will create a test collection that allows the
diversity of search results to be measured. We believe this to be the
first publicly available test collection in nearly 10 years that allows
such measurement. Retrieval in ImageCLEF is via text captions, though
image analysis can be performed as well. Both monolingual and
cross-lingual search tasks will be offered.
If you are interested in submitting runs to this novel test collection,
read on and/or go to the ImageCLEF web site and register.
http://www.imageclef.org/?q=ImageCLEF2008
The photo retrieval task of ImageCLEF 2008 will take a different approach
to evaluation by studying image clustering. A good search engine ensures
that duplicate or near-duplicate documents retrieved in response to a
query are hidden from the user. Providing this functionality is
particularly important when a user types in a query that is poorly
specified or ambiguous, a common type of query in image search. Given
such a query, a search engine that retrieves a diverse yet relevant set
of images is more likely to satisfy its users.
Promoting diversity is a good idea because different people often type
in the same query but wish to see different results. If a search engine
knows nothing about the user entering the query, a good strategy is to
produce results that are both diverse and relevant; in effect, the
engine spreads its bets on what the user might want to retrieve.
Perhaps surprisingly, almost no test collection exists that examines this
important aspect of search. ImageCLEF will be the first evaluation
campaign to look at this problem in a decade. To make participation in
the task as easy as possible, we will use an existing ImageCLEF
collection, reuse its topics, and keep both the topic and run formats
the same as in previous years. (In future years we plan to extend the
task to have systems return image clusters and even to explore cluster
labelling.)
From a subset of existing topics on the IAPR TC-12 collection, relevant
images will be manually clustered, and the relevance judgements will be
augmented to indicate which cluster each image belongs to. Participants
will run the topic subset on their image search system and produce a
ranking whose top 10 holds relevant images from as many of the clusters
as possible. A version of the collection will be made available that
allows participants to explore cross-language aspects of image
clustering; in this version, members of the clusters will be captioned
in different languages.
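As an illustration of what cluster-augmented relevance judgements could look like, the sketch below reads a TREC-style judgements file with an extra cluster column. The file layout, topic ids, image ids, and cluster names are all assumptions for illustration, not the official ImageCLEF format:

```python
# Hypothetical cluster-augmented relevance judgements: one line per
# judged image with topic-id, image-id, relevance (0/1), cluster-id.
# The exact ImageCLEF file format is an assumption, not the real one.
SAMPLE_QRELS = """\
2 image_0123 1 brazil
2 image_0456 1 chile
2 image_0789 0 -
3 image_0042 1 lion
"""

def parse_qrels(text):
    """Map topic-id -> {image-id: cluster-id} for relevant images only."""
    judgements = {}
    for line in text.splitlines():
        topic, image, rel, cluster = line.split()
        if int(rel) == 1:
            judgements.setdefault(topic, {})[image] = cluster
    return judgements

judgements = parse_qrels(SAMPLE_QRELS)
```

With judgements in this shape, a system's top-10 ranking for a topic can be scored directly against the cluster assignments.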
Relevance assessors will be instructed to look for simple image clusters
based on the form of a topic. For example, if a topic asks for images of
beaches in Brazil, clusters will be formed by location; if a topic asks
for photos of animals, clusters will be formed by the type of animal.
Evaluation will be based on precision at 10 and on a measure of cluster
recall, which counts how many distinct clusters are represented among
the retrieved images.
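The two measures can be sketched as follows. This is a minimal illustration, assuming the relevance judgements map each relevant image to its cluster; the function names, image ids, and cluster names are hypothetical:

```python
def precision_at_10(ranking, relevant):
    """Fraction of the top 10 retrieved images that are relevant."""
    return sum(1 for img in ranking[:10] if img in relevant) / 10

def cluster_recall_at_10(ranking, image_to_cluster):
    """Fraction of a topic's clusters represented in the top 10.

    image_to_cluster maps each relevant image to its cluster id;
    non-relevant images simply do not appear in the map.
    """
    all_clusters = set(image_to_cluster.values())
    found = {image_to_cluster[img]
             for img in ranking[:10] if img in image_to_cluster}
    return len(found) / len(all_clusters)

# Hypothetical topic: six relevant images spread over three clusters.
clusters = {"a1": "rio", "a2": "rio", "b1": "bahia",
            "b2": "bahia", "c1": "natal", "c2": "natal"}
ranking = ["a1", "a2", "b1", "x1", "x2", "x3", "x4", "x5", "x6", "x7"]
print(precision_at_10(ranking, clusters))       # 0.3
print(cluster_recall_at_10(ranking, clusters))  # 2 of 3 clusters found
```

Note how the example ranking scores only 0.3 on precision yet covers two of the three clusters; a diversity-aware system would aim to raise both numbers at once.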
Note that it is quite possible to submit runs from a "standard"
non-clustering image search system, though we would expect clustering
systems to outperform the standard ones.
Participants will need to sign an end-user license agreement (EULA)
prior to obtaining the database.
Mark Sanderson
Reader in Information Retrieval
Room 225, Dept. of Information Studies
University of Sheffield, Regent Court
Portobello St, Sheffield, S1 4DP, UK
Tel: +44 (0) 114 22 22648, Fax: +44 (0) 114 27 80300
mailto:[log in to unmask],
http://dis.shef.ac.uk/mark/
Good judgement comes from experience, experience comes from bad
judgement