~~~~~~~ BRITISH HCI GROUP NEWS SERVICE ~~~~~~~~~~~
~~ http://www.bcs-hci.org.uk/ ~~
~~ All news to: [log in to unmask] ~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~ NOTE: Please reply to article's originator, ~~
~~ not the News Service ~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
INEX 2004
Initiative for the Evaluation of XML Retrieval
April 2004 - December 2004
Call for participation
http://inex.is.informatik.uni-duisburg.de:2004/
The DELOS Network of Excellence for Digital Libraries invites
participation in an evaluation initiative for XML document retrieval.
The widespread use of the eXtensible Markup Language (XML), especially
the increasing use of XML in scientific data repositories, Digital
Libraries and on the Web, has brought about an explosion in the
development of XML tools, including systems to store and access XML
content. The aim of such retrieval systems is to exploit the logical
structure of documents, which is explicitly represented by the XML
markup, and to retrieve document components, instead of whole
documents, in response to a user query. Implementing this more focused
retrieval paradigm means that an XML retrieval system needs not only to
find relevant information in the XML documents, but also to determine
the appropriate level of granularity to return to the user. In
addition, the relevance of a retrieved component depends on meeting
both content and structural conditions.
Evaluating the effectiveness of XML retrieval systems therefore
requires a test collection in which the relevance assessments follow a
relevance criterion that takes the imposed structural aspects into
account. Such a test collection has been built as a result of two
rounds of the Initiative for the Evaluation of XML Retrieval (INEX 2002
and INEX 2003). This initiative provides an
opportunity for participants to evaluate their XML retrieval methods
using uniform scoring procedures and a forum for participating
organisations to compare their results. As part of a large-scale effort
to improve the efficiency of research in information retrieval and
digital libraries, this project initiated an international, coordinated
effort to promote evaluation procedures for content-oriented XML
retrieval.
In INEX 2004, participating organisations will be able to compare the
retrieval effectiveness of their XML document retrieval systems and
will contribute to the continuous construction of a large XML test
collection. The test collection will also provide participants with a
means for future comparative and quantitative experiments. Due to copyright
issues, only participating organisations will have access to the
constructed test collection.
INEX test collection
The test collection consists of a set of XML documents, topics and
relevance assessments. The topics and the relevance judgments are
obtained through a collaborative effort from the participants. Detailed
guidelines on the on-line topic submission, retrieval result
submission, relevance assessment task, and evaluation metrics will be
provided by INEX.
Documents
The INEX document collection currently consists of the full texts,
marked up in XML, of 12,107 articles of the IEEE Computer Society's
publications from 12 magazines and 6 transactions, covering the period
of 1995-2002, and totalling 494 megabytes in size. The collection has a
suitably complex XML structure (192 different content models in DTD)
and contains scientific articles of varying length. On average an
article contains 1,532 XML nodes, where the average depth of a node is
6.9.
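
As an illustration of how per-article figures such as node count and
average node depth could be computed, here is a minimal Python sketch.
It rests on assumptions not stated in this announcement: only element
nodes are counted, the root element is taken to have depth 1, and the
sample document and its tag names are hypothetical.

```python
import xml.etree.ElementTree as ET

def node_stats(xml_text):
    """Return (element count, average element depth) for one XML document.

    Assumption: only element nodes are counted and the root has depth 1.
    """
    root = ET.fromstring(xml_text)
    depths = []
    stack = [(root, 1)]  # (element, depth) pairs still to visit
    while stack:
        elem, depth = stack.pop()
        depths.append(depth)
        for child in elem:
            stack.append((child, depth + 1))
    return len(depths), sum(depths) / len(depths)

# Hypothetical miniature article, not taken from the INEX collection.
SAMPLE = (
    "<article><fm><ti>T</ti></fm>"
    "<bdy><sec><p>a</p><p>b</p></sec></bdy></article>"
)

count, avg_depth = node_stats(SAMPLE)
print(count, round(avg_depth, 2))
```

Averaging these two values over all articles would give collection-level
figures comparable in kind to those quoted above, provided the same
counting conventions are used.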
Topics
Each participating group will be asked to create a set of candidate
topics, which are representative of the range of real user needs over
the XML collection. The queries may be content-only (CO) or
content-and-structure (CAS) queries, and broad or narrow topic queries.
CO queries are free text queries, like those used in TREC, for which
the retrieval system should retrieve relevant XML elements of varying
granularity, while CAS queries contain explicit structural constraints,
such as containment conditions. From the pooled set of candidate topics
INEX will select a final set of topics to form part of the INEX test
collection.
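
The difference between the two query types can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the sample document, its tag names (sec, st, p) and the naive substring
matching are hypothetical, and the sketch does not reflect the official
INEX topic format or any participant's retrieval system.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-section article used only for illustration.
DOC = """<article>
  <sec><st>Indexing XML</st><p>Structured retrieval of XML elements.</p></sec>
  <sec><st>Results</st><p>Precision figures for each run.</p></sec>
</article>"""

def full_text(el):
    """Lower-cased text of an element including all its descendants."""
    return "".join(el.itertext()).lower()

def co_search(root, term):
    """Content-only: every element, at any granularity, mentioning the term."""
    term = term.lower()
    return [el for el in root.iter() if term in full_text(el)]

def cas_search(root, target_tag, inner_tag, term):
    """Content-and-structure: target_tag elements that contain an
    inner_tag element mentioning the term (a containment condition)."""
    term = term.lower()
    return [
        el for el in root.iter(target_tag)
        if any(term in full_text(d) for d in el.iter(inner_tag))
    ]

root = ET.fromstring(DOC)
co_tags = [el.tag for el in co_search(root, "retrieval")]
cas_hits = cas_search(root, "sec", "st", "results")
print(co_tags)
print([hit.find("st").text for hit in cas_hits])
```

The CO query returns the matching paragraph together with its enclosing
section and article, which is why a system must also decide which level
of granularity to return. The CAS query returns only sections whose
title satisfies the structural condition.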
Tasks
The general task, to be performed with the data and the final set of
topics, will be the ad-hoc retrieval of XML documents. Similarly to
information retrieval, we regard ad-hoc retrieval as a simulation of
how a library might be used, where a static set of documents is
searched using a new set of queries (topics). The main differences are
that, in INEX, the library consists of XML documents, the queries may
contain both content and structural conditions and, in response to a
query, arbitrary XML elements may be retrieved from the library.
Participants will be able to submit up to a fixed number of runs, each
containing the top 1500 retrieval results for each of the selected
topics.
This year, INEX will also have four additional tracks:
1. Relevance feedback track, dealing with relevance feedback methods
for XML.
2. Natural language track, where natural language formulations of CAS
queries have to be answered.
3. Heterogeneous collection track, comprising various XML collections
from different digital libraries, as well as material from other
computer science-related resources.
4. Interactive track, focusing on interactive XML retrieval,
considering also navigation through the hierarchical structure.
Relevance assessments
Relevance assessments will be provided by the participating groups
using INEX's on-line assessment system. Each assessor will judge 1-2
topics: either the topics they originally created or, if these were
removed from the final set of topics, topics similar to their original
queries. Please note that assessments will take about one person-week
per topic. Participating groups will gain access to the
completed INEX test collection only after they have completed their
assessment task.
Evaluation
The evaluation of the retrieval effectiveness of the XML retrieval
engines used by the participants will be based on the constructed INEX
test collection and uniform scoring techniques, including
recall/precision measures, which will take into account the structural
nature of XML documents, and overlap of answers.
Participants will be able to present their approaches and final results
at the INEX 2004 workshop to be held in December in Dagstuhl. All
results will be published in the INEX workshop proceedings and on the
Web.
Data Handling Agreement
In order to have access to the data designated as the IEEE Computer
Society XML Retrieval Research Collection, organizations (who did not
sign the agreement in 2003) must first fill in a data release
Application Form (to be obtained from the INEX 2004 web site).
Schedule
April 2: Deadline for the submission of "Application for Participation".
April 02 - 16: The collection of XML documents will be distributed to
all participants on the receipt of their signed data handling
agreement. Participants will also be provided with detailed
instructions and formatting criteria for candidate topics/queries.
May 03: Submission deadline for candidate topics.
May 24: Distribution of final set of topics/queries to participants
along with detailed information on the formatting requirements of the
search results.
August 09: Submission deadline of search results.
August 23: Distribution of merged results to participants for relevance
assessments.
October 08: Submission deadline for relevance assessments.
November 01: Distribution of XML test collection and evaluation scores to
participants.
December 01 (tbc): Submission of papers for the workshop pre-proceedings.
December 13-15 (tbc): Workshop in Schloss Dagstuhl
(http://www.dagstuhl.de/).
Organisers
Project Leaders
Norbert Fuhr
University of Duisburg-Essen
Email: [log in to unmask]
Mounia Lalmas
Queen Mary University of London
Email: [log in to unmask]
Contact person
Saadia Malik
University of Duisburg-Essen
Email: [log in to unmask]
Topic format specification
Börkur Sigurbjörnsson
University of Amsterdam
Email: [log in to unmask]
Andrew Trotman
University of Otago
Email: [log in to unmask]
Online relevance assessment tool
Benjamin Piwowarski
Université Paris 6, France
Email: [log in to unmask]
Metrics
Gabriella Kazai
Queen Mary University of London
Email: [log in to unmask]
Arjen P. de Vries
CWI, The Netherlands
Email: [log in to unmask]