
International Workshop on

Managing and Querying Provenance Data at Scale

Held in conjunction with EDBT/ICDT 2013

March 22nd, 2013, Genova, Italy

New: we are also collecting provenance traces. Please see the companion ProvBench call and submit your traces!

Motivation and Focus.

Provenance data is poised to become pervasive in key areas of information management, ranging from traditional areas of science (e.g., life sciences, earth sciences, astronomy) to new applications enabled by the Web (e.g., social sciences, social network analysis, quality and trust in Web publishing).

As the volume of provenance metadata increases with the volume of the underlying data whose history it describes, new challenges for managing and querying provenance at scale emerge: provenance data is growing in both "count" and "complexity". It is growing in count because of the very large number of provenance traces (one for each Twitter message, for example), and in complexity because of the provenance graphs generated by provenance-enabled programming environments (e.g., scientific workflow systems) and middleware. Data-intensive science is bound to produce provenance that scores high on both counts.

At the same time, emerging standards such as PROV, the W3C recommendation for provenance modelling and Web-based access, suggest that provenance data will increasingly be encoded using Semantic Web technology. This in turn suggests that provenance data will soon form a natural extension of, and seamlessly blend with, the growing Linked Data Cloud.

The new Managing and Querying Provenance Data at Scale workshop (BIGProv) stems from these premises. We are interested in exploring the system and modelling challenges associated with collecting, storing, querying, and exploiting large volumes of possibly complex provenance data. We seek to map the state of the art, elicit new research problems, and learn about existing systems. More specifically, the workshop scope includes the following topics:

Automated capture of provenance at multiple layers (system, middleware, applications)
Database models, languages, and systems for storing and querying large-scale provenance
Provenance and Linked Open Data (LOD): seamless representation and query models
Comparison and performance benchmarking of different data architectures and query models for provenance
Analysis of existing graph query models and systems for provenance graphs
Reference datasets for provenance benchmarking
System descriptions and demonstrations of large-scale provenance and graph data
Uniform querying over heterogeneous provenance traces
Abstraction models for provenance and their applications to user presentation, visualization, and privacy preservation

Workshop format and submission instructions.

Our primary goal is to generate an interesting and lively discussion. Thus, we envision a variety of contributions, small and large, reporting on prototype systems or performance analysis, as well as work in progress, and position or vision papers. Submissions are encouraged in two categories:

short papers (up to 4 pages)
regular papers (up to 8 pages)
New: we also accept extended abstracts (2 pages) describing traces submitted through the ProvBench community initiative. Submissions that arrive in time for the regular paper deadline will be included in the regular proceedings.

Submissions should be formatted using the ACM Proceedings format, will be peer-reviewed, and will be included in the official EDBT workshop proceedings.

Submissions will be managed through EasyChair.

Authors are also encouraged to present a poster of their work, possibly jointly with the main EDBT poster session (to be confirmed).

Important Dates:

Paper submission: Dec. 1, 2012
Notification to authors: Jan. 11, 2013
Camera-ready copy: Jan. 23, 2013
Workshop: March 22, 2013

Workshop Organizers and Contacts

Co-chairs:

Bertram Ludaescher, UC Davis, CA
Paolo Missier, Newcastle University, UK

Proceedings chair: Victor Cuevas, University of New Mexico and UC Davis, USA
