Provenance data is poised to become pervasive in key areas of information management, ranging from traditional areas of science (e.g., life sciences, earth sciences, astronomy) to new applications
enabled by the Web (e.g., social sciences, social network analysis, quality and trust in Web publishing).
As the volume of provenance metadata increases with the volume of the underlying data whose history it describes, new challenges emerge for
managing and querying provenance at scale: provenance data is growing in both "count" and "complexity". It is growing in count because of the very large number of provenance traces (one
for each Twitter message, for example), and in complexity because of the provenance graphs generated by provenance-enabled programming environments (e.g.,
scientific workflow systems) and middleware. Data-intensive science is bound to produce provenance that ranks high on both counts.
At the same time, emerging standards such as PROV, the W3C recommendation
for provenance modelling and Web-based access, suggest that provenance data will increasingly be encoded using Semantic Web technology. This in turn suggests that provenance data will soon form a natural extension of, and seamlessly blend with, the growing
Linked Data Cloud.
The new Managing and Querying Provenance Data at Scale workshop (BIGProv) stems from these premises. We are interested in exploring the system and modelling challenges associated with
collecting, storing, querying, and exploiting large volumes of possibly complex provenance data. We seek to map the state of the art, elicit new research problems, and learn about existing systems. More specifically, the workshop scope includes the following
topics:
Our primary goal is to generate an interesting and lively discussion. We therefore envision a variety of contributions, small and large: reports on prototype
systems or performance analyses, work in progress, and position or vision papers. Submissions are encouraged in two categories:
Submissions should be formatted using the ACM Proceedings
format, will be peer-reviewed, and will be included in the official EDBT workshop proceedings.
Submissions will be managed through EasyChair (Submissions site).
Authors are also encouraged to present a poster of their work, possibly jointly with the main EDBT poster session (to be
confirmed).
Co-chairs:
Bertram Ludaescher, UC Davis, CA
Paolo Missier, Newcastle University, UK
Proceedings chair: Victor Cuevas, University of New Mexico and UC Davis, USA