Tutorial on RDF-Stream Processing
The tutorial provides a comprehensive view of the RDF-Stream Processing (RSP) research area. It consists of four parts. The first one introduces the RSP basic concepts: RDF streams to represent temporally-ordered sequence of data items; continuous SPARQL extensions to query RDF streams, and RSP engines to execute continuous query answering over RDF streams. The second part presents the available RSP engine implementations. It starts with an overview on the existing RSP engines, highlighting similarities and differences among them. Next, two existing implementations are analysed in depth: C-SPARQL and SPARQLstream. The third part is a hands-on session where the attendees learn how to (1) use the two presented RSP engines presented above and (2) let the systems interact among them. Finally, the fourth part of the tutorial provides an overview on RSP-related topics: RSP engine benchmarking and real-world deployments. The tutorial closes with a discussion on the RSP optimisation techniques, open challenges and the research problems of this research field.
Nowadays, more and more dynamic information is becoming available to decision makers in the form of continuous data streams. These data streams occur in a variety of modern applications, such as network monitoring, traffic engineering, sensor networks, RFID tags, micro posts, telecom records, Web logs, click streams, etc. The online and continuous query answering over highly dynamic and heterogeneous data has received attention only very recently. The combination of query answering techniques with data streams gives rise to RDF-Stream Processing, a high impact research area that has already produced results that are relevant for both the semantic and data processing communities.
This tutorial aims at introducing different existing approaches for querying over RDF data streams, and providing the audience with an overview of techniques and tools that can be used for this purpose. The contents of this tutorial are relevant for ISWC as it focuses on the query answering in the context of streaming data that is ubiquitous in a large number of applications on the Web. Attendees will learn the main concepts of RSP (e.g., what is a window and how it works), and they will learn how to write and execute continuous queries over RDF streams.
The learning outcomes are:
- an overview on the basic concepts in RDF stream processing: extensions to RDF and SPARQL to model data streams and query them in a continuous fashion;
- an overview (and a comparison ) of the most relevant available implementations, with a particular focus on C-SPARQL and SPARQLstream (that will be used during the hands-on sessions).
The event targets researchers and practitioners interested in approaching the topic of semantic stream processing and that want to understand the current state of the art as well as the future directions. For this reason, the expected background knowledge is on basic concepts of RDF and SPARQL. The technologies and topics on this tutorial are relevant for people from IoT and sensor communities, as well as social media, pervasive health, oil industry, etc, who produce massive amounts of streaming data.
09.00 - 09.30 Introduction to RDF-stream processing (30m) [slides] 09.30 - 10.30 The RSP-QL model (60m) [slides] 10.30 - 11.00 Coffee break 11.00 - 12.30 Implementations of RSP models (90m) [slides] 12.30 - 14.00 Lunch break 14.00 - 14.45 Publishing RDF streams on the Web (45m) [slides] 14.45 - 15.30 Hands-on session (45m) [slides] 15.30 - 16.00 Coffee break 16.00 - 17.00 RDF Stream Optimisation Techniques (60m) [slides] 17.00 - 17.30 Open research problems, wrap-up and discussion (30m) [slides]
Muhammad Intizar Ali, The Insight Centre for Data Analytics, National University of Ireland, Galway Jean-Paul Calbimonte, HES-SO Valais, Switzerland Daniele Dell’Aglio, Department of Informatics, University of Zurich & Dipartimento di Elettronica, Informatica e Bioingegneria, Politecnico di Milano Emanuele Della Valle, Dipartimento di Elettronica, Informatica e Bioingegneria, Politecnico di Milano Andrea Mauri, Dipartimento di Elettronica, Informatica e Bioingegneria, Politecnico di Milano