stream processing | Julien Gascon-Samson, ÉTS Montréal

Mohtadi, A., Gascon-Samson, J. (2020). Poster: Dependency-Aware Operator Placement of Distributed Stream Processing IoT Applications Deployed at the Edge. Symposium on Edge Computing (SEC) 2020, Demos and Posters
[Preprints] [Poster as a presentation]

Abstract: In the last few years, the number of IoT applications that rely on stream processing has increased significantly. These applications process continuous streams of data with low delay and provide valuable information. To meet their stringent latency requirements and their need for real-time results, the components of the stream processing pipeline can be deployed directly onto the edge layer to benefit from the resources and capabilities that the swarm of edge devices can provide. In this poster, we outline some ongoing research ideas on deploying stream processing operators onto edge nodes, with the goal of minimizing latency while ensuring that the constraints of the devices and their network capabilities are respected. More precisely, we provide a model of the operator semantics that considers the interactions between different operators, the parallelism of concurrent operators, as well as latency and bandwidth usage.

Khare, S., Sun, H., Gascon-Samson, J., Zhang, K., Gokhale, A., Barve, Y., Bhattacharjee, A. and Koutsoukos, X. (2019). Linearize, predict and place: minimizing the makespan for edge-based stream processing of directed acyclic graphs. Symposium on Edge Computing (SEC) 2019
[Preprint]

Abstract: Many IoT applications found in cyber-physical systems, such as smart grids, must take control actions in response to critical events, such as supply-demand mismatch, which requires low-latency processing of streaming data for rapid event detection and anomaly remediation. These streaming applications generally take the form of directed acyclic graphs (DAGs), where vertices represent operators and edges represent the flow of data between these operators. Edge computing has recently attracted significant attention as a means to readily meet the requirements of latency-critical IoT applications due to its ability to provide low-latency processing near the source of data. To accrue the benefits of edge computing, the constituent operators of these applications must be placed in a manner that intelligently trades off inter-operator communication costs against the cost of interference incurred due to co-location of operators on the same resource-constrained edge devices. To address these challenges and to substantially simplify the placement problem for DAGs of arbitrary sizes and topologies, we present an algorithm that first transforms any arbitrary stream processing DAG into an approximate set of linear chains. Subsequently, a data-driven latency prediction model for co-located linear chains is used to inform the placement of operators such that the makespan, defined as the maximum latency of all paths in the DAG, is minimized. We empirically evaluate our algorithm using a variety of DAG placement scenarios on a BeagleBone cluster, which is representative of an edge computing environment.
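
As an illustration of the makespan definition used in this abstract (the maximum latency over all paths in the DAG), here is a minimal Python sketch. It is not the paper's algorithm: it only evaluates the makespan of a given DAG and does no linearization, prediction, or placement. The operator names, the per-operator latencies, and the `makespan` function are hypothetical, and the sketch assumes fixed operator latencies with no communication or co-location interference costs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def makespan(latency, edges):
    """Return the largest total latency over all paths of a DAG.

    latency: dict mapping operator name -> processing latency (assumed fixed).
    edges:   iterable of (upstream, downstream) operator pairs.
    Communication and interference costs are deliberately ignored here.
    """
    # Build a predecessor map, the input format TopologicalSorter expects.
    preds = {op: set() for op in latency}
    for src, dst in edges:
        preds[dst].add(src)

    # Visit operators in topological order; an operator finishes after its
    # slowest predecessor finishes plus its own processing latency.
    finish = {}
    for op in TopologicalSorter(preds).static_order():
        start = max((finish[p] for p in preds[op]), default=0.0)
        finish[op] = start + latency[op]

    # The makespan is the latest finish time, i.e. the longest path.
    return max(finish.values())


# Hypothetical diamond-shaped pipeline with made-up latencies (milliseconds).
latency = {"source": 2.0, "filter": 3.0, "aggregate": 5.0, "sink": 1.0}
edges = [
    ("source", "filter"),
    ("source", "aggregate"),
    ("filter", "sink"),
    ("aggregate", "sink"),
]
print(makespan(latency, edges))  # longest path: source -> aggregate -> sink
```

In this toy pipeline the path through `aggregate` dominates, so reducing the latency of `filter` alone would not change the makespan. This is why a placement that targets the makespan has to reason about whole paths rather than individual operators.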