Tag Archives: conference

Linearize, predict and place: minimizing the makespan for edge-based stream processing of directed acyclic graphs

Khare, S., Sun, H., Gascon-Samson, J., Zhang, K., Gokhale, A., Barve, Y., Bhattacharjee, A. and Koutsoukos, X. (2019). Linearize, predict and place: minimizing the makespan for edge-based stream processing of directed acyclic graphs. Symposium on Edge Computing (SEC) 2019
[Preprint]

Abstract: Many IoT applications found in cyber-physical systems, such as smart grids, must take control actions in response to critical events, such as supply-demand mismatch, which requires low-latency processing of streaming data for rapid event detection and anomaly remediation. These streaming applications generally take the form of directed acyclic graphs (DAGs), where vertices represent operators and edges represent the flow of data between these operators. Edge computing has recently attracted significant attention as a means to readily meet the requirements of latency-critical IoT applications due to its ability to provide low-latency processing near the source of data. To accrue the benefits of edge computing, the constituent operators of these applications must be placed in a manner that intelligently trades off inter-operator communication costs against the cost of interference incurred due to co-location of operators on the same resource-constrained edge devices. To address these challenges and to substantially simplify the placement problem for DAGs of arbitrary sizes and topologies, we present an algorithm that first transforms any arbitrary stream processing DAG into an approximate set of linear chains. Subsequently, a data-driven latency prediction model for co-located linear chains is used to inform the placement of operators such that the makespan, defined as the maximum latency of all paths in the DAG, is minimized. We empirically evaluate our algorithm using a variety of DAG placement scenarios on a BeagleBone cluster, which is representative of an edge computing environment.
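
As a rough illustration of the makespan definition used in this abstract (the maximum latency over all paths in the DAG), the following Python sketch computes it for a toy graph. The operator names and per-operator latencies are invented for illustration, communication costs on edges are ignored, and this is not the paper's linearization or latency prediction algorithm.

```python
from functools import lru_cache

def makespan(latency, edges):
    """Return the maximum total latency over all paths in an acyclic graph.

    latency: dict mapping operator name -> processing latency (hypothetical values)
    edges:   dict mapping operator name -> list of downstream operator names
    """
    @lru_cache(maxsize=None)
    def longest_from(op):
        # Latency of `op` plus the slowest path that continues downstream.
        downstream = edges.get(op, [])
        tail = max((longest_from(nxt) for nxt in downstream), default=0.0)
        return latency[op] + tail

    return max(longest_from(op) for op in latency)

# Hypothetical four-operator DAG: src feeds filter and join, both feed sink.
LATENCY = {"src": 1.0, "filter": 2.0, "join": 4.0, "sink": 0.5}
EDGES = {"src": ["filter", "join"], "filter": ["sink"], "join": ["sink"]}
```

Here `makespan(LATENCY, EDGES)` follows the path src, join, sink, since that path has the largest summed latency.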

Failure Prediction in the Internet of Things due to Memory Exhaustion

Rafiuzzaman M., Gascon-Samson J., Pattabiraman K., Gopalakrishnan S. (2019) Failure Prediction in the Internet of Things due to Memory Exhaustion. 34th ACM Symposium on Applied Computing (SAC 2019), Limassol, Cyprus
> Acceptance ratio: 27.5% [Preprint] [Presentation Slides]

Abstract: We present a technique to predict failures resulting from memory exhaustion in devices built for the modern Internet of Things (IoT). These devices can run general-purpose applications on the network edge for local data processing to reduce latency, bandwidth and infrastructure costs, and to address data safety and privacy concerns. Applications are, however, not optimized for all devices and could result in sudden and unexpected memory exhaustion failures because of limited available memory on those IoT devices. Proactive prediction of such failures, with sufficient lead time, allows for adaptation of the application or its safe termination. Our memory failure prediction technique for applications running on IoT devices uses k-Nearest-Neighbor (kNN) based machine learning models. We have evaluated our technique using two third-party applications and a real-world IoT simulation application on two different IoT platforms and on an Amazon EC2 t2.micro instance for both single and multitenancy use cases. Our results indicate that our technique significantly outperforms simpler threshold-based techniques: in our test applications, with 180 seconds of lead time, failures were accurately predicted with 88% recall at 74% precision for a single application failure and 76% recall at 71% precision for multitenancy failure.
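
To make the kNN idea concrete, here is a minimal Python sketch of majority-vote classification over memory-usage samples. The two features (fraction of memory in use and its growth per minute), the labels, and all sample values are assumptions made for illustration; the paper's actual feature set and model configuration are not reproduced here.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (features, label) pairs; features are tuples of floats
    query: tuple of floats with the same length as the training features
    """
    by_distance = sorted(train, key=lambda sample: math.dist(sample[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical samples: (memory used fraction, growth per minute) -> outcome
# observed after the chosen lead time.
TRAIN = [
    ((0.40, 0.00), "ok"),
    ((0.55, 0.01), "ok"),
    ((0.60, 0.02), "ok"),
    ((0.70, 0.08), "fail"),
    ((0.80, 0.06), "fail"),
    ((0.85, 0.10), "fail"),
]
```

Unlike a fixed threshold on memory usage alone, the vote also reflects how quickly usage is growing, because growth is one of the features in the distance computation.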

Scalable Edge Computing for Low Latency Data Dissemination in Topic-Based Publish/Subscribe

Khare, S., Sun, H., Zhang, K., Gascon-Samson, J., Gokhale, A., Koutsoukos, X., Abdelaziz, H. (2018) Scalable Edge Computing for Low Latency Data Dissemination in Topic-Based Publish/Subscribe, 2018 IEEE/ACM Symposium on Edge Computing (SEC 2018), Seattle, WA, USA
[Preprint] [Presentation Slides]

Abstract: Advances in Internet of Things (IoT) give rise to a variety of latency-sensitive, closed-loop applications that reside at the edge. These applications often involve a large number of sensors that generate volumes of data, which must be processed and disseminated in real time to a potentially large number of entities for actuation, thereby forming a closed-loop, publish-process-subscribe system. To meet the response time requirements of such applications, this paper presents techniques to realize a scalable, fog/edge-based broker architecture that balances data publication and processing loads for topic-based, publish-process-subscribe systems operating at the edge, and assures the Quality-of-Service (QoS), specified as the 90th percentile latency, on a per-topic basis. The key contributions include: (a) a sensitivity analysis to understand the impact of features such as publishing rate, number of subscribers, per-sample processing interval and background load on a topic’s performance; (b) a latency prediction model for a set of co-located topics, which is then used for the latency-aware placement of topics on brokers; and (c) an optimization problem formulation for k-topic co-location to minimize the number of brokers while meeting each topic’s QoS requirement. Here, k denotes the maximum number of topics that can be placed on a broker. We show that the problem is NP-hard for k >= 3 and present three load balancing heuristics. Empirical results are presented to validate the latency prediction model and to evaluate the performance of the proposed heuristics.
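
The k-topic co-location problem can be illustrated with a generic first-fit pass, sketched below in Python. This is not claimed to be one of the paper's three heuristics. The `predicted_latency_ms` function is a hypothetical stand-in for the learned latency prediction model, and the topic names, loads, and QoS bounds are made up.

```python
def predicted_latency_ms(topic, colocated):
    """Hypothetical stand-in for a learned latency model: the topic's own
    base latency plus a penalty proportional to the total load on the broker."""
    total_load = sum(t["load"] for t in colocated)
    return topic["base_ms"] + 10.0 * total_load

def first_fit(topics, k):
    """Place topics on brokers with at most k topics per broker, adding a topic
    to the first broker where every topic's predicted latency stays within QoS.

    A topic that fits nowhere gets a new broker, even if it would miss its
    QoS bound when placed alone (this sketch does not handle that case).
    """
    brokers = []
    for topic in sorted(topics, key=lambda t: t["load"], reverse=True):
        for broker in brokers:
            candidate = broker + [topic]
            if len(candidate) <= k and all(
                predicted_latency_ms(t, candidate) <= t["qos_ms"] for t in candidate
            ):
                broker.append(topic)
                break
        else:
            brokers.append([topic])
    return brokers

# Hypothetical topics; topic A has a tight latency bound.
TOPICS = [
    {"name": "A", "load": 0.5, "base_ms": 5.0, "qos_ms": 12.0},
    {"name": "B", "load": 0.4, "base_ms": 5.0, "qos_ms": 20.0},
    {"name": "C", "load": 0.3, "base_ms": 5.0, "qos_ms": 20.0},
    {"name": "D", "load": 0.1, "base_ms": 5.0, "qos_ms": 20.0},
]
```

With `k=3`, topic A can only share a broker with the lightest topic D, so the pass ends with two brokers. First-fit gives no optimality guarantee, which is consistent with the problem being NP-hard for k >= 3.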

ThingsMigrate: Platform-Independent Migration of Stateful JavaScript IoT Applications

Gascon-Samson, J., Jung, K., Goyal, S., Rezaiean-Asel, A., Pattabiraman, K. (2018) ThingsMigrate: Platform-Independent Migration of Stateful JavaScript IoT Applications, ECOOP 2018, Amsterdam, Netherlands [Preprint] [Presentation Slides] [Poster]

Abstract: The Internet of Things (IoT) has gained wide popularity both in academic and industrial contexts. As IoT devices become increasingly powerful, they can run more and more complex applications written in higher-level languages, such as JavaScript. However, by their nature, IoT devices are subject to resource constraints, which require applications to be dynamically migrated between devices (and the cloud). Further, IoT applications are also becoming more stateful, and hence we need to save their state during migration transparently to the programmer. In this paper, we present ThingsMigrate, a middleware providing VM-independent migration of stateful JavaScript applications across IoT devices. ThingsMigrate captures and reconstructs the internal JavaScript program state by instrumenting application code before run time, without modifying the underlying Virtual Machine (VM), thus providing platform and VM-independence. We evaluated ThingsMigrate against standard benchmarks, and over two IoT platforms and a cloud-like environment. We show that it can successfully migrate even highly CPU-intensive applications, with acceptable overheads (about 30%), and supports multiple migrations.

ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security

Aliabadi, M., Kamath, A., Gascon-Samson, J., Pattabiraman, K. (2017) ARTINALI: Dynamic Invariant Detection for Cyber-Physical System Security, ESEC/FSE 2017, Paderborn, Germany
> Acceptance ratio: 24% [Preprint] [Presentation Slides]

Abstract: Cyber-Physical Systems (CPSes) are being widely deployed in security-critical scenarios such as smart homes and medical devices. Unfortunately, the connectedness of these systems and their relative lack of security measures make them ripe targets for attacks. Specification-based Intrusion Detection Systems (IDSes) have been shown to be effective for securing CPSes. Unfortunately, deriving invariants for capturing the specifications of CPSes is a tedious and error-prone process. Therefore, it is important to dynamically monitor the CPS to learn its common behaviors and formulate invariants for detecting security attacks. Existing techniques for invariant mining only incorporate data and events, but not time. However, time is central to most CPSes, and hence incorporating time in addition to data and events is essential for achieving low false positives and false negatives. This paper proposes ARTINALI, which mines dynamic system properties by incorporating time as a first-class property of the system. We build ARTINALI-based IDSes for two CPSes, namely smart meters and smart medical devices, and measure their efficacy. We find that the ARTINALI-based IDSes significantly reduce the ratio of false positives and false negatives by 16 to 48% (average 30.75%) and 89 to 95% (average 93.4%) respectively over other dynamic invariant detection tools.
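
The role of time in an invariant can be shown with a small Python sketch that learns the minimum and maximum delay between two events from attack-free traces and then flags traces that fall outside those bounds. This is a deliberately simple illustration of a time-augmented invariant, not ARTINALI's mining algorithm; the event names and timestamps are hypothetical.

```python
def pair_delays(trace, first, second):
    """Yield the delay between each `first` event and the next `second` event.

    trace: list of (timestamp_seconds, event_name) pairs sorted by timestamp
    """
    start = None
    for timestamp, event in trace:
        if event == first:
            start = timestamp
        elif event == second and start is not None:
            yield timestamp - start
            start = None

def mine_time_bounds(traces, first, second):
    """Learn (min, max) delay from training traces.

    Assumes the event pair occurs at least once in the training traces.
    """
    delays = [d for trace in traces for d in pair_delays(trace, first, second)]
    return min(delays), max(delays)

def violates(trace, first, second, bounds):
    """Return True if any observed delay falls outside the learned bounds."""
    low, high = bounds
    return any(not low <= d <= high for d in pair_delays(trace, first, second))

# Hypothetical smart-meter traces recorded during normal operation.
TRAINING = [
    [(0, "read_sensor"), (2, "send_report")],
    [(10, "read_sensor"), (13, "send_report")],
]
```

An invariant over events alone would accept any trace in which `send_report` follows `read_sensor`; the learned bounds additionally reject a report that arrives far later than ever observed during training.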

MultiPub: Latency and Cost-Aware Global-Scale Cloud Publish/Subscribe

Gascon-Samson, J., Kemme, B., Kienzle, J. (2017) MultiPub: Latency and Cost-Aware Global-Scale Cloud Publish/Subscribe, ICDCS 2017, Atlanta, USA [Preprint] [Presentation Slides]

Abstract: Topic-based pub/sub is a widely used communication mechanism in distributed systems for targeted information dissemination between loosely coupled entities. To scale dynamically depending on the current communication demands, pub/sub services can be conveniently deployed in the cloud. To provide fast dissemination, the service can be distributed across multiple cloud regions. The architectural design and run-time deployment of such a middleware is tricky, though, as it can have a significant effect on communication latency and cloud-based cost. In this paper, we propose MultiPub, a flexible pub/sub middleware for latency-constrained, world-wide distributed applications that dynamically reconfigures the communication layer to ensure a predefined maximum latency for publication dissemination while minimizing cloud-based costs. This is achieved by routing publications either through a single or across multiple cloud regions. We demonstrate the effectiveness of MultiPub by presenting a set of experiments that report on the achieved communication latency and cost savings compared to traditional approaches, as well as a performance evaluation.
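
The trade-off described in this abstract, meeting a maximum latency at the lowest cost, can be sketched as a selection over candidate routing configurations. The Python snippet below assumes that latency and cost estimates for each configuration are already available; the region names and numbers are hypothetical, and how MultiPub actually derives and evaluates configurations is not shown.

```python
def pick_configuration(candidates, max_latency_ms):
    """Return the cheapest configuration whose estimated latency meets the bound,
    or None if no candidate is fast enough."""
    feasible = [c for c in candidates if c["latency_ms"] <= max_latency_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c["cost"])

# Hypothetical estimates for one topic: two single-region options and one
# multi-region option that is faster but more expensive.
CANDIDATES = [
    {"regions": ("us-east",), "latency_ms": 180, "cost": 1.0},
    {"regions": ("eu-west",), "latency_ms": 150, "cost": 1.2},
    {"regions": ("us-east", "eu-west"), "latency_ms": 90, "cost": 2.1},
]
```

With a 160 ms bound the single-region `eu-west` option is selected; tightening the bound to 100 ms forces the multi-region option despite its higher cost.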

Dynamoth: A Scalable Pub/Sub Middleware for Latency-Constrained Applications in the Cloud

Gascon-Samson, J., Garcia, F.-P., Kemme, B., Kienzle, J. (2015) Dynamoth: A Scalable Pub/Sub Middleware for Latency-Constrained Applications in the Cloud, ICDCS 2015, Columbus, USA
> Acceptance ratio: 12.8% [Preprint] [Presentation Slides]

Abstract: This paper presents Dynamoth, a dynamic, scalable, channel-based pub/sub middleware targeted at large-scale, distributed and latency-constrained systems. Our approach provides a software layer that balances the load generated by a high number of publishers, subscribers and messages across multiple, standard pub/sub servers that can be deployed in the Cloud. In order to optimize Cloud infrastructure usage, pub/sub servers can be added or removed as needed. Balancing takes into account the live characteristics of each channel and is done in a hierarchical manner across channels (macro) as well as within individual channels (micro) to maintain acceptable performance and low latencies despite highly varying conditions. Load monitoring is performed in an unintrusive way, and rebalancing employs a lazy approach in order to minimize its temporal impact on performance while ensuring successful and timely delivery of all messages. Extensive real-world experiments that illustrate the practicality of the approach within a massively multiplayer game setting are presented. Results indicate that with a given number of servers, Dynamoth was able to handle 60% more simultaneous clients than the consistent hashing approach, and that it was properly able to deal with highly varying conditions in the context of large workloads.
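
A single channel-level ("macro") rebalancing step could look roughly like the Python sketch below: if the most loaded server exceeds an assumed capacity, its busiest channel moves to the least loaded server. The capacity threshold, channel names, and load figures are hypothetical, and neither Dynamoth's lazy migration protocol nor its within-channel ("micro") balancing is modeled here.

```python
def rebalance_step(servers, assignment, channel_load, capacity):
    """Move the busiest channel off the most loaded server if it is over capacity.

    servers:      list of server names
    assignment:   dict channel -> server (modified in place)
    channel_load: dict channel -> messages per second (hypothetical measurements)
    capacity:     messages per second a server is assumed to handle comfortably
    Returns (channel, source, target) for the move made, or None if no move.
    """
    load = {server: 0 for server in servers}
    for channel, server in assignment.items():
        load[server] += channel_load[channel]

    busiest = max(load, key=load.get)
    target = min(load, key=load.get)
    if load[busiest] <= capacity or target == busiest:
        return None

    hosted = [c for c, s in assignment.items() if s == busiest]
    channel = max(hosted, key=channel_load.get)
    assignment[channel] = target
    return channel, busiest, target

# Hypothetical game channels on two pub/sub servers.
SERVERS = ["s1", "s2"]
ASSIGNMENT = {"lobby": "s1", "zone-a": "s1", "zone-b": "s2"}
CHANNEL_LOAD = {"lobby": 500, "zone-a": 300, "zone-b": 100}
```

In contrast to consistent hashing, which maps channels to servers independently of their traffic, this step reacts to measured load, which is the property the abstract attributes to Dynamoth's balancing.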

Monitoring Large-Scale Location-Based Information Systems

Khan, H., Gascon-Samson, J., Kienzle, J., Kemme, B. (2015) Monitoring Large-Scale Location-Based Information Systems, IPDPS 2015, Hyderabad, India
> Acceptance ratio: 22% [Preprint]

Abstract: Monitoring the state of a distributed virtual world is challenging for several reasons: 1) the distributed information must be gathered in real time without affecting the performance of the information system, 2) in large-scale systems it is impossible for a single node to collect and process all the data, 3) the vast information must be filtered and aggregated according to what the human observer wants to focus on, and 4) the point of interest of the observer can change frequently. In this paper we present and evaluate a non-intrusive monitoring middleware that addresses these challenges by dynamically partitioning the geographic map (e.g., of the virtual world or the game) in terms of map objects and (expected) state changes. We assign a different collector node to each of these partitions to collect and pre-process the data, and forward it to a central monitoring node. Furthermore, we provide mechanisms to efficiently filter and aggregate location changes, the predominant changes in location-based information systems. We describe a specific monitoring setup that takes advantage of the replication model that is common in many virtual worlds and multiplayer games to collect the data. Finally, we present extensive performance results that show the trade-offs between scalability, precision, and real-time performance.

Watchmen: Scalable Cheat-Resistant Support for Distributed Multi-player Online Games

Yahyavi, A., Huguenin, K., Gascon-Samson, J., Kienzle, J., Kemme, B. (2013) Watchmen: Scalable Cheat-Resistant Support for Distributed Multi-player Online Games, ICDCS 2013, Philadelphia, USA
> Acceptance ratio: 13% [Preprint]

Abstract: Multi-player online games are inherently distributed applications, and a wide range of distributed architectures have been proposed. However, only a few successful commercial systems follow such approaches, even given their benefits, due to one main hurdle: the ease with which cheaters can disrupt the game state computation and dissemination, perform illegal actions, or unduly gain access to sensitive information. The challenge is that any measures used to address cheating must meet the heavy scalability and tight latency requirements of fast-paced games. We propose Watchmen, the first distributed scalable protocol designed with cheat detection and prevention in mind that supports fast-paced games. It is based on a randomized dynamic proxy scheme for both the dissemination and verification of actions. Furthermore, Watchmen reduces the information exposed to players close to the minimum required to render the game. We build our proof-of-concept prototype on top of Quake III. We show that Watchmen, while scaling to hundreds of players and meeting the tight latency requirements of first-person shooter games, is able to significantly reduce opportunities to cheat, even in the presence of collusion.
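
One way to picture a randomized dynamic proxy scheme is a proxy assignment that changes every round and that any node can recompute from shared inputs. The Python sketch below is only an assumption about what such an assignment might look like: the shared seed, the use of SHA-256, and the round granularity are all hypothetical, and the way Watchmen actually selects and verifies proxies is described in the paper, not here.

```python
import hashlib

def proxy_for(player, players, round_number, seed):
    """Pick a pseudo-random proxy for `player` that every node can recompute.

    The proxy is never the player itself. Requires at least two players.
    `seed` is a hypothetical value shared by all nodes.
    """
    others = sorted(p for p in players if p != player)
    digest = hashlib.sha256(f"{seed}:{round_number}:{player}".encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(others)
    return others[index]

PLAYERS = ["alice", "bob", "carol", "dave"]
```

Because the assignment depends on the round number, a player cannot know far in advance which peer will relay and check its actions, which is the intuition behind making collusion harder.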