High-Level Edge Computing Infrastructure for the IoT

Context

The Internet of Things (IoT) is a network that interconnects physical devices. Most of these devices are designed to interact with the physical world and fall under the categories of sensors (i.e., devices that collect data from the physical world) and actuators (i.e., devices that perform actions to alter the state of the physical world). IoT devices are also widely heterogeneous: while some devices are on the low end of the spectrum (e.g., a sensor built using a simple microcontroller), high-end devices can run full multithreaded operating systems (e.g., Raspberry Pi or BeagleBone platforms can run Linux or Android). The IoT landscape has experienced phenomenal growth – a recent study from Ericsson forecasts that there will be over 29 billion IoT devices by 2022.

IoT devices generate large amounts of data that need to be processed to support decision making. In a traditional cloud-centric model, processing takes place in cloud data centers, based on the data collected by sensors. However, this cloud-centric approach has significant limitations for delay-sensitive and critical applications, such as high and unpredictable latencies and the dependence on a high-bandwidth connection to a centralized remote infrastructure. Edge computing, an emergent paradigm, provides a model in which portions of the data processing are moved towards, or directly onto, the devices themselves, in combination with resources provided by other layers (e.g., fog, cloud).

Research Projects

Multi-Dimensional Edge Adaptation and Services
MQTT Edge Communication Infrastructures
Distributed Stream Processing at the Edge
Interoperability of IoT/Edge Applications through the Web-of-Things
WebAssembly Code Migration for the IoT

1. Multi-Dimensional Edge Adaptation and Services

High-level IoT/edge applications are characterized by various dimensions, such as (1) their computational requirements in terms of resources (e.g., processing power, memory), (2) their input and output data requirements (i.e., which data sources they need to access to perform their computations, and what data they generate), and (3) their requirements in terms of network communications and connectivity (e.g., latency, bandwidth, jitter in relation to the network itself, and towards other components). For applications to execute in a reliable and performant manner, a system adaptation policy that considers these different dimensions together is of utmost importance.

High-Level Objective 1: come up with a multi-dimensional load balancing policy that can autonomously schedule the execution of high-level applications on edge and cloud devices. The following research questions will be addressed:

Where to deploy the different components of the high-level IoT/edge applications executing within the system?
Where to deploy the data read and written by the edge applications?
How to optimize the communication overlay to ensure that network constraints are met?
How should rebalancing be done when conditions change?
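To make the first research question more concrete, the following is a minimal sketch of a greedy placement policy that considers the resource and network dimensions together. It is only an illustration under stated assumptions: the `Node` and `Component` fields, the scoring formula and the example values are all hypothetical, and a real policy would also need to handle data placement and rebalancing.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A device that can host components (all fields are illustrative)."""
    name: str
    cpu_free: float        # available CPU cores
    mem_free_mb: float     # available memory, in MB
    latency_ms: float      # latency between the data source and this node
    bandwidth_mbps: float  # bandwidth between the data source and this node


@dataclass
class Component:
    """One component of a high-level application and its requirements."""
    name: str
    cpu: float
    mem_mb: float
    max_latency_ms: float
    min_bandwidth_mbps: float


def feasible(c: Component, n: Node) -> bool:
    """A node is feasible only if every dimension is satisfied at once."""
    return (n.cpu_free >= c.cpu
            and n.mem_free_mb >= c.mem_mb
            and n.latency_ms <= c.max_latency_ms
            and n.bandwidth_mbps >= c.min_bandwidth_mbps)


def score(c: Component, n: Node) -> float:
    """Lower is better: fraction of free CPU, free memory and latency budget used."""
    cpu_part = c.cpu / n.cpu_free if n.cpu_free else 0.0
    mem_part = c.mem_mb / n.mem_free_mb if n.mem_free_mb else 0.0
    return cpu_part + mem_part + n.latency_ms / c.max_latency_ms


def place(components, nodes):
    """Greedily assign each component to the feasible node with the best score."""
    placement = {}
    for c in components:
        candidates = [n for n in nodes if feasible(c, n)]
        if not candidates:
            placement[c.name] = None  # no node satisfies all constraints
            continue
        best = min(candidates, key=lambda n: score(c, n))
        best.cpu_free -= c.cpu        # reserve the resources on the chosen node
        best.mem_free_mb -= c.mem_mb
        placement[c.name] = best.name
    return placement


# Hypothetical scenario: one edge device and one cloud VM.
nodes = [
    Node("pi", cpu_free=2, mem_free_mb=512, latency_ms=5, bandwidth_mbps=50),
    Node("cloud", cpu_free=32, mem_free_mb=65536, latency_ms=80, bandwidth_mbps=1000),
]
components = [
    Component("filter", cpu=0.5, mem_mb=64, max_latency_ms=20, min_bandwidth_mbps=10),
    Component("aggregate", cpu=1, mem_mb=128, max_latency_ms=100, min_bandwidth_mbps=10),
    Component("training", cpu=8, mem_mb=4096, max_latency_ms=500, min_bandwidth_mbps=100),
]
placement = place(components, nodes)
print(placement)
```

In this scenario the latency-sensitive component can only run on the edge device, while the compute-heavy one can only run in the cloud. A greedy pass like this one never revisits earlier decisions, which is precisely why the rebalancing question above matters.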

Further, most IoT/edge applications interact with the cloud and consume services provided by centralized cloud infrastructures. These services provide convenience and flexibility, and are offered across different models: Infrastructure as a Service (IaaS, which provides virtual machines and containers), Platform as a Service (PaaS, which provides developer-oriented services) and Software as a Service (SaaS, which provides user-oriented services).

Given the computational power of recent edge devices, and the limitations of cloud-only models, some of these services could be offered in a hybrid “cloud-edge” model, and would benefit from the processing capabilities of recent IoT devices (e.g., Raspberry Pi), in combination with the cloud.

High-Level Objective 2: develop a set of high-level services (IaaS, PaaS, SaaS) that can be overlaid onto IoT and cloud devices, and benefit from the capabilities of each device. Examples of services include:

Infrastructure as a Service: micro-virtualization (deploying virtual machines and containers onto cloud and edge/IoT devices), isolating different applications running within the same context
Platform as a Service: distributed file systems, NoSQL and SQL databases, communication overlays
Software as a Service: stream processing, Web-of-Things

2. Decoupled Edge Communication Infrastructures

Publish/subscribe, as implemented by protocols such as MQTT, is an efficient yet flexible communication paradigm that allows different applications to exchange data in a decoupled manner, using message queue abstractions. MQTT enjoys widespread popularity in many contexts, and in particular in an IoT setting. MQTT brokers (which handle communication flows) are typically provided as a service in a centralized location (i.e., in the cloud). However, given the tight latency requirements of many IoT applications, a centralized model might not be the best option, as high latencies can be expected. A distributed edge-based model can be more suitable.

High-Level Objective: propose a distributed MQTT (publish/subscribe) infrastructure that can be overlaid onto the IoT/edge devices themselves so as to benefit from the processing and networking capabilities of these devices, and compare it against traditional cloud-centric approaches. Challenges include:

Handling the latency, jitter and bandwidth constraints of the devices and applications
Considering static and dynamic deployments (load balancing)
Exploring other flavors of publish/subscribe and their relevance to the IoT landscape (e.g., content-based, graph-based)
Considering different widely used publish/subscribe protocols and middleware and adapting their theoretical model to the requirements of IoT/edge applications (e.g., MQTT, XMPP, AMQP)
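The decoupling that publish/subscribe provides can be illustrated with a minimal, in-process sketch of a topic-based broker. This is a toy, not an MQTT implementation: the `Broker` class and the example topics are hypothetical, there is no networking, QoS or retained messages, and the matching function only covers the basic behaviour of the MQTT `+` (single-level) and `#` (multi-level) wildcards.

```python
from collections import defaultdict


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT-style matching: '+' matches one level, '#' the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True              # matches all remaining levels
        if i >= len(topic_levels):
            return False             # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class Broker:
    """Toy broker: publishers and subscribers only know topics, not each other."""

    def __init__(self):
        self.subscriptions = defaultdict(list)  # topic filter -> callbacks

    def subscribe(self, topic_filter, callback):
        self.subscriptions[topic_filter].append(callback)

    def publish(self, topic, payload):
        """Deliver the payload to every matching subscriber; return the count."""
        delivered = 0
        for topic_filter, callbacks in self.subscriptions.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)
                    delivered += 1
        return delivered


# Hypothetical usage: one subscriber interested in all room temperatures.
broker = Broker()
received = []
broker.subscribe("building/+/temperature", lambda t, p: received.append((t, p)))

broker.publish("building/room1/temperature", 21.5)  # delivered
broker.publish("building/room1/humidity", 40)       # no matching subscription
print(received)
```

In a distributed edge deployment, several such brokers would run on different devices and forward messages to each other, which is where the latency, load balancing and overlay questions listed above come in.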

3. Distributed Stream Processing at the Edge

IoT devices and applications are becoming more and more popular. They produce a large amount of raw data that must be processed and consumed with low delay. The processing steps that support decision-making typically include operations such as filtering, aggregating, averaging, publishing, persisting, etc. Stream processing allows for mapping the various steps of the processing pipeline into Directed Acyclic Graph (DAG)-like abstractions: each step (operator) is represented by a node, and the edges model the flow of data between the different operators, which in turn eases the distributed deployment and the scalability of the different operators across different target devices.
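The operator graph described above can be sketched in a few lines. This is a single-process illustration only: the `Operator` class, the helper functions and the sensor readings are hypothetical and do not correspond to any particular stream processing framework. Each operator is a node, `connect` adds an edge, and items are pushed along the edges.

```python
class Operator:
    """A node of the processing graph; fn maps one input to zero or more outputs."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.downstream = []   # outgoing edges of the DAG

    def connect(self, other):
        """Add an edge from this operator to another; returns the target."""
        self.downstream.append(other)
        return other

    def push(self, item):
        for out in self.fn(item):
            for op in self.downstream:
                op.push(out)


def filter_op(name, predicate):
    """Stateless operator that drops items failing the predicate."""
    return Operator(name, lambda x: [x] if predicate(x) else [])


def window_average(name, size):
    """Stateful operator that emits the average of every `size` items."""
    buffer = []

    def fn(x):
        buffer.append(x)
        if len(buffer) == size:
            avg = sum(buffer) / size
            buffer.clear()
            return [avg]
        return []

    return Operator(name, fn)


def sink(name, store):
    """Terminal operator that persists items into a list."""
    def fn(x):
        store.append(x)
        return []

    return Operator(name, fn)


# Pipeline: drop invalid readings -> average pairs of readings -> persist.
results = []
source = filter_op("drop-invalid", lambda x: x > -100)
source.connect(window_average("avg-2", size=2)).connect(sink("store", results))

for reading in [20.0, -999.0, 22.0, 21.0, 23.0, -999.0]:
    source.push(reading)
print(results)
```

Because each operator only communicates through its edges, any edge could in principle be replaced by a network link, so that the filter runs on the sensor node and the aggregation on a nearby edge device.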

Stream processing applications are typically deployed in centralized (e.g., cloud-based) environments. However, IoT devices and applications often require low latencies and might not always be connected in a reliable manner to the cloud; therefore, we will be exploring how edge-based deployments of stream processing pipelines (i.e., directly on the IoT nodes that are part of the edge overlay) can increase the performance of IoT applications (e.g., decreasing latencies and costs, increasing throughput, etc.).

High-Level Objective: advance the state of the art in the area of distributed stream processing by proposing models and architectures that can be deployed onto the overlay of edge devices. Challenges include:

Handling the network-related constraints (latency, bandwidth, jitter), as well as the processing constraints (CPU, memory) of the IoT/edge devices
Considering the characteristics (e.g., processing time, selectivity ratio, semantics, replication) of the various operators
Considering the performance requirements of the various steps of the pipeline
Considering static and dynamic deployments

4. Interoperability of IoT/Edge Applications through the Web-of-Things

The Web of Things (WoT) aims at bridging the gap between the IoT and the Web by exposing the devices and applications, as well as the plethora of data, events and actions that they provide through open APIs and standards. To that end, open APIs such as the Mozilla Web Thing API can be used to standardize access to the data, events and actions exposed by the various components of high-level distributed IoT/edge applications.

High-Level Objective: leverage the Mozilla Web Thing API (or other similar open APIs) to model high-level abstractions in IoT/edge applications. Potential sub-projects include:

Developing a model for exposing data/actions/events produced by high-level IoT/edge applications through different hierarchical layers, given that different applications might be interested in different levels of granularity / aggregation.
Coming up with domain-specific languages for expressing the high-level data flows, actions and events of high-level IoT/edge applications.
Developing relevant use cases and applications.
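As an illustration of the first sub-project, the sketch below exposes the same data at two levels of granularity: individual sensors and an aggregated group. The dictionary layout (with `properties`, `actions` and `events` keys) is only loosely inspired by the general shape of a thing description and is an assumption, not the actual schema of the Mozilla Web Thing API. In particular, embedding the current value directly in the description is a simplification, and all identifiers are hypothetical.

```python
from statistics import mean


def describe_sensor(sensor_id, temperature):
    """Finest granularity: one description per physical device."""
    return {
        "id": sensor_id,
        "properties": {
            "temperature": {"type": "number", "unit": "degree celsius",
                            "value": temperature},
        },
        "actions": {},
        "events": {},
    }


def describe_group(group_id, members):
    """Coarser granularity: one description aggregating several things."""
    temps = [m["properties"]["temperature"]["value"] for m in members]
    return {
        "id": group_id,
        "properties": {
            "temperature": {"type": "number", "unit": "degree celsius",
                            "value": mean(temps)},
        },
        "actions": {},
        "events": {},
        "members": [m["id"] for m in members],  # links to the finer layer
    }


# Hypothetical hierarchy: two sensors aggregated into one room.
s1 = describe_sensor("sensor-1", 20.0)
s2 = describe_sensor("sensor-2", 22.0)
room = describe_group("room-101", [s1, s2])
print(room["properties"]["temperature"]["value"], room["members"])
```

Since a group has the same shape as a sensor, groups can themselves be grouped (rooms into floors, floors into buildings), letting each consuming application pick the layer that matches the granularity it needs.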

5. WebAssembly Code Migration for the IoT

Given the limited processing capabilities of IoT/edge devices, these devices can quickly run out of resources. Further, considering our goal of executing high-level applications directly on these devices, prompt adaptation is needed to prevent potential failures. In prior work, we developed a technique (ThingsMigrate) for migrating JavaScript applications between two JavaScript virtual machines (VMs) running on arbitrary IoT devices, with preservation of the state and without modifying the VM.

To make our migration approach more generic, it would be interesting to explore how migration can be supported for other languages. A potential approach would be to leverage WebAssembly (WASM), which provides a binary (compiled) language that can execute on different platforms. Given that several compilers can output WASM for different input languages, implementing migration of WASM code could open the door to supporting the migration of language-agnostic high-level IoT/edge applications.

High-Level Objective: develop techniques and algorithms for enabling the migration of generic WASM IoT applications between different WASM virtual machines on arbitrary devices, with preservation of the state.
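The idea of migrating a running program with preservation of its state can be illustrated with a toy example. The sketch below is neither ThingsMigrate nor a WASM runtime: `TinyVM` is a hypothetical stack machine with three instructions, used only to show that if the execution state (here, the program counter and the operand stack) can be serialized, execution can be paused on one device and resumed on another.

```python
import json


class TinyVM:
    """Toy stack machine whose full execution state is serializable."""

    def __init__(self, program, pc=0, stack=None):
        self.program = program        # list of instructions, e.g. ["push", 2]
        self.pc = pc                  # index of the next instruction
        self.stack = list(stack or [])

    def done(self):
        return self.pc >= len(self.program)

    def step(self):
        op, *args = self.program[self.pc]
        if op == "push":
            self.stack.append(args[0])
        elif op == "add":
            b, a = self.stack.pop(), self.stack.pop()
            self.stack.append(a + b)
        elif op == "mul":
            b, a = self.stack.pop(), self.stack.pop()
            self.stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
        self.pc += 1

    def run(self, max_steps=None):
        """Run to completion, or for at most max_steps instructions."""
        steps = 0
        while not self.done() and (max_steps is None or steps < max_steps):
            self.step()
            steps += 1

    def snapshot(self):
        """Serialize the code and the execution state for transfer."""
        return json.dumps({"program": self.program, "pc": self.pc,
                           "stack": self.stack})

    @classmethod
    def restore(cls, blob):
        """Rebuild a VM on the target device from a snapshot."""
        state = json.loads(blob)
        return cls(state["program"], state["pc"], state["stack"])


# Computes (2 + 3) * 4, interrupted after three instructions.
program = [["push", 2], ["push", 3], ["add"], ["push", 4], ["mul"]]

source_vm = TinyVM(program)
source_vm.run(max_steps=3)           # partial execution on the source device
blob = source_vm.snapshot()          # state that would be sent over the network

target_vm = TinyVM.restore(blob)     # resumed on the target device
target_vm.run()
print(target_vm.stack)
```

A real WASM migration would have to capture far more state (linear memory, globals, call frames, host resources), and doing so without modifying the virtual machine is part of what makes the objective above challenging.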

Additional Information

Required Knowledge and Skills:

Undergraduate or Master’s degree in Computer Science, Computer Engineering, Software Engineering or related discipline, and very good academic grades.
Excellent programming skills. Experience with developing systems software (e.g., operating systems kernels, compilers, client-server applications, web applications, middleware, Linux/shell programming) is an asset.
Research experience in the form of internships, research projects and/or papers in international venues is an asset.
Ability to work independently and be self-driven, with a passion for research.

Programs of studies:

PhD (will be prioritized)
Master’s with thesis (M.Sc)
Master’s with project (M.Eng)

Funding

Competitive funding is available for PhD applicants with a strong track record. Master’s with Thesis (MSc) applicants with strong credentials will also be given consideration.

How to apply?

Please read and follow the instructions for prospective students. Please note that I do receive a large volume of emails so it might take a while before I get the chance to look at your application.

Julien Gascon-Samson, ÉTS Montréal

Research Projects for Grad Students