Ali FAHS, PhD student in the Myriads team, will defend his thesis on Wednesday, December 16th at 9 am by videoconference from the Petri Turing room (attendance limited to members of the jury).
You will be able to follow the defense on YouTube: https://youtu.be/maBCu3aEnyQ
Although cloud computing offers a reliable solution for a wide range of applications, this paradigm cannot fulfill all the requirements of a family of emerging applications. Most notably, latency-sensitive applications require their requests to be answered within tight latency bounds. Fog computing, on the other hand, extends the cloud data centers with additional resources located in the vicinity of the end users. This enables latency-sensitive applications to process their requests without sending them to the cloud and, as a result, to receive a reply with very low network latency. Fog computing platforms are an aggregation of three layers: the end users' devices located at the edge of the network; the edge layer, composed of fog nodes which offer ultra-low latencies to the end users; and, finally, the traditional data centers.
Providing low user-to-resource latency is one of the fundamental objectives of fog computing. This is achieved by carefully placing hardware and software resources at the edge of the network. Starting from a geo-distributed user base, the fog nodes are geo-distributed as well to grant each user resources in their immediate vicinity. Similarly, fog applications should place their replicas carefully so each user has access to a nearby application replica.
This thesis addresses the specific needs of replicated service-oriented applications. These applications may create functionally-equivalent service replicas that are scattered across the fog nodes, allowing a consistently low user-to-replica latency.
Optimizing network latency in the edge layer for such an application model requires one to implement proximity-aware mechanisms within a mature container orchestration engine. One needs to estimate the latencies between the fog nodes, route the end users' requests to nearby replicas, detect the sources of traffic and provide them with a nearby replica, update the placement of the replicas when the workload changes and, finally, scale the replica set to guarantee consistent performance at the lowest possible cost.
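To illustrate the first of these mechanisms, here is a minimal sketch of a Vivaldi-style network coordinate update in two dimensions, which lets each node estimate its latency to any other node without measuring it directly. The dimensionality, damping factor, and coordinates are illustrative assumptions, not the implementation used in the thesis:

```python
import math

def vivaldi_update(xi, xj, rtt_ms, delta=0.25):
    """One Vivaldi update step for node i against node j.

    Moves coordinate xi so that the Euclidean distance to xj better
    matches the measured round-trip time (in ms). delta is an
    illustrative damping factor controlling the step size.
    """
    dist = math.dist(xi, xj)
    if dist == 0:
        # Coincident points: pick an arbitrary direction to separate them.
        direction, dist = (1.0, 0.0), 1e-6
    else:
        direction = ((xi[0] - xj[0]) / dist, (xi[1] - xj[1]) / dist)
    error = rtt_ms - dist  # positive: too close, push apart
    return (xi[0] + delta * error * direction[0],
            xi[1] + delta * error * direction[1])
```

Repeated updates from RTT samples make the inter-coordinate distances converge toward the measured latencies, so any node can then estimate its latency to any other node from the coordinates alone.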
We targeted the challenge of proximity-aware resource management for replicated latency-sensitive service-oriented applications, in order to control the tail user-perceived latency and to account for the workload's non-stationarity in both time and space. This was done over the three levels of resource management: routing, placing, and autoscaling.
In the first contribution, we proposed Proxy-mity, a proximity-aware request routing plugin for Kubernetes. Proxy-mity identifies nearby replicas using Vivaldi coordinates and routes requests to them. It exposes a single variable, α, which allows system administrators to control the trade-off between proximity and load imbalance between replicas. The evaluations show the effectiveness of this system in lowering the average user-to-replica latency compared to the traditional load balancing mechanisms used by major cloud orchestration engines.
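One way to picture such an α-controlled trade-off is as a blend of two weight distributions: one favoring the closest replicas, one spreading load uniformly. The sketch below is a hypothetical illustration of this idea, not the actual Proxy-mity weight function:

```python
import random

def routing_weights(latencies_ms, alpha):
    """Blend proximity-driven and uniform routing weights.

    alpha = 0 -> purely proximity-driven (low-latency replicas favored)
    alpha = 1 -> purely uniform (equal load across replicas)
    """
    inv = [1.0 / l for l in latencies_ms]
    total = sum(inv)
    proximity = [w / total for w in inv]
    uniform = [1.0 / len(latencies_ms)] * len(latencies_ms)
    return [(1 - alpha) * p + alpha * u
            for p, u in zip(proximity, uniform)]

def pick_replica(replicas, latencies_ms, alpha, rng=random):
    """Pick one replica at random according to the blended weights."""
    weights = routing_weights(latencies_ms, alpha)
    return rng.choices(replicas, weights=weights, k=1)[0]
```

With α near 0 a node sends most of its traffic to its closest replica; raising α evens out the load at the cost of some extra network latency.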
In the second contribution, we presented Hona, a set of algorithms for replica placement and re-placement. Hona uses Proxy-mity for routing requests and Vivaldi coordinates for estimating the inter-node latencies. It then implements periodic checks to detect the volumes and locations of the sources of traffic. Hona uses heuristics to find a replica placement that reduces the tail latency while preserving a good load balance between the replicas. Hona dynamically identifies changes in the workload characteristics and, if a QoS violation is detected, updates the placement to maintain performance. The evaluations show that the Hona heuristics are capable of finding a placement that respects the defined latency bound, and that the re-placement algorithm can cope with a wide variety of changes in the workload.
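To give a flavor of what a tail-latency-driven placement heuristic can look like, here is a simple greedy sketch: replicas are added one at a time, each time picking the node that most reduces the worst source-to-nearest-replica latency. This is an illustrative assumption about the general approach, not the actual Hona heuristics:

```python
def greedy_tail_placement(latency, sources, k):
    """Greedily choose k replica nodes to minimize the worst
    source-to-nearest-replica latency (a proxy for tail latency).

    latency: dict of dicts, latency[source][node] in ms
    sources: iterable of traffic-source identifiers
    k:       number of replicas to place
    """
    nodes = set().union(*(latency[s].keys() for s in sources))
    placed = []
    for _ in range(k):
        best_node, best_tail = None, float("inf")
        for n in nodes - set(placed):
            candidate = placed + [n]
            # Each source is served by its closest replica in the candidate set.
            tail = max(min(latency[s][r] for r in candidate)
                       for s in sources)
            if tail < best_tail:
                best_node, best_tail = n, tail
        placed.append(best_node)
    return placed
```

A re-placement algorithm can rerun such a heuristic when the periodic checks detect that traffic sources have moved or grown, swapping replicas only where the placement no longer meets the latency bound.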
In the third contribution, we designed Voilà, a tail-latency-aware autoscaler. Voilà relies on Vivaldi coordinates, Proxy-mity, and Hona's periodic checks. Voilà's algorithms dynamically control the number and placement of the replicas to reduce the tail latency, potential replica saturations, and the placement cost. The evaluations show that Voilà guarantees that 98% of the requests are routed toward a nearby and non-overloaded replica. The system also scales well to much larger system sizes.
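The core of such an autoscaler can be pictured as a periodic decision rule that weighs the SLO violation budget (here 2%, matching a 98% target) against replica utilization. The thresholds and function below are illustrative assumptions, not Voilà's actual algorithms:

```python
def autoscale_decision(replica_utilizations, slow_fraction,
                       slo_budget=0.02, high_util=0.8, low_util=0.3):
    """Decide whether to grow or shrink the replica set.

    replica_utilizations: per-replica load as a fraction of capacity
    slow_fraction: fraction of recent requests served outside the
                   latency bound or by an overloaded replica
    slo_budget:    tolerated slow fraction (2% for a 98% target)
    """
    if slow_fraction > slo_budget or max(replica_utilizations) > high_util:
        return "scale-up"    # SLO at risk or a replica near saturation
    if len(replica_utilizations) > 1 and max(replica_utilizations) < low_util:
        return "scale-down"  # overprovisioned: reduce placement cost
    return "keep"
```

In a complete system, a "scale-up" decision would also pick where the new replica goes (reusing a placement heuristic), while "scale-down" would remove the replica whose loss least degrades proximity.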
All of the presented contributions were implemented on top of Kubernetes and tested on a real testbed of Raspberry Pi nodes. Together, Proxy-mity, Hona's periodic checks, and Voilà constitute a complete proximity-aware solution for a mature cloud orchestration engine.
- Etienne Rivière, Professor, Université catholique de Louvain - Reviewer
- Erik Elmroth, Professor, Umeå University - Examiner
- David Bromberg, Professor, Université de Rennes 1 - Examiner
- Shadi Ibrahim, Research Scientist, Inria - Examiner
- Guillaume Pierre, Professor, Université de Rennes 1 - Thesis supervisor