ACTURUS: Designing domain specific language to build large scale distributed systems

Submitted by Yerom David BROMBERG on
Team
Date of the beginning of the PhD (if already known)
des que possible
Place
Inria Rennes
Laboratory
IRISA - UMR 6074
Description of the subject

Context.

Modern distributed applications are becoming increasing large and complex. They often bring together independently developed sub- systems~(e.g. for storage, batch processing, streaming, application logic, logging, caching) into large, geo-distributed and heterogeneous architectures. Combining, configuring, and deploying these architectures is a difficult and multifaceted task: individual services have their own requirements, configuration spaces, programming models, distribution logic, which must be carefully tuned to insure the overall performance, resilience, and evolvability of the resulting system. This integration effort remains today largely an ad-hoc activity, that is either manual or uses tool-specific scripting capabilities. This low-level approach unfortunately scales poorly in the face of the increasingly complex deployment requirements and topologies of the involved services. For instance, Netflix leverages on Infrastructures as a Service (IaaS) clouds, such as Amazon Elastic Compute Cloud (EC2), to setup a complex software architecture to provision and instantiate on the fly the adequate numbers of Virtual Machines (VMs), dedicated to the transcoding of video streams depending on the incoming load. In particular, nowadays, companies such as Netflix, Twitter, Google setup micro-service architectures deployed across geo-distributed datacenters to schedule bench of interleaved tasks resulting in increasingly complex distributed applications, that can be hard to describe, monitor, and adapt [14-23].

Objectives: Easing the development of complex distributed systems has been a long-running and recurrent objective of middleware research. Most of these efforts have however focused on the local behavior of individual nodes (e.g.~with protocol kernels, or component frameworks), rather than on the programmatic means to describe a system's global structure and behavior. As a result, most of these programming frameworks offer little or no support for the flexible integration of individual systems into a larger whole. As a result, most existing approaches do not take into account that the resulting system may, in turn, serves as the constituent element of another larger system.

Consequently, our main objective in this thesis is to provide an assembly-based programming framework for the implementation of complex distributed application to be deployed in the cloud. Specifically, our aim is to provide a domain specific language (DSL) that exploits self-organizing overlays to map at runtime a developer's high-level description of a complex distributed application onto a concrete infrastructure. Our approach relies on the scalability, resilience, and adaptability of self-organizing overlays to maintain a developer's target distributed application in the face of failures, scaling and dynamic adaptations. Further, our approach goes beyond traditional framework for distributed systems in that itconsiders individual systems as collective distributed entities} enforcing a given internal structure (a star, a tree, a ring) which developers can assemble programmatically to realize more complex topologies. It also goes beyond existing self-organizing overlays by supporting the description of a target topology as a composition of more elementary shapes, breaking away from the monolithic design of typical self-organizing overlay protocols.

Bibliography

[14] Simon Bouget, Yérom-David Bromberg, Adrien Luxey, François Taïani: Pleiades: Distributed Structural Invariants at Scale. DSN 2018: 542-553

[15] B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: A Platform for Fine- grained Resource Sharing in the Data Center. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, NSDI’11. USENIX Association, 2011.

[16] Q. Huang, K. Birman, R. van Renesse, W. Lloyd, S. Kumar, and H. C. Li. An analysis of facebook photo caching. In SOSP, 2013.

[17] M. Jelasity, A. Montresor, and O. Babaoglu. T-Man: Gossip-based fast overlay topology construction. Computer Networks, 53(13), Aug. 2009.

[18] M. Jelasity, S. Voulgaris, R. Guerraoui, A.-M. Kermarrec, and M. Van Steen. Gossip-based peer sampling. ACM TOCS, 25(3):8, 2007.

[19] R. Kapitza, J. Domaschka, F. J. Hauck, H. P. Reiser, and H. Schmidt. Formi: Integrating adaptive fragmented objects into java rmi. IEEE Distributed Systems Online, 7(10), 2006.

[20] A.-M. Kermarrec, L. Massoulie, and A. Ganesh. Probabilistic reliable dissemination in large-scale systems. IEEE TPDS, 14(3), 2003.

[21] S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong. Tinydb: an acquisitional query processing system for sensor networks. ACM Trans. Database Syst., 30(1):122–173, 2005.

[22] M. Makpangou, Y. Gourhant, J.-P. Le Narzul, and M. Shapiro. Frag- mented objects for distributed abstractions. In Readings in Distributed Computing Systems. July 1994.

[23] M. Mamei and F. Zambonelli. Programming pervasive and mobile computing applications: the tota approach. ACM TSEM, 2009.

Researchers

Lastname, Firstname
BROMBERG DAVID
Type of supervision
Director
Laboratory
IRISA

Lastname, Firstname
COMBEMALE BENOIT
Type of supervision
Co-director (optional)
Laboratory
IRISA
Contact·s
Keywords
Distribued systems, Micro services, cloud, Netflix, virtual machines