Clock synchronization is essential for applications running in a datacenter. It is critical for large-scale distributed systems (e.g., databases such as RocksDB and Spanner), where each server's epoch determines the validity of a transaction. It is equally critical for fault-tolerance protocols such as state machine replication (SMR), which rely on heartbeats to determine the health of each replica. However, servers in a datacenter are exposed to numerous sources of clock inconsistency.
The principal source of inconsistency is the network devices (mainly switches) that interconnect the servers. A switch performs several tasks and its load varies continuously. As a result, packet processing can be delayed by a non-deterministic amount, which in turn delays delivery of the packets needed for clock synchronization. Another source of inconsistency is the kernel network stack, which adds significant delay before a user-space application can process the synchronization packets. Despite existing techniques such as NTP (Network Time Protocol) and PTP (Precision Time Protocol), it remains hard to achieve a deterministic protocol that provides microsecond-level, fault-tolerant clock synchronization.
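To make the effect of variable network delay concrete, the sketch below applies the standard four-timestamp offset estimate used by NTP-style exchanges to a simulated request/response. The simulation helper, its parameter names, and the delay values are hypothetical and chosen only for illustration; they are not measurements. Under these assumptions, the estimate is exact when the forward and reverse delays are equal, and is off by half the difference between them when they are not.

```python
def estimate_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP-style estimate from one request/response exchange.

    t1: client send time (client clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    round_trip_delay = (t4 - t1) - (t3 - t2)
    return offset, round_trip_delay


def simulate_exchange(true_offset_us, forward_delay_us, reverse_delay_us,
                      server_processing_us=0.0, t1=0.0):
    """Hypothetical helper: produce the four timestamps of one exchange.

    Assumes server_clock = client_clock + true_offset_us and no drift
    during the exchange. All values are in microseconds.
    """
    t2 = t1 + forward_delay_us + true_offset_us   # server clock at arrival
    t3 = t2 + server_processing_us                # server clock at reply
    t4 = (t3 - true_offset_us) + reverse_delay_us  # client clock at arrival
    return t1, t2, t3, t4


if __name__ == "__main__":
    true_offset = 100.0  # illustrative value, in microseconds

    # Symmetric path: 10 us each way.
    sym = estimate_offset_and_delay(*simulate_exchange(true_offset, 10.0, 10.0, 5.0))

    # Asymmetric path: the request is queued in a loaded switch for an extra 50 us.
    asym = estimate_offset_and_delay(*simulate_exchange(true_offset, 60.0, 10.0, 5.0))

    print("symmetric  estimate (offset, delay):", sym)
    print("asymmetric estimate (offset, delay):", asym)
    print("asymmetric error in us:", asym[0] - true_offset)
```

In this toy setting, the 50 microseconds of one-way queuing shifts the estimated offset by 25 microseconds, which illustrates why load-dependent delay on the network path works against microsecond-level synchronization.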
The main aim of the PhD is to propose a low-cost, microsecond-level, deterministic, and fault-tolerant clock synchronization protocol for servers in a datacenter. Our key insight is to carry synchronization traffic over a path other than the data network, so that it is unaffected by the load on network devices. Our starting point is therefore a survey of the state of the art in usable alternative paths, such as Bluetooth. The output of this survey will be a characterization of each alternative path, including the additional hardware required, the cost, and the operating system support. Based on the survey results, we will design a synchronization protocol that relies on the chosen alternative path. We will then implement a prototype of the protocol, with an emphasis on efficient kernel drivers that exploit the new hardware and the alternative path. We intend to evaluate the resulting prototypes on simulated and real datacenter testbeds.
Marcos K. Aguilera, Naama Ben-David, Rachid Guerraoui, Virendra J. Marathe, Athanasios Xygkis, and Igor Zablotchi. Microsecond Consensus for Microsecond Applications. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 599-616.
Yuliang Li, Gautam Kumar, Hema Hariharan, Hassan Wassel, Peter Hochschild, Dave Platt, Simon Sabato, Minlan Yu, Nandita Dukkipati, Prashant Chandra, and Amin Vahdat. Sundial: Fault-tolerant Clock Synchronization for Datacenters. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20). 1171-1186.
Yilong Geng, Shiyu Liu, Zi Yin, Ashish Naik, Balaji Prabhakar, Mendel Rosenblum, and Amin Vahdat. Exploiting a Natural Network Effect for Scalable, Fine-grained Clock Synchronization. In 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). 81-94.
Ki Suh Lee, Han Wang, Vishal Shrivastav, and Hakim Weatherspoon. Globally Synchronized Time via Datacenter Networks. In Proceedings of the 2016 ACM SIGCOMM Conference (SIGCOMM '16). 454-467.