Efficient Big Data Processing on Large-Scale Shared Platforms: Managing I/Os and Failures

You are kindly invited to the PhD thesis defense of Orçun YILDIZ (KerData team), which will take place on Friday, December 8th at 9:30 in the "Salle Métivier" at Inria, and to a drink in the Sein meeting room at 5 pm.


As of 2017, we live in a data-driven world where data-intensive applications are bringing fundamental improvements to our lives in many areas, such as business, science, health care, and security. This has boosted the growth of data volumes (i.e., the deluge of Big Data). To extract useful information from this huge amount of data, data processing frameworks such as MapReduce, Hadoop, and Spark have emerged. Traditionally, these frameworks run on large-scale platforms (i.e., HPC systems and clouds) to leverage their computation and storage power. These platforms are usually shared concurrently by multiple users and applications to improve resource utilization. Although sharing brings benefits, it also raises several challenges, among which I/O and failure management are the major ones that can impact efficient data processing.

To this end, we first focus on I/O-related performance bottlenecks for Big Data applications on HPC systems. We start by characterizing the performance of Big Data applications on these systems and identify I/O interference and latency as the major performance bottlenecks. Next, we zoom in on the I/O interference problem to further understand its root causes. Then, we propose an I/O management scheme to mitigate the high latencies that Big Data applications may encounter on HPC systems. Moreover, we introduce interference models for Big Data and HPC applications, based on the findings of our experimental study of the root causes of I/O interference. Finally, we leverage these models to minimize the impact of interference on the performance of Big Data and HPC applications.

Second, we focus on the impact of failures on the performance of Big Data applications by studying failure handling in shared MapReduce clusters. We introduce a failure-aware scheduler that enables fast failure recovery while optimizing data locality, thus improving application performance.

Orçun YILDIZ (KerData)
Friday, December 8, 2017 - 9:30
Salle Métivier
Composition of jury: 
  • Sébastien Monnet, Professor, University Savoie Mont Blanc, France
  • Olivier Beaumont, Senior Researcher, Inria Bordeaux, France
  • François Taïani, Professor, University Rennes 1, France
  • Luciana Arantes, Associate Professor, University Pierre et Marie Curie, France
  • Shadi Ibrahim, Researcher, Inria Rennes, France
  • Gabriel Antoniu, Senior Researcher, Inria Rennes, France