- Submission Deadline:
June 9, 2014 (extended from May 30, 2014)
- Notification of Acceptance:
July 4, 2014
- Early Registration:
July 25, 2014
- Workshop Date:
August 25, 2014
- Camera Ready Manuscript:
October 3, 2014
- 23/7/2014: Workshop program online
- 28/5/2014: Following author requests, the submission deadline was extended
- 10/4/2014: Program Committee finalized
- 20/3/2014: EasyChair submission page online
3rd Workshop on Big Data Management in Clouds
The third edition of the Workshop on Big Data Management in Clouds will be held in Porto, Portugal. BigDataCloud 2014 follows the successful previous editions held in conjunction with Euro-Par. Its goal is to bring together the data management and Clouds / Grids / P2P communities in order to complement the Big Data handling issues with a comprehensive system / infrastructure perspective.
Workshop Program
The workshop will take place in Room 3
14h00 - 16h00 Session 1, Chair: Frédéric Desprez (Inria / LIP ENS Lyon)
- 14:00 Frédéric Desprez, Alexandru Costan. Opening
- 14:15 Invited Speaker: Toni Cortes. Data Sharing in the Big-data era
- 15:00 Reginald Cushing, Adam Belloum, Marian Bubak, Cees De Laat (University of Amsterdam). Automata-based Dynamic Data Processing for Clouds
- 15:30 Ji Liu, Esther Pacitti, Patrick Valduriez (Inria / LIRMM Montpellier), Vítor Silva Sousa, Marta Mattoso (COPPE/UFRJ Rio de Janeiro). Scientific Workflow Partitioning in Multi-site Clouds
16h00 - 16h30 Coffee Break
16h30 - 17h45 Session 2, Chair: Alexandru Costan (Inria / INSA Rennes)
- 16:30 Sylvain Gault, Christian Pérez (Inria / ENS Lyon). Dynamic Scheduling of MapReduce Shuffle Under Bandwidth Constraints
- 17:00 Emanuele Carlini, Patrizio Dazzi (ISTI-CNR), Andrea Esposito, Alessandro Lulli, Laura Ricci (University of Pisa). Balanced Graph Partitioning with Apache Spark
- 17:30 Frédéric Desprez, Alexandru Costan. Concluding Remarks
Invited Talk
Dr. Toni Cortes, Barcelona Supercomputing Center.
Data Sharing in the Big-data era
Toni Cortes is the manager of the storage-system group at the BSC (since 2006) and is also an associate professor at Universitat Politècnica de Catalunya (since 1998). Since 1992, Toni has been teaching operating systems and computer architecture courses at the Barcelona School of Informatics (UPC), and from 2000 to 2004 he also served as vice-dean for international affairs at the same school.
His research concentrates on storage systems, programming models for scalable distributed systems, and operating systems. He is also editor of the Cluster Computing Journal and the coordinator of the SSI task in the IEEE TCSS. He has served on many international conference program committees and/or organizing committees and was general chair for the Cluster 2006 conference, LaSCo 2008, XtreemOS summit 2009, and SNAPI 2010. Since 2011, he has also been the chair of the steering committee for the Cluster conference series. His involvement in IEEE CS was recognized with a "Certificate of appreciation" in 2007.
In this talk, Toni will focus on the value of big data, which comes from the possibility of extracting information from large amounts of raw data. As in real life, the most valuable information comes from merging shared information from different sources. Unfortunately, current sharing mechanisms are either too restrictive, and thus not flexible enough, or the data provider loses control over its asset (its data). This limitation prevents data owners and potential service designers from taking advantage of the available data. The talk will introduce the idea of self-contained objects and show how 3rd-party enrichment of such objects can offer an environment where data providers keep full control over their data while service designers get the maximum flexibility.
Workshop Description
As data volumes increase at exponential speed in more and more application fields of science, the challenges posed by handling Big Data gain increasing importance. Large scientific experiments, such as climate modelling, genome mapping, and high-energy physics simulations, generate data volumes reaching petabytes per year, further used for real-time or offline processing. Initially designed for powerful and expensive supercomputers, such applications have seen increasing adoption on clouds, exploiting their elasticity and economic model.
However, running such applications efficiently on clouds is challenging. One open challenge is how to handle this “data deluge”. Sharing, disseminating and analyzing large data sets has become a critical issue despite the deployment of petascale computing systems and optical networking speeds reaching up to 100 Gbps. While MapReduce covers a large fraction of the development space, there are still many applications that are better served by other models and systems. In such a context, we need to embrace new programming models, scheduling schemes and hybrid infrastructures, and to scale out of single datacenters to geographically distributed deployments in order to cope with these new challenges effectively.
The BigDataCloud workshop provides a platform for the dissemination of recent research efforts that explicitly aim at addressing these challenges. It supports the presentation of advanced solutions for the efficient management of Big Data in the context of Cloud computing, as well as new development and deployment efforts in running data-intensive computing workloads. In particular, we are interested in how the use of Cloud-based technologies can meet the data-intensive scientific challenges of HPC applications that are not well served by the current supercomputers or grids and are being ported to Cloud platforms. The goal of the workshop is to support the assessment of the current state, introduce future directions, and present architectures and services for future Clouds supporting data-intensive computing.
Call for Papers
Workshop Topics
The BigDataCloud workshop calls for contributions that address fundamental research and system issues in Cloud data management, including but not limited to the following:
- Cloud storage architectures for Big Data
- Reliability of data intensive applications and services running on the Cloud
- Query processing and indexing in Cloud computing systems
- Data privacy and security in Clouds
- Data-intensive computing on hybrid infrastructures (Grids/Clouds/P2P)
- Cloud storage resource management
- Data-intensive Cloud-based applications
- Content delivery networks using storage Clouds
- Data intensive scalable computing on Clouds
- Data management within and across multiple geographically distributed data centers
- Data handling in MapReduce based computations
- Data management in HPC Clouds
- Advanced programming models for IaaS, PaaS and SaaS
- Elasticity for Cloud data management systems
- Self-* and adaptive mechanisms
- Many-Task Computing in the Cloud
- Performance evaluation of Cloud environments and technologies
- Event streaming and real-time processing on Clouds
- Energy efficiency for Big Data in Clouds
Organizing Committee
Alexandru Costan, Inria Rennes - Bretagne Atlantique, France
Frédéric Desprez, Inria / ENS Lyon, France
Gabriel Antoniu, Inria, France
Luc Bougé, ENS Rennes, France
Toni Cortes, Barcelona Supercomputing Center, Spain
Kate Keahey, University of Chicago / Argonne National Laboratory, USA
Dries Kimpe, Argonne National Laboratory, USA
Olivier Nano, Microsoft Research ATLE, Germany
Bogdan Nicolae, IBM Research, Ireland
Maria S. Pérez, Universidad Politécnica de Madrid, Spain
Leonardo Querzoni, University of Rome La Sapienza, Italy
Domenico Talia, University of Calabria, Italy
Osamu Tatebe, University of Tsukuba, Japan
Cristian Zamfir, EPFL, Switzerland
Submission Guidelines
Authors are invited to submit research and application papers not exceeding 12 pages, following the Springer LNCS format. The LNCS LaTeX style can be downloaded from the Springer website.
We solicit the submission of academic workshop papers representing original, previously unpublished work. Submitted papers will be carefully evaluated based on originality, significance, technical soundness and clarity of exposition. Papers should be prepared as PDF files and submitted electronically to the BigDataCloud 2014 online submission system. Submission of a paper implies that, should the paper be accepted, at least one of the authors will register and present the paper at the workshop.
Accepted papers that are presented at the workshop will be published in revised form in a special Euro-Par Workshop Volume in the Lecture Notes in Computer Science (LNCS) series after the Euro-Par conference.