Important Dates

Submission Deadline:
June 9, 2014 (extended from May 30, 2014)
Notification of Acceptance:
July 4, 2014
Early Registration:
July 25, 2014
Workshop Date:
August 25, 2014
Camera Ready Manuscript:
October 3, 2014

News

23/7/2014: Workshop program online
28/5/2014: Following author requests, the submission deadline was extended
10/4/2014: Program Committee finalized
20/3/2014: EasyChair submission page online

3rd Workshop on Big Data Management in Clouds

in conjunction with Euro-Par 2014

Welcome

The third edition of the Workshop on Big Data Management in Clouds will be held in Porto, Portugal. BigDataCloud 2014 follows the successful previous editions held in conjunction with Euro-Par. Its goal is to bring together the data management and Clouds / Grids / P2P communities in order to complement work on Big Data handling with a comprehensive system / infrastructure perspective.

Workshop Program

The workshop will take place in Room 3.

14h00 - 16h00 Session 1, Chair: Frédéric Desprez (Inria / LIP ENS Lyon)

14:00 Frédéric Desprez, Alexandru Costan. Opening
14:15 Invited Speaker: Toni Cortes. Data Sharing in the Big-data era
15:00 Reginald Cushing, Adam Belloum, Marian Bubak, Cees De Laat (University of Amsterdam). Automata-based Dynamic Data Processing for Clouds
15:30 Ji Liu, Esther Pacitti, Patrick Valduriez (Inria / LIRMM Montpellier), Vítor Silva Sousa, Marta Mattoso (COPPE/UFRJ Rio de Janeiro). Scientific Workflow Partitioning in Multi-site Clouds

16h00 - 16h30 Coffee Break

16h30 - 17h45 Session 2, Chair: Alexandru Costan (Inria / INSA Rennes)

16:30 Sylvain Gault, Christian Pérez (Inria / ENS Lyon). Dynamic Scheduling of MapReduce Shuffle Under Bandwidth Constraints
17:00 Emanuele Carlini, Patrizio Dazzi (ISTI-CNR), Andrea Esposito, Alessandro Lulli, Laura Ricci (University of Pisa). Balanced Graph Partitioning with Apache Spark
17:30 Frédéric Desprez, Alexandru Costan. Concluding Remarks

Invited Talk

Dr. Toni Cortes, Barcelona Supercomputing Center.

Data Sharing in the Big-data era

Toni Cortes is the manager of the storage-system group at the BSC (since 2006) and is also an associate professor at Universitat Politècnica de Catalunya (since 1998). Since 1992, Toni has been teaching operating system and computer architecture courses at the Barcelona school of informatics (UPC), and from 2000 to 2004 he also served as vice-dean for international affairs at the same school. His research concentrates on storage systems, programming models for scalable distributed systems and operating systems. He is also editor of the Cluster Computing Journal and the coordinator of the SSI task in the IEEE TCSS. He has served on many international conference program committees and/or organizing committees and was general chair for the Cluster 2006 conference, LaSCo 2008, XtreemOS summit 2009, and SNAPI 2010. Since 2011, he has also been the chair of the steering committee for the Cluster conference series. His involvement in the IEEE CS was recognized with a "Certificate of appreciation" in 2007.

In this talk, Toni will focus on the value of big data, which comes from the possibility of extracting information from large amounts of raw data. As in real life, the most valuable information comes from merging shared information from different sources. Unfortunately, current sharing mechanisms are either too restrictive and thus not flexible enough, or the data provider loses control over its asset (its data). This limitation prevents data owners and potential service designers from taking advantage of the available data. The talk will introduce the idea of self-contained objects and show how 3rd-party enrichment of such objects can offer an environment where data providers keep full control over their data while service designers get the maximum flexibility.

Workshop Description

As data volumes increase exponentially in more and more application fields of science, the challenges posed by handling Big Data gain increasing importance. Large scientific experiments, such as climate modelling, genome mapping, and high-energy physics simulations, generate data volumes reaching petabytes per year, which are then used for real-time or offline processing. Initially designed for powerful and expensive supercomputers, such applications have seen increasing adoption on clouds, exploiting their elasticity and economic model.

However, running such applications efficiently on clouds is challenging. One such open challenge is how to handle this “data deluge”. Sharing, disseminating and analyzing large data sets has become a critical issue despite the deployment of petascale computing systems and optical networking speeds reaching up to 100 Gbps. While MapReduce covers a large fraction of the development space, there are still many applications that are better served by other models and systems. In such a context, we need to embrace new programming models, scheduling schemes and hybrid infrastructures, and to scale out of single datacenters to geographically distributed deployments, in order to cope with these new challenges effectively.

The BigDataCloud workshop provides a platform for the dissemination of recent research efforts that explicitly aim at addressing these challenges. It supports the presentation of advanced solutions for the efficient management of Big Data in the context of Cloud computing, as well as new development and deployment efforts in running data-intensive computing workloads. In particular, we are interested in how Cloud-based technologies can meet the data-intensive scientific challenges of HPC applications that are not well served by current supercomputers or grids and are being ported to Cloud platforms. The goal of the workshop is to support the assessment of the current state, introduce future directions, and present architectures and services for future Clouds supporting data-intensive computing.

Call for Papers

Formats: PDF

Workshop Topics

The BigDataCloud workshop calls for contributions that address fundamental research and system issues in Cloud data management, including but not limited to the following:

Cloud storage architectures for Big Data
Reliability of data intensive applications and services running on the Cloud
Query processing and indexing in Cloud computing systems
Data privacy and security in Clouds
Data-intensive computing on hybrid infrastructures (Grids/Clouds/P2P)
Cloud storage resource management
Data-intensive Cloud-based applications
Content delivery networks using storage Clouds
Data intensive scalable computing on Clouds
Data management within and across multiple geographically distributed data centers
Data handling in MapReduce based computations
Data management in HPC Clouds
Advanced programming models for IaaS, PaaS and SaaS
Elasticity for Cloud data management systems
Self-* and adaptive mechanisms
Many-Task Computing in the Cloud
Performance evaluation of Cloud environments and technologies
Event streaming and real-time processing on Clouds
Energy efficiency for Big Data in Clouds

Organizing Committee

Workshop Co-Chairs

Alexandru Costan, Inria Rennes - Bretagne Atlantique, France
Frédéric Desprez, Inria / ENS Lyon, France

Program Committee

Gabriel Antoniu, Inria, France
Luc Bougé, ENS Rennes, France
Toni Cortes, Barcelona Supercomputing Center, Spain
Kate Keahey, University of Chicago / Argonne National Laboratory, USA
Dries Kimpe, Argonne National Laboratory, USA
Olivier Nano, Microsoft Research ATLE, Germany
Bogdan Nicolae, IBM Research, Ireland
Maria S. Pérez, Universidad Politécnica de Madrid, Spain
Leonardo Querzoni, University of Rome La Sapienza, Italy
Domenico Talia, University of Calabria, Italy
Osamu Tatebe, University of Tsukuba, Japan
Cristian Zamfir, EPFL, Switzerland

Submission Guidelines

Authors are invited to submit research and application papers not exceeding 12 pages, following the Springer LNCS format. You can download the LNCS LaTeX style here

We solicit the submission of academic workshop papers representing original, previously unpublished work. Submitted papers will be carefully evaluated based on originality, significance, technical soundness and clarity of exposition. Papers should be prepared as PDF files and submitted electronically to the BigDataCloud 2014 online submission system. Submission of a paper implies that, should the paper be accepted, at least one of the authors will register and present the paper at the workshop.

Submission: EuroPar 2014 WorkShops - EasyChair (BigDataCloud Track)

Accepted papers that are presented at the workshop will be published in revised form in a special Euro-Par Workshop Volume in the Lecture Notes in Computer Science (LNCS) series after the Euro-Par conference.

Previous Workshops

BigDataCloud 2013
BigDataCloud 2012
CGWS 2012
CGWS 2011
CGWS 2010
CGWS 2009