Overview

The Fifth International Workshop on Data Intensive Distributed Computing (DIDC 2012) will be held on June 19, 2012, in conjunction with the 21st International Symposium on High Performance Distributed Computing (HPDC 2012), in Delft, the Netherlands.

** DIDC 2012 workshop program is now available online **

The data needs of scientific and commercial applications across a diverse range of fields have been increasing exponentially in recent years. This growing demand for large-scale data processing has necessitated collaboration and sharing of data collections among the world's leading education, research, and industrial institutions, as well as the use of distributed resources owned by collaborating parties. In a widely distributed environment, data is often not locally accessible and must therefore be retrieved and stored remotely. While traditional distributed systems work well for computation that requires limited data handling, they may fail in unexpected ways when the computation accesses, creates, and moves large amounts of data, especially over wide-area networks. Further, the data accessed and created is often poorly described, lacking both metadata and provenance. Scientists, researchers, and application developers are often forced to solve basic data-handling issues, such as physically locating data, accessing it, and moving it to visualization or compute resources for further analysis.

This workshop will focus on the challenges that data-intensive applications impose on distributed systems, and on the state-of-the-art solutions proposed to overcome them. It will bring together the collaborative and distributed computing community and the data management community in an effort to generate productive conversations on the planning, management, and scheduling of data-handling tasks and data storage resources.

Topics of interest include, but are not limited to:

Data-intensive applications and their challenges
Data clouds, data grids, and data centers
New architectures for data-intensive computing
Data virtualization, interoperability, and federation
Data-aware toolkits and middleware
Dynamic data-driven science
Data collection, provenance, and metadata
Network support for data-intensive computing
Remote and distributed visualization of large-scale data
Data archives, digital libraries, and preservation
Service-oriented architectures for data-intensive computing
Data privacy and protection in a collaborative environment
Peer-to-peer data movement and data streaming
Scientific breakthroughs enabled by DIDC
Future research challenges in data-intensive computing