CSE 726 – Seminar on "Data Intensive Distributed Computing"

Spring 2011 – Papers to be discussed

 

Parallel Cluster File Systems:

 

1.     GPFS: A Shared-Disk File System for Large Computing Clusters, F. Schmuck et al., FAST 2002.

2.     PVFS: A Parallel File System for Linux Clusters, P. Carns et al., Linux Conf. 2000.

3.     Black-Box Problem Diagnosis in Parallel File Systems, M. Kasick et al., FAST 2010.

4.     Panache: A Parallel File System Cache for Global File Access, M. Eshel et al., FAST 2010.

 

Wide Area Distributed File Systems:

 

5.     Availability in Globally Distributed Storage Systems, D. Ford et al., OSDI 2010.

6.     Safety, Visibility, and Performance in a Wide-Area File System, M. Kim et al., FAST 2002.

7.     Integrating Portable and Distributed Storage, N. Tolia et al., FAST 2004.

8.     The Google File System, S. Ghemawat et al., SOSP 2003.

 

Wide Area Data Placement & Optimization:

 

9.     Adaptive Data Placement for Wide-Area Sensing Services, S. Nath et al., FAST 2005.

10.  Volley: Automated Data Placement for Geo-Distributed Cloud Services, S. Agarwal et al., NSDI 2010.

11.  The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM, J. Ousterhout et al., SIGOPS OS Review 2009.

 

Cloud & Cluster Scheduling:

 

12.  Quincy: Fair Scheduling for Distributed Computing Clusters, M. Isard et al., SOSP 2009.

13.  Hedera: Dynamic Flow Scheduling for Data Center Networks, M. Al-Fares et al., NSDI 2010.

14.  Distributed Aggregation for Data-Parallel Computing: Interfaces and Implementations, Y. Yu et al., SOSP 2009.

 

MapReduce Improvements:

 

15.  MapReduce Online, T. Condie et al., NSDI 2010.

16.  Improving MapReduce Performance in Heterogeneous Environments, M. Zaharia et al., OSDI 2008.

 

Scalable Data Management:

 

17.  quFiles: The Right File at the Right Time, K. Veeraraghavan et al., FAST 2010.

18.  Nectar: Automatic Management of Data and Computation in Datacenters, P. Gunda et al., OSDI 2010.

19.  Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems, A. Leung et al., FAST 2009.

20.  Adaptive File Transfers for Diverse Environments, H. Pucha et al., USENIX 2008.

 

Remote Data Access:

 

21.  Structure and Performance of the Direct Access File System, K. Magoutis et al., USENIX 2002.

22.  RFS: Efficient and Flexible Remote File Access for MPI-IO, J. Lee et al., CLUSTER 2004.

23.  A Toolkit for User-Level File Systems, D. Mazieres, USENIX 2001.

 

Global Scale Distributed Testbed Design:

 

24.  Experiences Building PlanetLab, L. Peterson et al., OSDI 2006.

25.  Large-scale Virtualization in the Emulab Network Testbed, M. Hibler et al., USENIX 2008.

26.  iPlane: An Information Plane for Distributed Services, H. Madhyastha et al., OSDI 2006.