CSE 726 – Seminar on "Data Intensive Distributed Computing"
Spring 2011 – Papers to be discussed
Parallel Cluster File Systems:
1. GPFS: A Shared-Disk File System for Large Computing Clusters, F. Schmuck et al., FAST 2002.
2. PVFS: A Parallel File System for Linux Clusters, P. Carns et al., Linux Conf. 2000.
3. Black-Box Problem Diagnosis in Parallel File Systems, M. Kasick et al., FAST 2010.
4. Panache: A Parallel File System Cache for Global File Access, M. Eshel et al., FAST 2010.
Wide Area Distributed File Systems:
5. Availability in Globally Distributed Storage Systems, D. Ford et al., OSDI 2010.
6. Safety, Visibility, and Performance in a Wide-Area File System, M. Kim et al., FAST 2002.
7. Integrating Portable and Distributed Storage, N. Tolia et al., FAST 2004.
8. The Google File System, S. Ghemawat et al., SOSP 2003.
Wide Area Data Placement & Optimization:
9. Adaptive Data Placement for Wide-Area Sensing Services, S. Nath et al., FAST 2005.
10. Volley: Automated Data Placement for Geo-Distributed Cloud Services, S. Agarwal et al., NSDI 2010.
11. The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM, J. Ousterhout et al., SIGOPS OS Review 2009.
Cloud & Cluster Scheduling:
12. Quincy: Fair Scheduling for Distributed Computing Clusters, M. Isard et al., SOSP 2009.
13. Hedera: Dynamic Flow Scheduling for Data Center Networks, M. Al-Fares et al., NSDI 2010.
14. Distributed Aggregation for Data-Parallel Computing: Interfaces and Implementations, Y. Yu et al., SOSP 2009.
MapReduce Improvements:
15. MapReduce Online, T. Condie et al., NSDI 2010.
16. Improving MapReduce Performance in Heterogeneous Environments, M. Zaharia et al., OSDI 2008.
Scalable Data Management:
17. quFiles: The right file at the right time, K. Veeraraghavan et al., FAST 2010.
18. Nectar: Automatic Management of Data and Computation in Datacenters, P. Gunda et al., OSDI 2010.
19. Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems, A. Leung et al., FAST 2009.
20. Adaptive File Transfers for Diverse Environments, H. Pucha et al., USENIX 2008.
Remote Data Access:
21. Structure and Performance of the Direct Access File System, K. Magoutis et al., USENIX 2002.
22. RFS: Efficient and Flexible Remote File Access for MPI-IO, J. Lee et al., CLUSTER 2004.
23. A toolkit for user-level file systems, D. Mazieres, USENIX 2001.
Global Scale Distributed Testbed Design:
24. Experiences Building PlanetLab, L. Peterson et al., OSDI 2006.
25. Large-scale Virtualization in the Emulab Network Testbed, M. Hibler et al., USENIX 2008.
26. iPlane: An Information Plane for Distributed Services, H. Madhyastha et al., OSDI 2006.