CSE 704: Top Critiques for Week 7

Top Critiques for Week 7: Percolator and ZooKeeper

(in no particular order)

ZooKeeper addresses the need for coordination in distributed systems. It exposes an API on which users can build distributed systems without relying on blocking primitives such as locks. Its target use case is a read-heavy workload, as its strategy for linearizing requests is aimed at this. It is also designed to be wait-free to prevent performance degradation. By limiting the API to a few core components, the authors let applications build fault-tolerant, decentralized distributed applications on top of it. Their implementation accomplishes this reasonably well for certain applications. Without a read-heavy workload, however, much of this benefit is lost, because requests cannot be served from the local replica.

(credit: Jon Logan)

ZooKeeper is a simple coordination service for distributed applications, built so that applications can implement complex services using the small set of operations it provides. The chief design goal is to relieve distributed applications of the work of coordination, so that they need not worry about problems such as deadlocks or race conditions.

The data model of ZooKeeper is very similar to that of a hierarchical file system, but differs in that each node of the ZooKeeper namespace (called a znode) can hold data of its own as well as have children. ZooKeeper also uses a concept called watches, which are one-time triggers associated with znodes that fire when a znode changes its state, so clients learn of changes without polling. Conditional updates, by contrast, rely on the version number kept with each znode. ZooKeeper also provides an easy-to-use client API that supports only basic operations, from which higher-level operations can be built. This can be viewed as a limitation, because even for a moderately complex operation the user has to write it on top of these primitives.
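To make the data model concrete, here is a minimal single-process sketch in Python. It is a toy stand-in, not the real ZooKeeper client: the names `ToyZooKeeper`, `ZNode`, `create`, `get_data`, and `set_data` are hypothetical and chosen only for illustration. It shows a znode that carries data and children, a watch that fires once, and a version-checked conditional update. Treating a version of -1 as "skip the check" follows the convention described in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ZNode:
    """Toy znode: holds its own data and may also have children."""
    data: bytes = b""
    version: int = 0
    children: Dict[str, "ZNode"] = field(default_factory=dict)
    watches: List[Callable[[str], None]] = field(default_factory=list)


class ToyZooKeeper:
    """Hypothetical in-memory stand-in; no replication, sessions, or ordering."""

    def __init__(self) -> None:
        self.root = ZNode()

    def _lookup(self, path: str) -> ZNode:
        node = self.root
        for part in filter(None, path.split("/")):
            node = node.children[part]  # KeyError if the path does not exist
        return node

    def create(self, path: str, data: bytes = b"") -> None:
        parent_path, _, name = path.rpartition("/")
        parent = self._lookup(parent_path)
        if name in parent.children:
            raise KeyError(f"{path} already exists")
        parent.children[name] = ZNode(data=data)

    def get_data(
        self, path: str, watch: Optional[Callable[[str], None]] = None
    ) -> Tuple[bytes, int]:
        node = self._lookup(path)
        if watch is not None:
            node.watches.append(watch)  # registered for the next change only
        return node.data, node.version

    def set_data(self, path: str, data: bytes, expected_version: int) -> bool:
        node = self._lookup(path)
        # Conditional update: refuse the write if the caller's version is stale.
        if expected_version != -1 and expected_version != node.version:
            return False
        node.data = data
        node.version += 1
        # Watches are one-time triggers: clear them before notifying.
        fired, node.watches = node.watches, []
        for callback in fired:
            callback(path)
        return True
```

A client would read a znode together with its version, then pass that version back on write; a concurrent change makes the write fail instead of silently overwriting, and a registered watch is delivered only for the first change after it was set.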

Finally, the paper presents ZooKeeper deployments inside Yahoo! (where it was developed) and evaluates its throughput and latency on a cluster of 50 servers. The key drawback I found is that the paper doesn't talk about the transaction logs but leaves them for the developer who uses the system to maintain; this might prove costly if a need arises to debug the system at some point.

(credit: Ravi Theja M.)

Percolator is a system for incrementally processing updates to a large data set; Google deployed it to create its web search index. By replacing a batch-based indexing system with one based on incremental processing using Percolator, Google processes the same number of documents per day while reducing the average age of documents in search results by half.

Good points:

Despite MapReduce's scalability, rerunning a MapReduce pipeline on each small batch of updates results in unacceptable latency and wasted work. Overlapping or pipelining adjacent stages can reduce latency, but straggler shards still set the minimum time to complete the pipeline. Percolator avoids the expense of repeated scans by, essentially, creating indexes on the keys used to cluster documents, whereas MapReduce does not support such indexes.
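The contrast between rescanning and index maintenance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Percolator's design: it has no transactions, observers, or distribution, and `cluster_key` (a content hash), `batch_cluster`, and `IncrementalClusters` are hypothetical names. The batch function touches every document on each run, while the incremental version keeps an index on the clustering key and touches only the entries affected by one update.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Set


def cluster_key(content: str) -> str:
    """Assumed clustering key: documents with identical content share a key."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def batch_cluster(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Batch style: rescan the whole repository to rebuild every cluster."""
    clusters: Dict[str, Set[str]] = defaultdict(set)
    for url, content in docs.items():
        clusters[cluster_key(content)].add(url)
    return dict(clusters)


class IncrementalClusters:
    """Incremental style: maintain an index from clustering key to URLs."""

    def __init__(self) -> None:
        self.key_of: Dict[str, str] = {}                    # url -> current key
        self.index: Dict[str, Set[str]] = defaultdict(set)  # key -> urls

    def update(self, url: str, content: str) -> None:
        new_key = cluster_key(content)
        old_key = self.key_of.get(url)
        if old_key == new_key:
            return  # content unchanged, nothing to do
        if old_key is not None:
            # Remove the URL from its old cluster; drop the cluster if empty.
            self.index[old_key].discard(url)
            if not self.index[old_key]:
                del self.index[old_key]
        self.index[new_key].add(url)
        self.key_of[url] = new_key

    def clusters(self) -> Dict[str, Set[str]]:
        return {key: set(urls) for key, urls in self.index.items()}
```

Both approaches produce the same clusters; the difference is that a single changed document costs one index update here rather than a full pass over the data.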

Weak points:

The authors chose an architecture that scales linearly over many orders of magnitude on commodity machines, but they say this costs a significant 30-fold overhead compared to traditional database architectures. How much of this overhead is fundamental to distributed storage systems, and how much can be optimized away, should be explored further.

(credit: Anonymous)