The authors address the challenge of efficiently locating the node that stores a particular data item in a peer-to-peer network. They present Chord, a fully distributed peer-to-peer lookup protocol that supports just one operation: mapping a key to the node responsible for the corresponding item. By building on consistent hashing, along with techniques that provide lookup guarantees, Chord supports concurrent node joins and departures without any centralized control.
An interesting design choice is to provide just one operation; the authors show that it is sufficient as a building block for many other applications. Chord uses a flat key space in which each node is assigned an identifier and is responsible for the keys between its predecessor's identifier and its own. The authors chose consistent hashing combined with a finger table, which slightly increases complexity but enables efficient lookups in O(log n) steps and supports concurrent joins and leaves. Each node also keeps a predecessor pointer to further aid the join and leave mechanisms. Nodes periodically run a stabilization function that repairs stale pointers left by joins and leaves, so that every node eventually becomes reachable.
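The lookup scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: it builds a static ring with global knowledge and omits joins, leaves, and stabilization entirely. The names `Node`, `build_ring`, `chord_id`, and the 6-bit identifier space are assumptions made for illustration.

```python
import bisect
import hashlib

M = 6            # identifier bits (assumption: tiny ring for illustration)
RING = 2 ** M    # size of the circular identifier space


def chord_id(name: str) -> int:
    """Hash a name onto the ring (SHA-1 truncated to M bits; truncation is an assumption)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % RING


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b]; a == b means the whole ring."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b); a == b means everything except a."""
    if a < b:
        return a < x < b
    return x > a or x < b


class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor = self
        self.fingers = []  # fingers[i] = first node at or after (id + 2**i) mod RING

    def closest_preceding_finger(self, key: int) -> "Node":
        """Return the farthest finger that still precedes key on the ring."""
        for finger in reversed(self.fingers):
            if in_open(finger.id, self.id, key):
                return finger
        return self

    def find_successor(self, key: int):
        """Return (node responsible for key, number of hops taken)."""
        node, hops = self, 0
        while not in_half_open(key, node.id, node.successor.id):
            nxt = node.closest_preceding_finger(key)
            node = nxt if nxt is not node else node.successor
            hops += 1
        return node.successor, hops


def build_ring(ids):
    """Build a static ring with correct finger tables (no join/stabilize protocol)."""
    ids = sorted(set(ids))
    nodes = {i: Node(i) for i in ids}

    def owner(key: int) -> Node:
        # First node whose identifier is equal to or follows key, wrapping around.
        pos = bisect.bisect_left(ids, key)
        return nodes[ids[pos % len(ids)]]

    for n in nodes.values():
        n.fingers = [owner((n.id + 2 ** i) % RING) for i in range(M)]
        n.successor = n.fingers[0]
    return nodes, owner


if __name__ == "__main__":
    nodes, owner = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
    target, hops = nodes[8].find_successor(54)
    print(f"key 54 is stored at node {target.id} after {hops} hop(s)")
```

Each hop follows the farthest finger that does not overshoot the key, so the remaining distance to the key's predecessor at least halves per hop. That is the intuition behind the logarithmic lookup cost in this simplified setting.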
However, the authors also state that it is “unclear” whether certain events could produce multiple disjoint cycles, and later state that the stabilization function would not correct for this. This seems to me to be something they should investigate further. Overall, though, Chord comes across as a very stable, robust, completely distributed, and very efficient peer-to-peer system.
(credit: Mike Over)
Dynamo is a distributed storage system that goes all out to provide extremely high availability, which it achieves by sacrificing a certain amount of consistency during failures. It is a highly decentralized, loosely coupled, service-oriented architecture. Dynamo has a configurable quorum policy that allows data to be updated even during partitions or failures. It promises eventual consistency, using vector clocks to detect and reconcile conflicting writes. Much like Chord, Dynamo applies consistent hashing to the keys in the system, which also helps with load distribution. Replication is used to provide high availability. Dynamo uses a quorum-like scheme in which R and W are chosen so that R + W > N, where R is the number of nodes that must participate in a read operation, W is the number of nodes that must participate in a write operation, and N is the number of replicas created for each key.
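To see why the condition R + W > N matters, here is a small sketch that checks by brute force whether every possible read set shares at least one replica with every possible write set. It models strict quorums over a fixed set of N replicas only; it does not model Dynamo's behavior during failures, and the function names are hypothetical.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r intersects every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


def satisfies_quorum_condition(n: int, r: int, w: int) -> bool:
    """The configuration rule from the text: R + W > N."""
    return r + w > n


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3)]:
        print((n, r, w), satisfies_quorum_condition(n, r, w), quorums_overlap(n, r, w))
```

When R + W > N, any read contacts at least one replica that took part in the latest successful write, which is what makes the scheme quorum-like. Lowering R or W trades that overlap for lower latency.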
The paper is interesting as it does not provide a new technique as such but uses an ensemble of pre-existing techniques to create an efficient system. Its availability guarantees are quite impressive as well.
Dynamo aims to keep writes seamless while partly sacrificing reads, since conflicts are resolved during read operations. This may not be the best approach: in a marketplace like Amazon, I would assume that reads usually outnumber writes.
Also, I’m not convinced that client-side resolution of write conflicts is the best approach.
(credit: Guru Prasad)