Top Critiques for Week 9: PNUTS and Spanner

(in no particular order)

PNUTS is a massively parallel, geographically distributed database system. It was designed by Yahoo and is shared by several of Yahoo's web applications.

PNUTS is designed to

PNUTS follows a significantly simplified relational data model. Records are stored in tablets, and tablets can be assigned to different servers. Records may hold arbitrarily structured data as blobs. The schema is flexible: attributes may be left empty, and adding a new attribute does not block other operations. The primary-key space is split into intervals, and each interval corresponds to an individual tablet.
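The interval-to-tablet mapping described above can be pictured with a short sketch. This is only an illustration under assumed names: the boundary keys, the tablet names, and the function `tablet_for` are hypothetical and are not taken from the paper.

```python
import bisect

# Hypothetical layout: BOUNDARIES[i] is the smallest key owned by TABLETS[i].
# The first boundary is the empty string so that every key falls somewhere.
BOUNDARIES = ["", "g", "n", "t"]
TABLETS = ["tablet-0", "tablet-1", "tablet-2", "tablet-3"]


def tablet_for(key: str) -> str:
    """Return the tablet whose primary-key interval contains `key`."""
    # bisect_right finds the first boundary strictly greater than key;
    # the interval that owns the key starts one position earlier.
    idx = bisect.bisect_right(BOUNDARIES, key) - 1
    return TABLETS[idx]


if __name__ == "__main__":
    for k in ("apple", "grape", "zebra"):
        print(k, "->", tablet_for(k))
```

Because the boundaries are kept sorted, a lookup is a binary search, and moving a tablet to another server only changes where that interval is hosted, not which keys it covers.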

PNUTS aims to provide per-record timeline consistency: all replicas of a record apply updates in the same order. PNUTS uses a pub/sub system, the Yahoo! Message Broker (YMB), as the log for the distributed database; there is no separate redo log. Since YMB is already designed for wide-area replication, it is well suited to handling database replication and providing relaxed consistency guarantees.
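A minimal sketch of the idea, assuming a single ordered log that every replica consumes: replicas may lag behind one another, but they never apply updates in a different order. The classes `OrderedLog` and `Replica` are hypothetical stand-ins for illustration and do not reflect the actual YMB or PNUTS interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class OrderedLog:
    """Hypothetical stand-in for the message broker: an append-only list."""
    entries: list = field(default_factory=list)

    def publish(self, key, value):
        self.entries.append((key, value))


@dataclass
class Replica:
    data: dict = field(default_factory=dict)
    versions: dict = field(default_factory=dict)  # per-record version counter
    cursor: int = 0  # index of the next log entry to apply

    def catch_up(self, log, max_entries=None):
        """Apply pending log entries strictly in log order."""
        end = len(log.entries)
        if max_entries is not None:
            end = min(end, self.cursor + max_entries)
        for key, value in log.entries[self.cursor:end]:
            self.data[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1
        self.cursor = end


if __name__ == "__main__":
    log = OrderedLog()
    for status in ("busy", "away", "online"):
        log.publish("alice", status)

    fast, slow = Replica(), Replica()
    fast.catch_up(log)                 # sees all three updates
    slow.catch_up(log, max_entries=1)  # stale, but on the same timeline
    print(fast.data, slow.data)
```

The slow replica returns an older version of the record, yet that version is one the fast replica also passed through, which is the property that distinguishes a timeline from plain eventual consistency.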

It would have made more sense for Yahoo to measure PNUTS against a competitor's solution such as BigTable, rather than only relative to the basic PNUTS system that uses hash tables. As with most NoSQL solutions, eventual consistency is difficult for a developer to account for while programming.

(credit: Guru Prasad)

PNUTS is Yahoo's answer to its need for a large-scale distributed data store for web applications such as Flickr. PNUTS meets Yahoo's requirements for scalability, good response time, high availability, and fault tolerance.

Dynamo provides the same properties, but with eventual consistency. The authors of the paper present an example in which eventual consistency, chosen for the sake of high availability, is too weak. Eventual consistency does not provide any guarantee about the order in which updates applied at several replicas become visible to the user. It guarantees only that all replicas will eventually be consistent.

To address this problem, PNUTS provides timeline consistency. Yahoo uses a message-brokering service that guarantees totally ordered delivery of messages. This ensures that no replica receives the second update before the first, so the order of updates is consistent across replicas. Another design decision is that records are stored in tables, which are split into tablets.

The experiments show that latency is very high (Fig. 3), and I wonder whether this is a drawback of the system. The other assumptions and observations seem correct to me.

(credit: Anonymous)

Spanner is a global-scale, distributed, multi-version database that uses a novel API called TrueTime to provide externally consistent transactions. It was developed by Google and is currently used to host their advertising data. Spanner is a partially relational database in that it provides ACID properties for its transactions, which other NoSQL databases do not. It also enables users to perform transactions across datacenters with low latency.

A Spanner universe is divided into zones. Each zone contains a zone master, which governs the spanservers that serve data to clients, and location proxy servers, which help clients locate data on the spanservers. Spanner is built on top of Colossus (a file system), where the tablets reside, and each tablet has a Paxos state machine that stores log information and metadata related to the tablet. Data is divided into buckets called 'directories', which are the smallest unit of data placement.

Spanner's key innovation is its use of special hardware (GPS and atomic clocks) to obtain the time, together with the TrueTime API. Spanner acknowledges the uncertainty in time and bounds it by providing methods such as now(), which returns not just a time but a bound specifying the range within which the exact time lies. Spanner leverages this bounded uncertainty to provide features such as lock-free read transactions and external consistency for writes at global scale.
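To make the interval idea concrete, here is a rough sketch of a clock that returns a range instead of a point, and of the "commit wait" rule the Spanner paper builds on it (a writer waits until its chosen timestamp is certainly in the past before making the write visible). Everything here is an assumption for illustration: the names `TTInterval`, `tt_now`, `definitely_before`, and `commit_wait`, the fixed `EPSILON`, and the use of the local system clock in place of GPS and atomic clock hardware.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


EPSILON = 0.005  # assumed uncertainty bound in seconds; illustrative only


def tt_now(clock=time.time, epsilon=EPSILON):
    """Return an interval assumed to contain the true current time."""
    t = clock()
    return TTInterval(t - epsilon, t + epsilon)


def definitely_before(a, b):
    """True only when interval a ends before interval b begins."""
    return a.latest < b.earliest


def commit_wait(commit_ts, clock=time.time, epsilon=EPSILON, sleep=time.sleep):
    """Block until commit_ts is guaranteed to be in the past."""
    while tt_now(clock, epsilon).earliest <= commit_ts:
        sleep(epsilon / 2)


if __name__ == "__main__":
    ts = tt_now().latest  # pick a timestamp no earlier than the true time
    commit_wait(ts)       # wait out the uncertainty before exposing the write
    print("safe to expose write at", ts)
```

Two events whose intervals overlap cannot be ordered from the clock alone, which is why a smaller uncertainty bound translates directly into shorter waits.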

Spanner took a NoSQL database like BigTable and moved it in the RDBMS direction, with key innovations in time semantics. There are no comparison studies yet with similar systems such as PNUTS, and since both are currently proprietary, such a study may be difficult to carry out for now. Spanner also requires servers fitted with special hardware for its time semantics to work.

(credit: Ravi Theja M.)