Saturday 19 October 2013

Eventual Consistency


What i got from the Post by Werner Vogels on Consisency

Distribution Transparency : To the user of the system it appears that there is only one system.

Consistency : Client Point of View
4 components :
1)Storage System
2)Process A : A process that writes to and reads from a Storage System
3)Process B&C : 2 process indepandent of process A . Reads and Writes to SS.
Client Side consistency is how and when client sees update made to data object in storage System.
If A updates
1)Strong Consistency : Once update is compleate any subsequent access will return updated value.
2)Weak Consistency : The system does not guarantee that subsequent accesses will return the updated value. A number of conditions need to be met before the value will be returned. Often this condition is the passing of time. The period between the update and the moment when it is guaranteed that any observer will always see the updated value is dubbed the inconsistency window.
3)Eventual Consistency: The storage system guarantees that if no new updates are made to the object eventually (after the inconsistency window closes) all accesses will return the last updated value. The most popular system that implements eventual consistency is DNS, the domain name system. Updates to a name are distributed according to a configured pattern and in combination with time controlled caches, eventually of client will see the update.

Server Side :
N : No of nodes that store replica of data
W:No of node that needs to ack the reciept of update before the update is received
R:The no of nodes that are contacted when data abject accessed through read operation
If W+R > N
Problem here is if 3 nodes are write and 2 fails then system has to fail the write as otherwise system will become unavailable.

In distributed storage systems that need to address high-performance and high-availability the number of replicas is in general higher than 2. Systems that focus solely on fault-tolerance often use N=3 (with W=2 and R=2 configurations). Systems that need to serve very high read loads often replicate their data beyond what is required for fault-tolerance, where N can be tens or even hundreds of nodes and with R configured to 1 such that a single read will return a result. For systems that are concerned about consistency they set W=N for updates.

R=1 W=N optimise Read
R=N W=1 very fast write

No comments:

Post a Comment