Eventual Consistency

What is eventual consistency and why would we bother with it? Let’s first start by taking a look at the CAP theorem.

What is a theorem? For that we could search in on-line (just for speed) dictionaries and come up with some of the acceptable definitions. Please note that the most words have different definitions depending on how they are applied. The word “theorem” falls into such category. I am going to use the set of definitions from Dictionary.com. The word is a noun. It has different meanings in mathematics, logic, as a rule or law, and as an idea, belief, method, or statement generally accepted as true without a proof.

What is the CAP Theorem? In this case let’s look at what Wikipedia has to say. Please note that I have edited the search results using additional sources and my experience.

In theoretical computer science, the CAP theorem, also named Brewer’s theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:

Guarantee Brief Description
Consistency A read is guaranteed to return the most recent write for a given client.
Availability Every request receives a (non-error) response – without guarantee that it contains the most recent write.
Partition tolerance The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

In other words, the CAP theorem states that in the presence of a network partition (e.g., a set of Microservices running on different hosts), one has to choose between consistency and availability. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID (Atomicity, Consistency, Isolation, Durability; a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc).

My simple view on the CAP theorem is that of three pillars supporting a plane representing the operation of a distributed store. Allow me to expand on this. In order to have a plane, one needs to have at least three points. Let’s assume that each, Consistency, Availability and Partition tolerance provide one of the points. The theorem (in my opinion heuristic) is that only two points made de made available at a time given us a line and not a plane.

I have worked on storage systems for over a couple decades. Fully understand the concept of clients storing data files which need to be replicated for consistency and to facilitate remote access to other clients that are closer to one of the remote replicas. I have not worked much with HTTP requests that return small amounts of data. I have mostly worked with larger files in the order of a few megabytes to gigabytes in size. The contents of the files are not relevant. In most cases they contained medical images in DICOM format.

I have taken my fair share of mathematics in college. I am a computer scientist, not a mathematician. Last week I read in a book that the CAP theorem has been mathematically proved. I am not able to understand the existence of a mathematical prove for it. I have spent several minutes searching the web without luck. I did send the author of the book a message to see if he is able to provide a copy or a link to a credible paper that supports such claim.

I will be posting in the near future some code written in Java on GitHub with a simple approach to achieve eventual consistency with files when using a distributed (network partitioned) store.

If you have comments or questions regarding this or any other post in this blog, please do not hesitate and leave me a message. I will respond to it as soon as possible.

Enjoy;

John

Follow me on Twitter:  @john_canessa

Leave a Reply

Your email address will not be published. Required fields are marked *