CAP stands for Consistency, Availability and Partition tolerance
In a consistent system values of any objects don't contradict each other
- Do all applications see the same data?
An available system is always usable
- If some nodes fail, does everything still works?
If two parts of your system cannot communicate to each other, can they proceed on their own?
- if not - sacrifice availability
- if yes - you need to sacrifice consistency
If a system is partition-tolerant, then it can continue to operate even in presence of failures
Theorem: It is impossible to implement a distributed system which will have all three mentioned properties. Only 2 of the 3 is possible to achieve
- Consequently, one of the 3 must be abandoned
- Different databases choose different options:
Consistency and Availability
- But no Partition tolerance
- This is typically preferred by RDBMSs - which is why they usually don't offer scalability
- Easy to achieve ACID under C+A
- Need special algorithms to ensure consistency (like Two-Phase Commit)
Consistency and Partition tolerance
- But no availability
- HBase chooses consistency and partitioning (no availability)
Availability and Partition tolerance