Vector Clock is a popular data structure for ensuring ordering of events in distributes systems. It is often used for achieving Eventual Consistency

**def:** a vector clock is a tuple $(t_1, ..., t_n)$ of clock values for each node, with $n$ nodes in total.

Notation: for a vector clock $v$, $v[i]$ is the value for node $i$

$v_1 < v_2$ if

- for all $i$: $v_1[i] \leqslant v_2[i]$
- for at least one $j$: $v_1[j] < v_2[j]$

$v_1 < v_2$ implies global time ordering

When data is written to a node $i$, it sets its timestamp $t_i$ to its clock value

Suppose we have two events $e_1$ with vector clock $v_1$, and $e_2$ with vector clock $v_2$

$e_1$ happens *before* $e_2$ if

- $v_1$ < $v_2$

$e_1$ happens *concurrently* to $e_2$ if

- there exist $i, j$ s.t. $v_1[i] < v_2[i]$ and $v_1[j] > v_2[j]$
- (cannot say if $v_1 < v_2$ or $v_2 < v_1$)

$e_1$ happens *after* $e_2$ if

- $v_1$ > $v_2$

- Every replica in a database keeps a list of number of updates it has seen.
- When an update comes, the replica increases its update counter in the vector clock
- and sends the new clock value with the update to other replicas
- if a read returns a conflicting version, application must reconcile the data and put it back to the database

Since the number of nodes may be big, may prefer to have sparse representation, i.e. to store the vector clock as **(node_id, version)** pairs

Here's the Voldemort implementation [1]

class **ClockEntry**

- has 2 fields: nodeId and verstion
- method incremented() returns a new ClockEntry with incremented version (same nodeId)

**VectorClock**

- 2 fields:
- versions: list of ClockEntry classes
- timestamp: time of the last update

- method incrementVersion(nodeId, time)

- finds the clock entry for the given node and increments the version; also updates the timestamp

- method incremented(nodeId, time)

- same as incrementVerstion, but returns a new VectorClock object

- getMaxVersion() traverses versions and returns the max one
- merge with another VectorClock. Creates a new VectorClock in which
- nodes are merged in sorted order (as in Merge Sort)
- if two nodes have the same nodeId, the max version is used

- method compare returns a value from enum
**{BEFORE, AFTER, CONCURRENT}**

Suppose we have 3 servers: $S_x$, $S_y$, $S_z$

- a client writes $D_1$ at $S_x$: $D_1([S_x, 1])$
- another client reads $D_1$ and writes back to $D_2$ (also handled by $S_x$)
- $D_2([S_x, 2])$ (no conflict and $D_1$ may get garbage collected afterwards)

- another client reads $D_2$, writes back $D_3$ handled by $S_y$: $D_3([S_x, 2], [S_y, 1])$
- at the same time someone else also reads $D_2$ and writes back $D_4$ handled by $S_z$
- $D_4([S_x, 2], [S_z, 1])$

- now we have two conflicting versions $D_3$ and $D_4$ (there is no causal relation between these versions, i.e. we cannot decide which one came first - the changes there are different)
- both versions must be kept and presented to the client

- with next read a client gets both $D_3$ and $D_4$, it merges them and writes back.
- this is handled by $S_x$: now the version is $D_4([S_x, 3], [S_y, 1], [S_z, 1])$

How to see if there is a conflict?

Data 1 | Data 2 | Conflict? |

$([S_x, 3], [S_y, 6])$ | $([S_x, 3], [S_z, 2])$ | Yes |

$([S_x, 3])$ | $([S_x, 5])$ | No |

$([S_x, 3], [S_y, 6])$ | $([S_x, 3], [S_y, 6], [S_z, 6])$ | No |

- Introduction to Data Science (coursera)
- Design Patterns for Distributed Nonrelational Databases
- Voldemort vector clock implementation [2]