ML Wiki

Concurrency Control

A database typically serves multiple users at the same time

• the goal in this case is to make an impression that everything works in isolation
• Concurrency Control deals with that: it ensures that transactions have the same effect as if they were run in isolation

Conflicts

Problem:

• one transaction $T_1$ is reading a DB item that was written by another transaction $T_2$
• without $T_2$ having completed
• this results in Write-Read Conflict

Illustrative Example

• suppose there are two transactions $T_1$ and $T_2$
• $T_1$ transfers 100 USD from one account to another
• $T_2$ gives 6% interest rate to all accounts
$T_1$ $T_2$
Read($A, s$) read from $A$ to main-memory variable $s$
$s \leftarrow s - 100$ withdraw 100 USD from $A$
Write($A, s$) persist the result to disk
Read($A, t$)
$t \leftarrow t \times 1.06$ give 6% interest rate to $A$
Write($A, t$) note that at this moment the money hasn't appeared at $B$ yet
Read($B, t$)
$t \leftarrow t \times 1.06$ give 6% interest rate to $B$
Write($B, t$)
Read($B, s$)
$s \leftarrow s + 100$ put 100 USD to $B$
Write($B, s$) persist the result to disk

Assume both $A$ and $B$ have 100 USD

• $A$ won't get any interest (no money at the moment of interest calculation)
• $B$ will get interest only on 100 USD instead of 200 USD

Reason:

• two transactions interleaved,
• it should be either
• first all actions of $T_1$ and then all actions of $T_2$ or
• first all actions of $T_2$ and then all actions of $T_1$

Write-Write Conflict

Problem:

• both transactions are just writing new values without reading the old ones

Example

• $T_1$ puts 100 USD to both $A$ and $B$
• $T_2$ puts 200 USD to both $A$ and $B$
$T_1$ $T_2$
$s \leftarrow 100$
Write($A, s$)
$t \leftarrow 200$
Write($A, t$)
$t \leftarrow 200$
Write($B, t$)
$s \leftarrow 100$
Write($B, s$)

We want the transactions to run in some sequence

• as if they were not allowed to interleave

Not consistent state:

• $B$ has 100 USD, $A$ has 200

Consistent state:

• both have 200 USD (first $T_1$ then $T_2$)
• both have 100 USD (first $T_2$ then $T_1$)

Scheduling

• The Scheduler is responsible for creating an impressions that all transactions are run in isolation

Distributed Concurrency

• Multi-Master
• Master/Slave
• Partitioning
• Sharding
• Write thought Caches

Sources

Machine Learning Bookcamp: Learn machine learning by doing projects. Get 40% off with code "grigorevpc".