Database Transaction Log Exercises

Exercises [1] and solutions [2]

Materials:


Exercise 1

Consider the following log:

  • $\langle T, \text{start} \rangle; \langle T, A, 10 \rangle; \langle T, B, 20 \rangle; \langle T, \text{commit} \rangle$
  • and events: Output($A$), Output($B$), Flush-Log($A$), Flush-Log($B$), Commit
  • what sequences of events are valid for this log?

Undo Logging

According Undo Logging#Undo Logging Rules, we have the following constraints (< = "occurs before")

  • Flush-Log($A$) < Output($A$) and Flush-Log($B$) < Output($B$)
    • should always flush before writing the modified item to disk
  • Flush-Log($A$) < Flush-Log($B$) < Commit
    • assume DB items are flushed on disk in the order they are created $\to$ flushing logs for $A$ before logs for $B$
    • both have to be flushed before we write the commit record
  • Output($A$) < Commit and Output($B$) < Commit
    • should write the modified database element to disk before the commit

Hence, the following sequences are legal:

  • Flush-Log($A$), Output($A$), Flush-Log($B$), Output($B$), Commit
  • Flush-Log($A$), Flush-Log($B$), Output($A$), Output($B$), Commit
  • Flush-Log($A$), Flush-Log($B$), Output($B$), Output($A$), Commit


Redo Logging

According Redo Logging#Redo Logging Rules, we have the following constraints (< = "occurs before")

  • Flush-Log($A$) < Output($A$) and Flush-Log($B$) < Output($B$)
    • same as for Undo Logging: should always flush before writing the modified item to disk
  • Flush-Log($A$) < Flush-Log($B$) < Commit
    • same as for Undo Logging: assume DB items are flushed on disk in the order they are created
    • both have to be flushed before we write the commit record
  • Commit < Output($A$) and Commit < Output($B$)
    • in contract to Undo Logging: can write modifications on disk only after the commit record was flushed

Hence, the following sequences are legal:

  • Flush-Log($A$), Flush-Log($B$), Commit, Output($A$), Output($B$)
  • Flush-Log($A$), Flush-Log($B$), Commit, Output($B$), Output($A$)


Undo/Redo Logging

According Undo/Redo Logging#Rules, we have the following constraints (< = "occurs before")

  • Flush-Log($A$), Output($A$) and Flush-Log($B$), Output($B$)
    • log records should appear before a database item is modified on disk
  • Flush-Log($A$) < Flush-Log($B$) < Commit
    • logs must be flushed before the commit record

This gives mush more sequences

  • Flush-Log($A$); Output($A$); Flush-Log($B$); Output($B$); Commit
  • Flush-Log($A$); Flush-Log($B$); Output($A$); Output($B$); Commit
  • Flush-Log($A$); Flush-Log($B$); Output($B$); Output($A$); Commit
  • Flush-Log($A$); Flush-Log($B$); Commit; Output($A$); Output($B$)
  • Flush-Log($A$); Flush-Log($B$); Commit; Output($B$); Output($A$)
  • Flush-Log($A$); Output($A$); Flush-Log($B$); Commit; Output($B$)
  • Flush-Log($A$); Flush-Log($B$); Output($A$); Commit; Output($B$)
  • Flush-Log($A$); Flush-Log($B$); Output($B$); Commit; Output($A$)


Exercise 2.1

Events

  • Start transaction $T$
  • $T$ modifies $A \leftarrow 11$ (was 10)
  • Start transaction $U$
  • Failure occurs

Questions:

  • what values might have changed?
  • how to recover?


Undo Logging

Log

$\langle T, \text{start} \rangle$
$\langle T, A, 10 \rangle$
$\langle U, \text{start} \rangle$
FAILURE

$A$ might have changed its value

Recall the rule:

  • scan backwards from the end to the beginning
  • undo things that have not committed

Recovering (while scanning backwards):

  1. see $\langle U, \text{start} \rangle$
    • we've successfully undone it
    • so write $\langle U, \text{abort} \rangle$
  2. see the modification $\langle T, A, 10 \rangle$
    • write 10 to $A$ (rewriting the old value back)
  3. see $\langle T, \text{start} \rangle$
    • we've successfully undone it
    • so write $\langle T, \text{abort} \rangle$


Redo Logging

Log

$\langle T, \text{start} \rangle$
$\langle T, A, 11 \rangle$
$\langle U, \text{start} \rangle$
FAILURE

$A$ cannot have changed its value - the crash occurred

  • no commit record $\langle T, \text{commit} \rangle$ was written to disk
  • no need to redo it

Recall the rule:

  • scan from the beginning to the end
  • redo things that have committed but were not flushed to disk

Recovering (while scanning forwards):

  1. see $\langle T, \text{start} \rangle$
    • write $\langle T, \text{abort} \rangle$
  2. see the modification $\langle T, A, 11 \rangle$
    • do nothing - it has not changed the value on disk
  3. see $\langle U, \text{start} \rangle$
    • write $\langle U, \text{abort} \rangle$


Undo/Redo Logging

Log

$\langle T, \text{start} \rangle$
$\langle T, A, 10, 11 \rangle$
$\langle U, \text{start} \rangle$
FAILURE

$A$ might have changed its value

  • recover in the same way as for Undo Logging

Recovering (while scanning backwards):

  1. see $\langle U, \text{start} \rangle$
    • we've successfully undone it
    • so write $\langle U, \text{abort} \rangle$
  2. see the modification $\langle T, A, 10, 11 \rangle$
    • write 10 to $A$ (rewriting the old value back)
  3. see $\langle T, \text{start} \rangle$
    • we've successfully undone it
    • so write $\langle T, \text{abort} \rangle$


Exercise 2.2

Events

  • Start transaction $T$
  • $A \leftarrow 11$ (was 10) by $T$
  • Start transaction $U$
  • $B \leftarrow 21$ (was 20) by $U$
  • $C \leftarrow 31$ (was 30) by $T$
  • $D \leftarrow 41$ (was 40) by $U$
  • $U$ commits
  • Failure occurs

Questions:

  • what values might have changed?
  • how to recover?


Undo Logging

Log

$\langle T, \text{start} \rangle$
$\langle T, A, 10 \rangle$
$\langle U, \text{start} \rangle$
$\langle U, B, 20 \rangle$
$\langle T, C, 30 \rangle$
$\langle U, D, 40 \rangle$
$\langle U, \text{commit} \rangle$
FAILURE

Only transaction $T$ has to be undone

  • $U$ committed

Changed values:

  • $A$ and $C$ might have changed the values (don't see $\langle T, \text{commit} \rangle$)
  • $B$ and $D$ have changed the values (see $\langle U, \text{commit} \rangle$)


Recovering (while scanning backwards):

  1. see $\langle U, \text{commit} \rangle$
    • ignoring changes of $U$ altogether
  2. see $\langle U, D, 40 \rangle$
    • skipping it
  3. see $\langle T, C, 30 \rangle$
    • write value 30 back to $C$
  4. see $\langle U, B, 20 \rangle$ and $\langle U, \text{start} \rangle$
    • skipping it
  5. see $\langle T, A, 10 \rangle$
    • write value 10 back to $C$
  6. see $\langle T, \text{start} \rangle$
    • we've successfully undone it
    • so write $\langle T, \text{abort} \rangle$


Redo Logging

Log

$\langle T, \text{start} \rangle$
$\langle T, A, 11 \rangle$
$\langle U, \text{start} \rangle$
$\langle U, B, 21 \rangle$
$\langle T, C, 31 \rangle$
$\langle U, D, 41 \rangle$
$\langle U, \text{commit} \rangle$
FAILURE

Need to redo only transaction $U$

  • only $U$ has a commit record $\langle U, \text{commit} \rangle$
  • $T$ has not committed - no need to redo it

Changed values

  • $A$ and $C$ have not changed ($T$ has not committed)
  • $B$ and $D$ might have changed ($U$ has committed)

Recovering (while scanning forwards):

  1. $\langle T, \text{start} \rangle$
    • we know that we ignore changes of $T$ - ignoring it
  2. $\langle T, A, 11 \rangle$
    • skipping
  3. $\langle U, \text{start} \rangle$
  4. $\langle U, B, 21 \rangle$
    • writing 21 to $B$
  5. $\langle T, C, 31 \rangle$
    • skipping
  6. $\langle U, D, 41 \rangle$
    • writing 41 to $D$
  7. $\langle U, \text{commit} \rangle$
  8. write $\langle T, \text{abort} \rangle$ to log
    • don't write $\langle U, \text{abort} \rangle$ to log


Undo/Redo Logging

Log

$\langle T, \text{start} \rangle$
$\langle T, A, 10, 11 \rangle$
$\langle U, \text{start} \rangle$
$\langle U, B, 20, 21 \rangle$
$\langle T, C, 30, 31 \rangle$
$\langle U, D, 40, 41 \rangle$
$\langle U, \text{commit} \rangle$
FAILURE


Changed values:

  • everything might have changed
    • we don't know whether it was before or after the commit when the database elements were supposed to be modified
    • so need to undo all not-committed transactions
    • and redo all committed transactions
  • all $A$, $B$, $C$, $D$ might have changed their values


  • $U$ must be redone (it has the commit record $\langle U, \text{commit} \rangle$)
  • $T$ must be undone (it doesn't have a commit record $\langle T, \text{commit} \rangle$)


Recovering:

forwards (redoing) - same as for Redo Logging:

  • ignoring changes of $T$
  • write 21 to $B$
  • write 41 to $D$

backwards (undoing) - same as for Undo Logging

  • ignore changes of $U$
  • write 30 to $C$
  • write 10 to $A$
  • write $\langle T, \text{abort} \rangle$ to log


Exercise 3

Question

  • for a given log, where a $\langle \text{end ckpt} \rangle$ can be added?
  • what happens if a crash occurs?


Undo Logging

Consider this log

$\langle S, \text{start} \rangle$
$\langle S, A, 60 \rangle$
$\langle S, \text{commit} \rangle$
$\langle T, \text{start} \rangle$
$\langle T, A, 10 \rangle$
$\langle \text{start ckpt} \rangle$
  • here it should identify the active transactions
  • hence it's $\langle \text{start ckpt} (T) \rangle$ ($S$ already committed)
  • we can add $\langle \text{end ckpt} \rangle$ only once $T$ commits
$\langle U, \text{start} \rangle$
$\langle U, B, 20 \rangle$
$\langle T, C, 30 \rangle$
$\langle V, \text{start} \rangle$
$\langle U, D, 40 \rangle$
$\langle V, F, 70 \rangle$
$\langle U, \text{commit} \rangle$
$\langle T, E, 50 \rangle$
$\langle T, \text{commit} \rangle$
  • since $T$ has committed, here can add $\langle \text{end ckpt} \rangle$
$\langle V, B, 80 \rangle$
$\langle V, \text{commit} \rangle$


Recovery:

  • depends on whether we first meet $\langle \text{end ckpt} \rangle$ or $\langle \text{start ckpt} \rangle$
  • if $\langle \text{end ckpt} \rangle$ - go backwards till $\langle \text{start ckpt} (T) \rangle$
  • if $\langle \text{start ckpt} (T) \rangle$ - then go backwards till $\langle T, \text{start} \rangle$ ($T$ was the only active transaction when the checkpoint started)


Redo Logging

Consider this log

$\langle S, \text{start} \rangle$
$\langle S, A, 61 \rangle$
$\langle S, \text{commit} \rangle$
$\langle T, \text{start} \rangle$
$\langle T, A, 11 \rangle$
$\langle \text{start ckpt} \rangle$
  • we keep that on the transactions that have committed at this point, but have not yet flushed the modifications to disk: we wait until they do that
  • only after that we end the checkpoint (put $\langle \text{end ckpt} \rangle$)
  • The only transaction that has committed is $S$, so the $\langle \text{end ckpt} \rangle$ can occur anywhere after this record: we cannot predict when exactly the dirty blocks will be written to disk
  • the log record actually is $\langle \text{start ckpt} (T) \rangle$ since $T$ is the only active transaction at this point
$\langle U, \text{start} \rangle$
$\langle U, B, 21 \rangle$
$\langle T, C, 31 \rangle$
$\langle V, \text{start} \rangle$
$\langle U, D, 41 \rangle$
$\langle V, F, 71 \rangle$
$\langle U, \text{commit} \rangle$
$\langle T, E, 51 \rangle$
$\langle T, \text{commit} \rangle$
$\langle V, B, 81 \rangle$
$\langle V, \text{commit} \rangle$

Recovery

  • also depends when the crash occurs
  • $\langle \text{end ckpt} \rangle$ is last, then we know that $S$ was written fully
    • transactions that we active at start or started later must be redone
    • these transactions are $T$, $U$ and $V$
  • $\langle \text{start ckpt} (S) \rangle$ is last - need to go to the previous checkpoint end


Undo/Redo Logging

For this we do the same reasoning as for Redo Logging

$\langle S, \text{start} \rangle$
$\langle S, A, 60, 61 \rangle$
$\langle S, \text{commit} \rangle$
$\langle T, \text{start} \rangle$
$\langle T, A, 10, 11 \rangle$
$\langle \text{start ckpt} \rangle$
  • again cannot predict where exactly to put $\langle \text{end ckpt} \rangle$
  • it will be written once $S$ flushes its modifications to disk
$\langle U, \text{start} \rangle$
$\langle U, B, 20, 21 \rangle$
$\langle T, C, 30, 31 \rangle$
$\langle V, \text{start} \rangle$
$\langle U, D, 40, 41 \rangle$
$\langle V, F, 70, 71 \rangle$
$\langle U, \text{commit} \rangle$
$\langle T, E, 50, 51 \rangle$
$\langle T, \text{commit} \rangle$
$\langle V, B, 80, 81 \rangle$
$\langle V, \text{commit} \rangle$

Recovery

  • if last is $\langle \text{end ckpt} \rangle$
    • all dirty blocks were written to disk
    • need to redo only active transactions from the moment when $\langle \text{start ckpt} \rangle$ occurred (i.e. $T$)
    • for that may need to go further - to $\langle T, \text{start} \rangle$
  • if last is $\langle \text{start ckpt} (T) \rangle$
    • go back to the previously successfully completed checkpoint (or the beginning of the log)


Sources

Machine Learning Bookcamp: Learn machine learning by doing projects. Get 40% off with code "grigorevpc".

Share your opinion