Intro

Each node in a Logical Query Plan may be executed in several ways

  • there is no single implementation that is always better than the others
  • so we need to compare alternatives based on their costs (in I/O Model of Computation it's # of I/O operations)

Statistics

To estimate a cost we use the following statistics from Database System Catalog:

  • $B(R)$ - # of blocks that relation $R$ holds
  • $T(R)$ - # of tuples in $R$
    • typically can be used to calculate $B(R)$ when we know how many bytes we have per block
  • $V(R, A_1, ..., A_n) = | \delta \pi_{A_1, ..., A_n} (R) |$ - # of distinct values

Parameters

And also we use

  • $M$ - number of available main memory buffers

To simplify we suppose that the situation is ideal

  • all buffers of Buffer Manager are available
  • there are no other operations that concurrently claim the space


Operators Overview

Type $\downarrow$ Bag Union Set Union Set Intersection Bag Intersection Set Difference Bag Difference Join
One-Pass
Sort-Based
Hash-Based not here not here not here not here

Also:


Bag Union

$R \cup_B S$

  • all elements from both
  • no need to remove duplicates
  • for this operation just need to use one buffer block
  • load elements to the block and then just output

Algorithm

  • for each block $B_R \in R$
    • load $B_R$ to buffer $N$
    • for each tuple $t_R \in N$
      • output $t_R$
  • for each block $B_S \in S$
    • load $B_S$ to buffer $N$
    • for each tuple $t_S \in N$
      • output $t_S$

Cost

  • $B(R) + B(S)$
  • we don't count output
  • and we must have some available buffers: $M \geqslant 1$


Set Union

$R \cup_B S$

  • this time we care about duplicates
  • assume that $R$ is smaller than $S$

One-Pass Set Union

$R$ fits into memory:

  • We assume that we have $B(R) + 1$ available buffers
  • i.e. $R$ can fit into memory and after that there's at least one remaining available buffer

Idea:

  • load all elements of $R$
  • using 1 extra buffer go through all blocks of $S$
  • output element if only it's not in $R$

Algorithm

  • load $R$ to $N_1, ..., N_{B(R)}$ buffers
  • for each tuple $t_R \in \cup_i N_i$
    • output $t_R$
  • for each block $B_S \in S$
    • load $B_S$ to $N_0$
    • for each tuple $t_S \in N_0$
      • if $t_S \not \in \cup_i N_i$, output $t_S$

Cost

  • $B(R) + B(S)$
  • pass once through $R$ and once through $S$
  • we ignore the cost of searching in memory - interested only in I/O cost

Problem:

  • We can do it only when one of the relations fin into memory
  • usually they don't fit!


Sort-Based Set Union

what if there's no enough memory available?

Algorithm:

Synchronous Iteration:

  • load block of $R$ to $N_R$, block of $S$ to $N_S$
  • iterate over tuples $t_R \in N_R$ and $t_S \in N_S$ synchronously
    • if $t_R < t_S$
      • output $t_R$
      • move $t_R$ pointer to the next tuple in $R$ (load next block if needed)
    • if $t_R > t_S$
      • output $t_S$
      • move $t_S$ pointer to the next tuple in $S$ (load next block if needed)
    • if $t_R = t_S$
      • output $t_S$
      • move both $t_R$ and $t_S$

Cost:

  • $2 B(R) \lceil \log_M B(R) \rceil$: cost of sorting $R$
  • $2 B(S) \lceil \log_M B(S) \rceil$: cost of sorting $S$
  • $B(R) + B(S)$ to iterate synchronously to output the results


Optimization

  • Synchronous Iteration of Sort-Based Union is very similar to the merge phase of External Merge Sort
  • Sometimes we can combine them - and avoid doing the last pass of Merge Sort!

Algorithm:

  • Sort $R$, but don't execute the last merge phase
    • we know that after that $R$ is divided into $l$ sorted lists
    • $1 < l \leqslant M$
  • Sort $S$, but don't execute the last merge phase
    • we know that after that $S$ is divided into $k$ sorted lists
    • $1 < k \leqslant M$
  • if $l + k \leqslant M$ then we can apply the optimization
    • because there are enough buffers available to synchronously iterate through both set of sub-results


Cost:

  • $2 B(R) \big( \lceil \log_M B(R) \rceil - 1 \big)$: cost of sorting $R$ without last pass
  • $2 B(S) \big( \lceil \log_M B(S) \rceil - 1 \big)$: cost of sorting $S$ without last pass
  • $B(R) + B(S)$ to iterate synchronously to output the results
  • or $2 B(R) \lceil \log_M B(R) \rceil + 2 B(S) \lceil \log_M B(S) \rceil - B(R) - B(S)$
  • we save $B(R) + B(S)$ I/Os!


Note that

  • this optimization is only possible if $k + l \leqslant M$
  • can calculate $k$ and $l$ as
    • $l = \left\lceil \cfrac{B(R)}{ M^{\lceil \log_M B(R) \rceil - 1} } \right\rceil $ - # of passes to sort $R$ - 1
    • $k = \left\lceil \cfrac{B(S)}{M^{\lceil \log_M B(S) \rceil - 1}} \right\rceil$ - # of passes to sort $S$ - 1

it's usually sufficient to have only 2 passes for sorting

  • in this case can apply optimization if
  • $\left\lceil \cfrac{B(R)}{M} \right\rceil + \left\lceil \cfrac{B(S)}{M} \right\rceil \leqslant M$ or
  • $B(R) + B(S) \leqslant M^2$
  • cost in this case is $3B(R) + 3B(S)$ with optimization
  • $5B(R) + 5B(S)$ without optimization


Example

Suppose $M = 15, B(R) = 100, B(S) = 120$

  • to sort $R$ need $\lceil \log_M B(R) \rceil = 2$ passes
  • to sort $S$ need $\lceil \log_M B(S) \rceil = 2$ passes
  • $l = \left\lceil \cfrac{100}{15^2} \right\rceil$
  • $k = \left\lceil \cfrac{120}{15^2} \right\rceil$
  • $l + k = 15 \leqslant M$, therefore we can apply optimization
  • cost is $2 \cdot 100 \cdot 2 + 2 \cdot 120 \cdot 2 - 100 - 120 = 660$


Hash-Based Set Union

Main idea: we want to partition both $R$ and $S$ in such a way that

  • if a tuple appears in some bucket from $R$ it should appear in the corresponding bucket from $S$
  • each bucket contains no more that $M - 1$ blocks
  • so it is possible to apply One-Pass Set Union to each bucket

Record distribution

  • $B(R) < B(S)$ - $R$ is smaller than $S$
  • we suppose that we can partition $R$ in $k$ buckets $R_i$
    • to do that we apply some hash function $h$
  • and then distribute tuples from $S$ also into $k$ buckets $S_i$
    • also by applying $h$
  • all records in $R_i$ and $S_i$ ended up in the bucket $i$ because they have the same hash value
    • if there's a record that occurs both in $R_i$ and $S_i$, it's a duplicate
    • and we need to consider only these buckets, this record cannot appear in buckets other than $i$


Computing Set Union

  • to compute set union we compute unions for all buckets $i$: $R_i \cup_S S_i$
  • since $R_i$ contains at most $M - 1$ block can do that in one pass


Partitioning $R$

How to partition $R$ in blocks of size at most $M - 1$?

Algo

  • first pass
    • we load each block of $R$ into buffer $N_0$
    • and we have $M - 1$ remaining buffers - we will use them as buckets
    • for each tuple $t_R \in R$ we calculate $h(t_R)$ to find which bucket it belongs to
    • once a bucket buffer is full, we flush it to disk to some block, empty the buffer and continue
    • $\Rightarrow$ after first pass we'll have $M - 1$ buckets of $\cfrac{B(R)}{M - 1}$ blocks (assuming $h$ distributes records uniformly)
  • 2+ passes
    • if there are buckets that have more that $M - 1$ blocks we need to hash them again
    • but this time with another hash function $h'$ (otherwise the old $h$ will just put all the tuples back to the same bucket)
    • so we repeat the first pass again, but for each overfull bucket separately
    • $\Rightarrow$ after second pass we'll have $(M - 1)^2$ buckets of $\cfrac{B(R)}{(M - 1)^2}$ blocks
  • continue this process until all buckets have no more than $M - 1$ blocks
    • $\Rightarrow$ after $k$ passes we'll have $(M - 1)^k$ buckets of $\cfrac{B(R)}{(M - 1)^k}$ blocks


One level of partitioning is usually enough

  • so we need two passes
  • 1st: to partition
  • 2nd: to do pair-wise single pass unions of buckets
  • it's sufficient when $\cfrac{B(R)}{M - 1} \leqslant M - 1$ or $\approx B(R) \leqslant M^2$

Cost of partitioning

  • $2B(R) \underbrace{\lceil log_{M - 1} B(R) - 1 \rceil}_\text{# of passes}$:
  • at each pass we read and write $R$ once
  • so for one pass it's just $2B(R)$


Cost

Total cost of partitioning

  • $2B(R) \lceil \log_{M - 1} B(R) - 1 \rceil$ for $R$
  • $2B(S) \lceil \log_{M - 1} B({ \color{blue}{R} }) - 1 \rceil$ for $S$
    • note that number of passes for $S$ is the same as of $R$
  • $B(R) + B(S)$ one pass union for each bucket
  • if one pass is sufficient, then the total cost is $3B(R) + 3B(R)$


Set Intersection

$R \cap_S S$

  • assume $R$ is smaller than $S$

One-Pass Set Intersection

Essentially same as #One-Pass Set Union

  • if $R$ is small enough to fit into $M - 1$ buffers

Algorithm

  • load $R$ to $K_R = N_1, ..., N_{B(R)}$ buffers
  • for each block $B_S \in S$
    • load $B_S$ to $N_0$
    • for each tuple $t_S \in N_0$
      • if $t_S \in K_R$, output $t_S$

Sort-Based Set Intersection


Bag Intersection

$R \cap_B S$

  • assume $R$ is smaller than $S$

One-Pass Bag Intersection

Essentially same as #One-Pass Set Union

  • if $R$ is small enough to fit into $M - 1$ buffers
  • but for each distinct value we associate a count - number of times this tuple occurred
    • generally, this structure can take more that $M - 1$ memory buffer if there are few duplicates
    • but if there are a lot of duplicates - this way of organizing will take less room than $M - 1$


Algorithm

  • load $R$ to $K_R = N_1, ..., N_{B(R)}$ buffers
  • for each block $B_S \in S$
    • load $B_S$ to $N_0$
    • for each tuple $t_S \in B_S$
      • if $t_S \in K_R$
        • output $t_S$
        • decrease count of $t_S$ in $K_R$
        • if count = 0 then remove this tuple from memory


Sort-Based Bag Intersection


Set Difference

Note that this operation is not commutative

  • $S -_S R$ is not the same as $R -_S S$
  • assume that $R$ is smaller than $S$

One-Pass Set Difference

For this we assume $R$ fits into $M-1$ blocks


$S -_S R$ case

  • read $R$ into buffers $K_R = N_1, ..., N_{B(R)}$
  • for each block $B_S \in S$
    • load $B_S$ into $N_0$
    • for each tuple $t_S \in B_S$
      • if $t_S \not \in K_R$ output $t_S$


$R -_S S$ case

  • read $R$ into buffers $K_R = N_1, ..., N_{B(R)}$
  • for each block $B_S \in S$
    • load $B_S$ to $N_0$
    • for each tuple $t_S \in B_S$
      • if $t_S \in K_R$ remove $t_S$ from $K_R$
  • for each tuple $t_R$ that is still in $K_R$ output $t_R$

Sort-Based Set Difference


Bag Difference

  • same as in #Bag Intersection: we count the number of occurrences
  • also two cases: $S -_B R$ and $R -_B S$

One-Pass Bag Difference

For this we assume $R$ fits into $M-1$ blocks


$S -_B R$ case: count $c$ in this case is $c$ reasons not to output a tuple

  • read tuples of $R$ to $K_R = N_1, ..., N_{B(R)}$, count the number of occurrences
  • load each block $B_S$ to $N_0$
    • for each tuple $t_S \in B_S$
      • if $t_S \not \in K_R$: output $t_S$
      • otherwise
        • decrement count of $t_S$ and
        • remove $t_S$ from $K_S$ if count = 0


$R -_B S$ case

  • read tuples of $R$ to $K_R = N_1, ..., N_{B(R)}$, count the number of occurrences
  • load each $B_S$ to $N_0$
    • if $t_S \in K_R$
      • decrement count for $t_S$ in $K_R$
      • remove $t_S$ from $K_R$ if count = 0
  • for each remaining $t_R \in K_R$
    • output $t_R$ count times

Sort-Based Bag Difference

  • same as #Sort-Based Set Union
  • for each tuple $t$
    • let $c_R$ = number of times $t$ appears in $R$
    • let $c_S$ = number of times $t$ appears in $S$
    • if $c = c_R - c_S \leqslant 0$ don't output anything
    • otherwise output $t$ $c$ times


Join

Join is most costly operator to evaluate

  • sometimes it's quadratic (when it's equivalent to cartesian product)

Suppose we want to evaluate $R(X, Y) \Join S(Y, Z)$

  • $Y$ is a matching attribute and we will join on it
  • we again assume that $B(R) < B(S)$

One-Pass Join

To be able to do it in one pass $R$ must fit into memory

  • $R(B) \leqslant M -1$

Algo

  • load $R$ into buffers $N_1, ..., N_{B(R)}$
  • for each block $B_S \in S$
    • load $B_S$ to $N_0$
    • for each tuple $t_S \in N_0$
      • for each matching tuple $t_R \in \U_i N_i$
      • output $t_R \Join t_S$

Cost

  • $B(R) + B(S)$
  • again ignore cost of finding matching tuple in memory


Nested Loop Join

Tuple-Based Nested Join Loop

First variant is tuple-based nested join loop

  • for each $r \in R$
  • for each $s \in S$
    • if $r$ matches $s$ output $r \Join s$

We need only one buffer for $R$ and one buffer for $S$

Cost

  • $T(R) \times T(S)$ - very expensive!

Block-Based Nested Join Loop

  • We divide $R$ into segments of $M - 1$ blocks
  • and for each such segment we go through entire $S$

Algo

  • load each segment of $R$ into buffers $N_1, ..., N_{M - 1}$
  • for each block $B_S \in S$
    • load $B_S$ into $N_0$
    • for each tuple $t_R \in \cup_i N_i$
      • for each matching tuple $t_S \in N_0$: output $t_R \Join t_S$

Sort-Based Join

Essentially the same as #Sort-Based Set Union

  • but in this case we need to take care about duplicates that may be in both $R$ and $S$

Algo

  • Sort $R$ on matching attribute $Y$
  • Sort $S$ on matching attribute $Y$
  • Iterate Synchronously through $R$ and $S$
    • if $t_R.Y < t_S.Y$ then advance pointer $t_R$
    • if $t_R.Y > t_S.Y$ then advance pointer $t_S$
    • if $t_R.Y = t_S.Y$ then
      • for each pointers $t'_S$ with same $Y$ value that follow $t_S$ (including $t_S$ itself)
        • output $t_R \Join t'_S$
      • advance pointer $t_R$ and rewind $t'_S$ to $t_S$
      • (this way we join each tuple from $R$ that has value $Y$ with each tuple from $S$ that also has value $Y$)


Cost

  • usually depends of the number of tuples with equal values
  • worst case: all tuples have the same value for $Y$ - in this case the cost is $B(R) \times B(S)$
  • but joins are usually performed on foreign keys
    • i.e. tuples in $R$ have distinct values for $Y$
    • and for each tuple $t_R$ we have several (maybe 0) tuples $t_S$ - one-to-many relationship
    • so we don't need to rewind the pointer for $t_S$
    • in this case the cost analysis is similar to the #Sort-Based Set Union
      • sorting cost + $B(R) + B(S)$
    • it's also possible to optimize and save additional $B(R) + B(S)$ I/Os
      • sorting cost - $B(R) - B(S)$
  • NB: if there's a clustered index on $Y$ we don't need to sort it - it's already sorted


(Partition) Hash Join

  • Essentially the same as #Hash-Based Set Union
  • the only difference is that we hash the join attribute and not the whole tuple

Algo

  • partition $R$ by hashing $Y$ into buckets each with at most $M-1$ blocks
  • let $k$ be the number of buckets we got in result
  • partition $S$ by hashing $Y$ into $k$ buckets
  • let $R_i$ and $S_i$ be blocks of bucket #$i$ that ended up there because their $Y$ values have the same hash
    • a tuple $t_S \in S$ matches $t_R \in S$ $\iff$ there $\exists$ a bucket $i$ s.t. $t_R \in R_i$ and $t_S \in S_i$
  • we compute join by calculating $R_i \Join S_i$ for all $i$ using #One-Pass Join algorithm

Cost

  • same as for Hash-Based Set Union
  • $k = \lceil \log_{M - 1} B(R) - 1 \rceil$
  • total # of I/Os: $2B(R) \cdot k + 2B(S) \cdot k + B(R) + B(S)$


See also

Sources

Machine Learning Bookcamp: Learn machine learning by doing projects. Get 40% off with code "grigorevpc".

Share your opinion