Heap

a container that have keys

key property: at every node $x$

  • key[x] $\leqslant$ (or $\geqslant$) all keys of $x$'s children
  • therefore, the object at root must have min (max) value

operations

insert

  • adds new object
  • $O(\log n)$


extract-min(max)

  • extracts min (max) from heap
  • ties broken arbitrarily
  • $O(\log n)$


heapify

  • initialization: builds a heap


delete

  • $O(\log n)$ time


Implementation

  • it's a tree with [math]\approx \log_2 n[/math] levels
  • backed by array

Traversing the tree:

  • [math]\text{parent}(i) = i / 2[/math]
  • [math]\text{left}(i) = 2i[/math]
  • [math]\text{right}(i) = 2i + 1[/math]

insert(key $k$):

  • stick $k$ at the end of last level
  • bubble-up $k$ until heap property is restored

extract-min():

  • delete root
  • move last leaf to be new root
  • bubble-down until heap property is restored
  • (always swap with the smallest child)

Java implementation:

Applications

  • general: fast way to do repeated minimum (maximum) computations
  • priority queues, "event manager"

Heap sort

  • put everythin into heap
  • repeatedly extract-min until the heap is empty


Median maintenance

  • given: a sequence of numbers $x_1, ..., x_n$, one-by-one
  • goal: at each time step $i$, compute the median of $\{x_1, ..., x_i\}$
  • solution:
    • create two heaps:
      • $H_\text{low}$ (with extract-max operation),
      • $H_\text{high}$ (extract-min)
  • key idea: maintain invariant that $\approx \cfrac{i}{2}$ smallest (largest) numbers are in $H_\text{low}$ ($H_\text{high}$)
  • so on $20$th step, in $H_\text{low}$ would be $10$th order statistics, and in $H_\text{high}$ - $11$th
  • keep the heaps balanced! (so they have the same number of elements)


Implementation: [1]


Speeding up the Dijkstra's algorithm

  • naive $\Theta(nm)$
  • with heaps $O(m \log n)$


See also

Sources

Machine Learning Bookcamp: Learn machine learning by doing projects. Get 40% off with code "grigorevpc".

Share your opinion