Line 238: Line 238:
 
     return r_select(A.copy(), 0, len(A), i)
 
     return r_select(A.copy(), 0, len(A), i)
  
If there are repeating elements, we need to use HoarePartition instead  
+
If there are repeating elements, use HoarePartition instead:
 +
 
 +
def hoare_r_select(A, l, r, i):
 +
    n = r - l
 +
    if n == 0:
 +
        return A[l]
 +
 +
    pi = hoare_partition(A, l, r)
 +
    if i == pi:
 +
        return A[pi]
 +
    if pi < i:
 +
        return hoare_r_select(A, pi + 1, r, i)
 +
    else:
 +
        return hoare_r_select(A, l, pi, i)
 +
 +
def ith_order_statistic(A, i):
 +
    return hoare_r_select(A.copy(), 0, len(A) - 1, i)
  
 
== See also ==
 
== See also ==

Revision as of 13:50, 21 April 2018

Quick Sort

Sorting algorithm based on Divide and Conquer computational paradigm

Advantages:

  • $O(n \log n)$
  • operates at place

Main idea - partition around a pivot:

  • pick an element
  • rearrange array so
    • left from pivot => less than pivot
    • right from pivot => greater than pivot

done in $O(n)$ time

partition:

Algorithm

QuickSort(array $A$):

  • $n = \text{len}(A)$
  • if $n = 1$ return
  • $p$ = ChosePivot($A$)
  • partition $A$ around $p$
  • QuickSort($A$ left from $p$)
  • QuickSort($A$ right from $p$)

Partition($A$):

  • // input: $A[l..r]$
  • $p = A[l]$
  • $i = i + 1$
  • for $j = l + 1$ to $r$
    • if $A[j] < p$
      • swap $j$ and $i$
      • $i = i + 1$
  • swap $l$ and $i - 1$

running time $O(n)$ where $n = r - l + 1$

Pivot

Quality

  • Running time depends on the quality of pivot
  • good quality - always divides into 2 equal halves (matches the median)
  • i.e. leads to $\Theta(n)$
  • Random pivot is good enough on average cases

So, possible pivots

  • 1st element
  • last element
  • random

Implementation

Java

public void qsort(int[] input, int left, int right) {
    int n = right - left;
    if (n <= 1) {
        return;
    }

    int pivotIndex = partition(input, left, right);

    qsort(input, left, pivotIndex);
    qsort(input, pivotIndex + 1, right);
}

public int partition(int[] input, int left, int right) {
    int pivot = input[left];
    int i = left + 1;

    for (int j = left + 1; j < right; j++) {
        if (input[j] < pivot) {
            swap(input, j, i);
            i++;
        }
    }

    swap(input, left, i - 1);
    return i - 1;
}

public void swap(int[] input, int j, int i) {
    int tmp = input[j];
    input[j] = input[i];
    input[i] = tmp;
}

Python

def pivot(A, l, r):
    return l

def partition(A, l, r):
    pi = pivot(A, l, r)
    p = A[pi]
    A[l], A[pi] = A[pi], A[l]

    i = l
    
    for j in range(i + 1, r):
        if A[j] < p:
            i = i + 1
            A[i], A[j] = A[j], A[i]
    
    A[l], A[i] = A[i], A[l]
    return i

def qs(A, l, r):
    n = r - l

    if n <= 1:
        return

    i = partition(A, l, r)
    
    qs(A, l, i)
    qs(A, i + 1, r)

def quicksort(A):
    return qs(A, 0, len(A))

Hoare Partition

QuickSort implementation from above does not behave well when there repeating elements.

Hoare Partitioning scheme fixes it:

  • Idea: for pivot $p$ split array $A$ into two parts: "$\leqslant p$" and "$\geqslant p$" (not "$>p$" and "$<p$" like previously)
  • Previously we moved both indices $i$ and $j$ from left to right. Now move $i$ from left, and $j$ from right

HoarePartition($A$, $l$, $r$):

  • choose pivot $p$ (e.g. $p = A[l]$)
  • $i = l$, $j = r$
  • repeat:
    • while $A[i] < p$, increment $i$
    • while $A[j] > p$, decrement $j$
    • if $i \geqslant j$, stop and return $j$
    • swap $A[i]$ and $A[j]$
    • increment $i$, decrement $j$

The QuickSort procedure stays the same

Implementation

def hoare_partition(A, l, r):
    pi = pivot(A, l, r)
    p = A[pi]

    i = l
    j = r
    
    while True:
        while A[i] < p:
            i = i + 1

        while A[j] > p:
            j = j - 1
        
        if i >= j:
            return j
        
        A[i], A[j] = A[j], A[i]
        i = i + 1
        j = j - 1


def qs(A, l, r):
    if l >= r:
       return
    q = hoare_partition(A, l, r)
    qs(A, l, q)
    qs(A, q + 1, r)

def quicksort(A):
    return qs(A, 0, len(A) - 1)


$i$th order statistics

problem

  • input: given $i$th element and array $A$
  • goal: find $i$th order statistics (i.e. $i$th smallest element)


Reduction to sorting

  • $O(n \log n)$
  • apply merge sort
  • return $i$th element
  • can we do better? yes!


modification for QuickSort:

  • recall Partition procedure
  • pivot is on its position!


how to find $i$th order?

  • suppose need 5th element in $A$ of len 10
  • after partition, pivot in on 3rd position
  • so we need 2nd (5-3) statistics on the $R$ side


Algorithm

RSelect(array $A$, len $n$, order $i$)

  • if $n = 1$ return $A[1]$
  • choose pivot $p$ from $A$ at random
  • partition $A$ around $p$
  • $j$ = new index of $p$
  • if $j = i$: return $p$ // lucky case
  • if $j > i$
    • return RSelect($L$ side of $A$, $j-1$, $i$)
  • if $j < i$
    • return RSelect($R$ side of $A$, $n-j$, $i-j$)


Best pivot - the median

  • $T(n) \leqslant T(\frac{n}{2}) + O(n)$
  • $T(n) = O(n)$

Implementation

def r_select(A, l, r, i):
    n = r - l
    if n == 0:
        return A[l]

    pi = partition(A, l, r)

    if i == pi:
        return A[pi]
    if pi < i:
        return r_select(A, pi + 1, r, i)
    else:
        return r_select(A, l, pi, i)

def ith_order_statistic(A, i):
    return r_select(A.copy(), 0, len(A), i)

If there are repeating elements, use HoarePartition instead:

def hoare_r_select(A, l, r, i):
    n = r - l
    if n == 0:
        return A[l]

    pi = hoare_partition(A, l, r)
    if i == pi:
        return A[pi]
    if pi < i:
        return hoare_r_select(A, pi + 1, r, i)
    else:
        return hoare_r_select(A, l, pi, i)

def ith_order_statistic(A, i):
    return hoare_r_select(A.copy(), 0, len(A) - 1, i)

See also

Sources