Line 178: Line 178:
  
  
== $i$th order statistics ==
+
== Quick Select ==
problem
+
Quickselect is an algorithm for computing $i$th order statistics
 +
 
 +
Problem
 
* input: given $i$th element and array $A$
 
* input: given $i$th element and array $A$
 
* goal: find $i$th order statistics (i.e. $i$th smallest element)
 
* goal: find $i$th order statistics (i.e. $i$th smallest element)
Line 213: Line 215:
 
* if $j < i$
 
* if $j < i$
 
** return RSelect($R$ side of $A$, $n-j$, $i-j$)
 
** return RSelect($R$ side of $A$, $n-j$, $i-j$)
 
  
 
Best pivot - the median
 
Best pivot - the median
Line 219: Line 220:
 
* $T(n) = O(n)$
 
* $T(n) = O(n)$
  
=== Implementation ===
+
 
 +
==== Implementation ====
  
 
  def r_select(A, l, r, i):
 
  def r_select(A, l, r, i):
Line 238: Line 240:
 
     return r_select(A.copy(), 0, len(A), i)
 
     return r_select(A.copy(), 0, len(A), i)
  
If there are repeating elements, use HoarePartition instead:
 
  
  def hoare_r_select(A, l, r, i):
+
=== Repeating Elements ===
    n = r - l
+
If there are repeating elements:
     if n == 0:
+
* the HoarePartition algorithm (as it is implemented above) does not work well with such RSelect:
         return A[l]
+
* It does not place the pivot on its position, unlike the Partition algorithm
+
* It only ensures the invariant that left is $\leqslant p$ and right is $\geqslant p$.
     pi = hoare_partition(A, l, r)
+
 
     if i == pi:
+
Solution:
         return A[pi]
+
* We do "partial QuickSort": sort only the part where $i$ is
    if pi < i:
+
* After the input is partially sorted, we know that the $i$th element is correctly placed
        return hoare_r_select(A, pi + 1, r, i)
+
* So we just take it
 +
 
 +
==== Implementation ====
 +
 
 +
  def partial_quicksort(A, l, r, i):
 +
     if l >= r:
 +
         return
 +
     q = hoare_partition(A, l, r)
 +
     if i <= q:
 +
         partial_quicksort(A, l, q, i)
 
     else:
 
     else:
         return hoare_r_select(A, l, pi, i)
+
         partial_quicksort(A, q + 1, r, i)
 
   
 
   
 
  def ith_order_statistic(A, i):
 
  def ith_order_statistic(A, i):
     return hoare_r_select(A.copy(), 0, len(A) - 1, i)
+
     A = A.copy()
 +
    partial_quicksort(A, 0, len(A) - 1, i)
 +
    return A[i]
 +
 
  
 
== See also ==
 
== See also ==
Line 263: Line 276:
 
* [[Algorithms Design and Analysis Part 1 (coursera)]]
 
* [[Algorithms Design and Analysis Part 1 (coursera)]]
 
* https://stackoverflow.com/questions/40368543/quicksort-hoares-partitioning-with-duplicate-values
 
* https://stackoverflow.com/questions/40368543/quicksort-hoares-partitioning-with-duplicate-values
 +
* https://en.wikipedia.org/wiki/Quickselect
 +
* https://oneraynyday.github.io/algorithms/2016/06/16/QuickSelectSort/
  
 
[[Category:Algorithms]]
 
[[Category:Algorithms]]
 
[[Category:Sorting]]
 
[[Category:Sorting]]

Revision as of 14:33, 22 April 2018

Quick Sort

Sorting algorithm based on Divide and Conquer computational paradigm

Advantages:

  • $O(n \log n)$
  • operates at place

Main idea - partition around a pivot:

  • pick an element
  • rearrange array so
    • left from pivot => less than pivot
    • right from pivot => greater than pivot

done in $O(n)$ time

partition:

Algorithm

QuickSort(array $A$):

  • $n = \text{len}(A)$
  • if $n = 1$ return
  • $p$ = ChosePivot($A$)
  • partition $A$ around $p$
  • QuickSort($A$ left from $p$)
  • QuickSort($A$ right from $p$)

Partition($A$):

  • // input: $A[l..r]$
  • $p = A[l]$
  • $i = i + 1$
  • for $j = l + 1$ to $r$
    • if $A[j] < p$
      • swap $j$ and $i$
      • $i = i + 1$
  • swap $l$ and $i - 1$

running time $O(n)$ where $n = r - l + 1$

Pivot

Quality

  • Running time depends on the quality of pivot
  • good quality - always divides into 2 equal halves (matches the median)
  • i.e. leads to $\Theta(n)$
  • Random pivot is good enough on average cases

So, possible pivots

  • 1st element
  • last element
  • random

Implementation

Java

public void qsort(int[] input, int left, int right) {
    int n = right - left;
    if (n <= 1) {
        return;
    }

    int pivotIndex = partition(input, left, right);

    qsort(input, left, pivotIndex);
    qsort(input, pivotIndex + 1, right);
}

public int partition(int[] input, int left, int right) {
    int pivot = input[left];
    int i = left + 1;

    for (int j = left + 1; j < right; j++) {
        if (input[j] < pivot) {
            swap(input, j, i);
            i++;
        }
    }

    swap(input, left, i - 1);
    return i - 1;
}

public void swap(int[] input, int j, int i) {
    int tmp = input[j];
    input[j] = input[i];
    input[i] = tmp;
}

Python

def pivot(A, l, r):
    return l

def partition(A, l, r):
    pi = pivot(A, l, r)
    p = A[pi]
    A[l], A[pi] = A[pi], A[l]

    i = l
    
    for j in range(i + 1, r):
        if A[j] < p:
            i = i + 1
            A[i], A[j] = A[j], A[i]
    
    A[l], A[i] = A[i], A[l]
    return i

def qs(A, l, r):
    n = r - l

    if n <= 1:
        return

    i = partition(A, l, r)
    
    qs(A, l, i)
    qs(A, i + 1, r)

def quicksort(A):
    return qs(A, 0, len(A))

Hoare Partition

QuickSort implementation from above does not behave well when there repeating elements.

Hoare Partitioning scheme fixes it:

  • Idea: for pivot $p$ split array $A$ into two parts: "$\leqslant p$" and "$\geqslant p$" (not "$>p$" and "$<p$" like previously)
  • Previously we moved both indices $i$ and $j$ from left to right. Now move $i$ from left, and $j$ from right

HoarePartition($A$, $l$, $r$):

  • choose pivot $p$ (e.g. $p = A[l]$)
  • $i = l$, $j = r$
  • repeat:
    • while $A[i] < p$, increment $i$
    • while $A[j] > p$, decrement $j$
    • if $i \geqslant j$, stop and return $j$
    • swap $A[i]$ and $A[j]$
    • increment $i$, decrement $j$

The QuickSort procedure stays the same

Implementation

def hoare_partition(A, l, r):
    pi = pivot(A, l, r)
    p = A[pi]

    i = l
    j = r
    
    while True:
        while A[i] < p:
            i = i + 1

        while A[j] > p:
            j = j - 1
        
        if i >= j:
            return j
        
        A[i], A[j] = A[j], A[i]
        i = i + 1
        j = j - 1


def qs(A, l, r):
    if l >= r:
       return
    q = hoare_partition(A, l, r)
    qs(A, l, q)
    qs(A, q + 1, r)

def quicksort(A):
    return qs(A, 0, len(A) - 1)


Quick Select

Quickselect is an algorithm for computing $i$th order statistics

Problem

  • input: given $i$th element and array $A$
  • goal: find $i$th order statistics (i.e. $i$th smallest element)


Reduction to sorting

  • $O(n \log n)$
  • apply merge sort
  • return $i$th element
  • can we do better? yes!


modification for QuickSort:

  • recall Partition procedure
  • pivot is on its position!


how to find $i$th order?

  • suppose need 5th element in $A$ of len 10
  • after partition, pivot in on 3rd position
  • so we need 2nd (5-3) statistics on the $R$ side


Algorithm

RSelect(array $A$, len $n$, order $i$)

  • if $n = 1$ return $A[1]$
  • choose pivot $p$ from $A$ at random
  • partition $A$ around $p$
  • $j$ = new index of $p$
  • if $j = i$: return $p$ // lucky case
  • if $j > i$
    • return RSelect($L$ side of $A$, $j-1$, $i$)
  • if $j < i$
    • return RSelect($R$ side of $A$, $n-j$, $i-j$)

Best pivot - the median

  • $T(n) \leqslant T(\frac{n}{2}) + O(n)$
  • $T(n) = O(n)$


Implementation

def r_select(A, l, r, i):
    n = r - l
    if n == 0:
        return A[l]

    pi = partition(A, l, r)

    if i == pi:
        return A[pi]
    if pi < i:
        return r_select(A, pi + 1, r, i)
    else:
        return r_select(A, l, pi, i)

def ith_order_statistic(A, i):
    return r_select(A.copy(), 0, len(A), i)


Repeating Elements

If there are repeating elements:

  • the HoarePartition algorithm (as it is implemented above) does not work well with such RSelect:
  • It does not place the pivot on its position, unlike the Partition algorithm
  • It only ensures the invariant that left is $\leqslant p$ and right is $\geqslant p$.

Solution:

  • We do "partial QuickSort": sort only the part where $i$ is
  • After the input is partially sorted, we know that the $i$th element is correctly placed
  • So we just take it

Implementation

def partial_quicksort(A, l, r, i):
    if l >= r:
        return
    q = hoare_partition(A, l, r)
    if i <= q:
        partial_quicksort(A, l, q, i)
    else:
        partial_quicksort(A, q + 1, r, i)

def ith_order_statistic(A, i):
    A = A.copy()
    partial_quicksort(A, 0, len(A) - 1, i)
    return A[i]


See also

Sources