ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Inversion Count

Inversion Count

Sequence inversion

  • In a sequence $\pi = \langle a_0, …, a_t \rangle$ or elements $A = { a_i }$
  • a pair $(a_i, a_j)$ is an ‘‘inversion’’ if $i < j \land a_i > a_j$
  • the number of such inversions is the inversion number of sequence $\pi$
  • this is a measure of “sortedness” of sequence $\pi$

Two ranked vectors

  • An inversion in two rankings $r_1, r_2$ of the same variable $X$ is
  • a pair $(x_i, x_j) \ \ r_1(x_i) < r_1(x_j) \land r_2(x_i) > r_2(x_j)$ - it’s called a ‘‘pair-wise disagreement’’ between two ranking lists

Graphical Counting

we can represent two rankings as a Bipartite Graph $G = \langle N, S, E \rangle$

  • $N = r_1(X)$ and $E = r_2(X)$ being two disjoint set of nodes
  • $X$ is some variable, and $r_1$ and $r_2$ are different rankings of this variable
  • $E$ is set of edges $E = \Big{ \big(r_1(x), r_2(x) \big) \Big} $ i.e. corresponding elements of $X$ are connected in this graph

Counting:

  • '’bilayer drawing’’ of $G$ is when there are two parallel lines, edges of $N$ are drawn on one, and edges of $S$ are drawn on another
  • '’bilayer cross count’’ is a pairwise intersections edges of $N$ and $S$
  • bilayer cross count corresponds to the number of inversions when $N$ and $S$ are ranking vectors

Example:

  • $X = { A, B, C, D, E }$
  • two ranking $r_1 = \langle E, B, A, C, D \rangle$ and $r_2 = \langle B, E, C, D, A \rangle$
  • draw this is a bipartite graph and count the number of intersections
  • Image
  • so there are 3 inversions in these two rankings

Algorithms

A modification of Merge Sort can compute the # of inversions in $O(| N| \log |N|)$ |- see Merge Sort#Counting Inversions

See Also

Sources