Chameleon Clustering

Combines initial partition of data with hierarchical clustering techniques it modifies clusters dynamically


  • Generate a KNN graph
  • because it's local, it reduces influence of noise and outliers
  • provides automatic adjustment for densities


  • use METIS: a graph partitioning algorithm
  • get equally-sized groups of well-connected vertices
  • this produces "sub-clusters" - something that is a part of true clusters


  • recombine sub-clusters
  • combine two clusters if
    • they are relatively close
    • they are relatively interconnected
  • so they are merged only if the new cluster will be similar to the original ones
  • i.e. when "self-similarity" is preserved (similar to the join operation in Scatter/Gather)


  • Curse of Dimensionality makes similarity functions behave poorly
  • distances become more uniform as dimensionality grows
  • and this makes clustering difficult

Similarity between two point of high dimensionality can be misleading

  • often points may be similar even though they should belong to different clusters



  • Karypis, George, Eui-Hong Han, and Vipin Kumar. "Chameleon: Hierarchical clustering using dynamic modeling." (1999). [1]


  • Ertöz, Levent, Michael Steinbach, and Vipin Kumar. "Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data." 2003. [2]
Share your opinion