ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

CURE Clustering

CURE Algorithm for Clustering

Use a set of representative points to find non-global clusters

  • these points capture the geometry and shape of clusters

Choose points

  • 2 farthest away points
  • 3 and so on - furthest away from previous ones
  • this procedure guarantees that the points are well distributed

Then shrink the points towards the centroids by factor of $\alpha$

CURE eliminates outliers by discarding small slowly growing clusters

  • but it has a notion of center - not all shapes has natural center

== References ==

  • Guha, Sudipto, Rajeev Rastogi, and Kyuseok Shim. “Cure: an efficient clustering algorithm for large databases.” (2001) [http://www.cs.sfu.ca/CourseCentral/459/han/papers/guha98.pdf]

Sources

  • Ertöz, Levent, Michael Steinbach, and Vipin Kumar. “Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data.” 2003. [http://static.msi.umn.edu/rreports/2003/73.pdf]
  • http://en.wikipedia.org/wiki/CURE_data_clustering_algorithm