ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Canopy Clustering

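repeat until no candidate points remain

  • sample a point from the remaining candidates
  • form a group (a canopy) around this point
    • it contains the other points that are within some similarity threshold of the sampled point
  • remove the sampled point and the points closest to it (those within a second, tighter threshold) from the candidates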

result

  • a set of (potentially overlapping) groups
  • each group is much smaller than the original dataset
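
The loop and its result can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the names `canopy_clustering`, `t1` (loose threshold) and `t2` (tight threshold) are chosen here for the example, Euclidean distance stands in for whatever cheap similarity measure is actually used, and each canopy is formed only from the points that are still candidates.

```python
import random
from math import dist


def canopy_clustering(points, t1, t2, seed=0):
    """Group points into (possibly overlapping) canopies.

    t1: loose threshold, candidates within t1 of the center join the canopy.
    t2: tight threshold (0 <= t2 <= t1), candidates within t2 of the center
        are removed from the candidate pool.
    Returns a list of (center_index, member_indices) pairs.
    """
    if not 0 <= t2 <= t1:
        raise ValueError("expected 0 <= t2 <= t1")
    rng = random.Random(seed)
    remaining = list(range(len(points)))  # indices that are still candidates
    canopies = []
    while remaining:
        # Sample a point from the remaining candidates.
        center = rng.choice(remaining)
        # Distance from the sampled point to every remaining candidate.
        d = {i: dist(points[center], points[i]) for i in remaining}
        # Form the group: all candidates within the loose threshold.
        members = [i for i in remaining if d[i] <= t1]
        canopies.append((center, members))
        # Remove the closest points (the center itself has distance 0).
        remaining = [i for i in remaining if d[i] > t2]
    return canopies
```

Calling `canopy_clustering(points, t1=2.0, t2=1.0)` on a list of coordinate tuples returns one `(center, members)` pair per canopy. Points that lie between `t2` and `t1` from a center stay in the candidate pool, which is why the groups may overlap.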

canopies reduce the computation time

  • expensive exact distances need to be computed only between points that share a canopy

sources

  • http://en.wikipedia.org/wiki/Canopy_clustering_algorithm
  • http://www.kamalnigam.com/papers/canopy-kdd00.pdf