TODO: see Scalable Data Analytics and Data Mining AIM3 (TUB) lectures

Canopy Clustering


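  • sample a point from the data set
  • form a group (a canopy) around this point
    • it contains the other points that are within some similarity threshold of the sampled point
  • remove the closest points (those within a second, tighter threshold) from the set of candidates
  • repeat until no candidates are left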
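
The steps above can be sketched in Python. This is only a minimal illustration, not a reference implementation: the function name `canopy_clustering`, the parameter names `t1` (loose threshold) and `t2` (tight threshold), the `seed` argument, and the use of Euclidean distance as the cheap similarity measure are all assumptions made for the example.

```python
import math
import random


def canopy_clustering(points, t1, t2, seed=0):
    """Group points into (possibly overlapping) canopies.

    t1 is the loose threshold, t2 the tight one (assumed t1 >= t2 >= 0).
    Returns a list of (center_index, member_indices) pairs.
    """
    if t2 < 0 or t1 < t2:
        raise ValueError("expected t1 >= t2 >= 0")

    rng = random.Random(seed)
    remaining = list(range(len(points)))  # indices still in the candidate set
    canopies = []

    while remaining:
        # sample a point
        center = rng.choice(remaining)

        # form a group around it: candidates within the loose threshold
        members = [
            i for i in remaining
            if math.dist(points[center], points[i]) <= t1
        ]
        canopies.append((center, members))

        # remove the closest points (within the tight threshold);
        # the center itself is at distance 0, so the loop always terminates
        remaining = [
            i for i in remaining
            if math.dist(points[center], points[i]) > t2
        ]

    return canopies
```

Points that lie between the two thresholds stay in the candidate set, which is why they may end up in more than one canopy.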


  • the result is a set of (potentially overlapping) groups
  • each group is much smaller than the original dataset

canopies reduce the computation time: expensive distance computations are needed only between points that share a canopy

