ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

K-Medoids

K-Medoids

K-Medoids is a variation of K-Means clustering algorithm

Algorithm:

  • use a set of points from the original data set as anchors (“medoids”)
  • then build clusters around them
  • each item is assigned to its closest representation from the data set
  • iterative approach

Objective

  • $J$: average similarity of each item to its centroid

Disadvantages

  • require many iterations to converge
  • so it’s slow: it’s slow to compute $J$
  • doesn’t always work well for sparse data
    • e.g. for text, not many docs have lots of terms in common
    • so similarities between such pairs are small and noisy
    • a single medoid may not contain all needed information to build a cluster around it

Sources

  • http://en.wikipedia.org/wiki/K-medoids
  • Aggarwal, Charu C., and ChengXiang Zhai. “A survey of text clustering algorithms.” Mining Text Data. Springer US, 2012. [http://ir.nmu.org.ua/bitstream/handle/123456789/144935/d1784ebed3eab2708026b202b2b65309.pdf?sequence=1#page=90]