K-Medoids
K-Medoids is a variation of K-Means clustering algorithm
Algorithm:
- use a set of points from the original data set as anchors ("medoids")
- then build clusters around them
- each item is assigned to its closest representation from the data set
- iterative approach
Objective
- $J$: average similarity of each item to its centroid
Disadvantages
- require many iterations to converge
- so it's slow: it's slow to compute $J$
- doesn't always work well for sparse data
- e.g. for text, not many docs have lots of terms in common
- so similarities between such pairs are small and noisy
- a single medoid may not contain all needed information to build a cluster around it
Sources