ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Distance Functions

Distance

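A metric (or distance function) is a generalization of geometric distance (i.e., Euclidean distance): it assigns a non-negative number to every pair of objects and satisfies symmetry, the identity of indiscernibles (the distance is zero only between an object and itself), and the triangle inequality.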
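
As a rough illustration, the sketch below computes Euclidean distance and checks the metric axioms on a small sample of points. The helper names (`euclidean`, `squared_euclidean`, `satisfies_metric_axioms`) and the tolerance `tol` are assumptions made for this example and do not come from any particular library. Passing the check on a finite sample is only evidence that a function behaves like a metric, not a proof.

```python
import math
from itertools import product


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_euclidean(x, y):
    """Squared Euclidean distance (illustrative example of a non-metric)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check the metric axioms for d on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return False  # non-negativity violated
        if (d(x, y) <= tol) != (x == y):
            return False  # identity of indiscernibles violated
        if abs(d(x, y) - d(y, x)) > tol:
            return False  # symmetry violated
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False  # triangle inequality violated
    return True


plane = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
line = [(0.0,), (1.0,), (2.0,)]

print(euclidean(plane[0], plane[1]))                     # 5.0
print(satisfies_metric_axioms(euclidean, plane))         # True
print(satisfies_metric_axioms(squared_euclidean, line))  # False
```

Squared Euclidean distance fails the check because of the triangle inequality: on the line, the squared distance from 0 to 2 is 4, which exceeds the sum 1 + 1 obtained by going through 1.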

Direct similarity measures, computed between pairs of points, are not always reliable for high-dimensional clustering (see Guha1999).

Similarity is the inverse notion of distance: the more similar two objects are, the smaller the distance between them.
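
One common way to make this relationship concrete is to take a similarity bounded in [0, 1] and define the distance as one minus the similarity. The sketch below does this for Jaccard similarity on sets. The function names are chosen for this example, and treating two empty sets as fully similar is an assumed convention, not something stated in the sources above.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


def jaccard_distance(a, b):
    """Distance derived from the similarity as 1 - similarity."""
    return 1.0 - jaccard_similarity(a, b)


x = {"a", "b", "c"}
y = {"b", "c", "d"}

print(jaccard_similarity(x, y))  # 0.5 (2 shared items out of 4 in total)
print(jaccard_distance(x, y))    # 0.5
print(jaccard_distance(x, x))    # 0.0 (identical sets)
```

Not every similarity converts into a true metric this way. For instance, one minus cosine similarity can violate the triangle inequality, so the result of such a conversion should be checked before it is used where a metric is required.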

Non-metric

Resources

  • Strehl, Alexander, Joydeep Ghosh, and Raymond Mooney. “Impact of similarity measures on web-page clustering.” 2000. [http://strehl.com/download/strehl-aaai00.pdf]
  • Guha, Sudipto, Rajeev Rastogi, and Kyuseok Shim. “ROCK: A robust clustering algorithm for categorical attributes.” 1999. [http://www.cacs.louisiana.edu/~jyoon/grad/adb/References/clustering/ROCK-clus99icde.pdf]

Source

  • http://en.wikipedia.org/wiki/Metric_%28mathematics%29