ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Concept Decomposition

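Clustering can be used for Dimensionality Reduction of text data.

First, cluster the documents. Then:

  • the cluster centroids, characterized by their most frequent terms, serve as the basis
  • the centroids are almost orthogonal: the dominant terms of one cluster should rarely appear in the other clusters
  • each document is then represented by its coordinates in this basis, so its dimensionality drops from the vocabulary size to the number of clusters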
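
The steps above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the reference implementation: it uses spherical k-means (k-means with cosine similarity on unit-length document vectors, the variant used by Dhillon and Modha), takes the normalized centroids as the basis, and obtains document coordinates by least squares. The function names `spherical_kmeans` and `concept_decomposition`, the toy vocabulary, and the count matrix are all hypothetical and chosen just for the example.

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Cluster unit-length rows of X by cosine similarity.

    Returns (labels, concepts), where concepts has shape (k, n_terms)
    and each row is a centroid rescaled to unit length.
    """
    rng = np.random.default_rng(seed)
    # Initialize the concept vectors with k distinct documents.
    concepts = X[rng.choice(X.shape[0], size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each document to the most similar concept vector.
        labels = np.argmax(X @ concepts.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                c = members.sum(axis=0)
                concepts[j] = c / np.linalg.norm(c)
    # Recompute the assignment against the final concept vectors.
    labels = np.argmax(X @ concepts.T, axis=1)
    return labels, concepts

def concept_decomposition(X, concepts):
    """Least-squares coordinates Z such that Z @ concepts approximates X."""
    Z, *_ = np.linalg.lstsq(concepts.T, X.T, rcond=None)
    return Z.T

# Toy document-term counts (hypothetical data): 3 "pet" and 3 "finance" documents.
vocab = ["cat", "dog", "pet", "stock", "market", "trade"]
counts = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
], dtype=float)

# Normalize each document to unit length so dot products are cosine similarities.
X = counts / np.linalg.norm(counts, axis=1, keepdims=True)

labels, concepts = spherical_kmeans(X, k=2)
Z = concept_decomposition(X, concepts)  # shape: (n_docs, k)

# Highest-weight terms of each concept vector.
for j, concept in enumerate(concepts):
    top = [vocab[i] for i in np.argsort(-concept)[:3]]
    print(f"concept {j}: {top}")

# Gram matrix of the concept vectors; close to identity when they are near-orthogonal.
print(np.round(concepts @ concepts.T, 3))
print(Z.shape)
```

Each row of `Z` is the reduced representation of one document: 2 coordinates instead of 6 term weights. Least squares is used rather than plain dot products because the basis is only approximately orthogonal in general; when the concept vectors are exactly orthonormal the two coincide.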

References

  • Dhillon, Inderjit S., and Dharmendra S. Modha. “Concept decompositions for large sparse text data using clustering.” (2001). [http://www.cs.utexas.edu/users/inderjit/public_papers/concept_mlj.pdf]

Sources

  • Aggarwal, Charu C., and ChengXiang Zhai. “A survey of text clustering algorithms.” Mining Text Data. Springer US, 2012. [http://ir.nmu.org.ua/bitstream/handle/123456789/144935/d1784ebed3eab2708026b202b2b65309.pdf?sequence=1#page=90]