Concept Decomposition

We can use clustering for Dimensionality Reduction of text data.

Concept Decomposition acts as both a Term Clustering and a Document Clustering technique at the same time.

First, cluster the documents. Then:

the most frequent terms in each centroid form the basis
the basis vectors are almost orthogonal, because the top terms of one cluster should rarely appear in the other clusters
each document is then represented in terms of this basis
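
The steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the reference implementation: it assumes spherical k-means (cosine similarity on unit-length document vectors) for the clustering step and a least-squares fit to express each document in the concept basis. The function names, the `top_terms` parameter, and the toy count matrix are all hypothetical and chosen only for this example.

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Cluster unit-norm rows of X by cosine similarity.

    Returns the cluster label of each row and the unit-norm centroids.
    """
    rng = np.random.default_rng(seed)
    # Assumption: initialise the centroids with k distinct random documents.
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmax(X @ centroids.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                total = members.sum(axis=0)
                norm = np.linalg.norm(total)
                if norm > 0:
                    centroids[j] = total / norm
    labels = np.argmax(X @ centroids.T, axis=1)
    return labels, centroids

def concept_decomposition(X, k, top_terms=None, seed=0):
    """Represent each document (row of X) in a basis of k concept vectors."""
    labels, concepts = spherical_kmeans(X, k, seed=seed)
    if top_terms is not None:
        # Keep only the highest-weight terms of each centroid as the basis.
        for j in range(k):
            keep = np.argsort(concepts[j])[-top_terms:]
            truncated = np.zeros_like(concepts[j])
            truncated[keep] = concepts[j][keep]
            concepts[j] = truncated / np.linalg.norm(truncated)
    # Least squares: find Z that minimises ||X - Z @ concepts||.
    Z_t, *_ = np.linalg.lstsq(concepts.T, X.T, rcond=None)
    return Z_t.T, concepts, labels

# Toy document-term counts (rows: documents, columns: terms); made up for illustration.
vocab = ["cat", "dog", "pet", "stock", "market", "trade"]
counts = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 1, 1, 2],
], dtype=float)
X = counts / np.linalg.norm(counts, axis=1, keepdims=True)

Z, concepts, labels = concept_decomposition(X, k=2, top_terms=3)
print("cluster labels:", labels)
print("concept overlap:", float(concepts[0] @ concepts[1]))
print("reduced documents:")
print(Z)
```

Each row of `Z` is a document expressed in `k` dimensions instead of one dimension per term. The printed overlap is the dot product of the two concept vectors, which gives a quick check of how close to orthogonal the basis is.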

References

Dhillon, Inderjit S., and Dharmendra S. Modha. “Concept decompositions for large sparse text data using clustering.” (2001). [http://www.cs.utexas.edu/users/inderjit/public_papers/concept_mlj.pdf]

Sources

Aggarwal, Charu C., and ChengXiang Zhai. “A survey of text clustering algorithms.” Mining Text Data. Springer US, 2012. [http://ir.nmu.org.ua/bitstream/handle/123456789/144935/d1784ebed3eab2708026b202b2b65309.pdf?sequence=1#page=90]