Concept Decomposition
Clustering can be used for dimensionality reduction of text data
- the approach acts as a Term Clustering and a Document Clustering technique at the same time
The procedure:
- cluster the documents
- the most frequent terms in each centroid form the basis
- the basis vectors are almost orthogonal: the top terms of one cluster should rarely appear in the other clusters
- then represent each document in terms of this basis
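
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the toy count matrix, the function names `spherical_kmeans` and `concept_decomposition`, and the parameter defaults are all made up for the example. It clusters unit-length document vectors by cosine similarity, uses the full normalized centroids as the basis (rather than only their most frequent terms), and obtains each document's coordinates with a least-squares fit.

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Cluster unit-length rows of X (n_docs x n_terms) by cosine similarity.

    Returns (labels, concepts), where concepts is (k x n_terms) and each row
    is a centroid rescaled to unit length.
    """
    rng = np.random.default_rng(seed)
    # Assumption: initialise from k distinct documents chosen at random.
    concepts = X[rng.choice(X.shape[0], size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmax(X @ concepts.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the previous vector for an empty cluster
            centroid = members.sum(axis=0)
            concepts[j] = centroid / np.linalg.norm(centroid)
    labels = np.argmax(X @ concepts.T, axis=1)
    return labels, concepts

def concept_decomposition(X, concepts):
    """Coordinates Z (n_docs x k) minimising ||X - Z @ concepts||."""
    Z, *_ = np.linalg.lstsq(concepts.T, X.T, rcond=None)
    return Z.T

# Hypothetical toy data: 4 documents over 6 terms, two rough topics.
counts = np.array([
    [3, 2, 0, 0, 1, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 1],
    [0, 1, 0, 2, 3, 2],
], dtype=float)
X = counts / np.linalg.norm(counts, axis=1, keepdims=True)

labels, concepts = spherical_kmeans(X, k=2)
Z = concept_decomposition(X, concepts)

# Indices of the two heaviest terms in each concept vector.
top_terms = np.argsort(-concepts, axis=1)[:, :2]
# A small dot product between concept vectors means a nearly orthogonal basis.
overlap = float(concepts[0] @ concepts[1])
```

Here `Z` is the reduced representation: each document is described by `k` numbers instead of one weight per term. The `overlap` value gives a quick check of how close to orthogonal the basis is.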
References
- Dhillon, Inderjit S., and Dharmendra S. Modha. “Concept decompositions for large sparse text data using clustering.” (2001).
- https://www.google.ru/?q=Concept+Decomposition
Sources
- Aggarwal, Charu C., and ChengXiang Zhai. “A survey of text clustering algorithms.” Mining Text Data. Springer US, 2012.