Sources Index
Only papers I read and used as sources (or small books that don't deserve a separate wiki page)
- ordered by first author
A
- Aggarwal, Charu C., and ChengXiang Zhai. "A survey of text clustering algorithms." Mining Text Data. 2012. Document Clustering, K-Means, K-Medoids, Co-Clustering, Two-Phase Document Clustering, Non-Negative Matrix Factorization, Semi-Supervised Clustering, Topic Models, Probabilistic LSA, Term Strength, Term Contribution, Stop Words
B
C
D
E
- Elsayed, Tamer, Jimmy Lin, and Douglas W. Oard. "Pairwise document similarity in large collections with MapReduce." 2008. [6] Inverted Index
- Ertöz, Levent et al. "Finding clusters of different sizes, shapes, and densities in noisy, high dimensional data." 2003. [7] Document Clustering, DBSCAN, SNN Clustering, Euclidean Distance, Curse of Dimensionality, Chameleon Clustering, CURE Clustering, ROCK Clustering
F
G
H
- Hopcroft, John, and Ravindran Kannan. "Foundations of Data Science." 2014. Power Iteration
I
J
K
- Kalman, Dan. "A singularly valuable decomposition: the SVD of a matrix." 1996. [11] SVD
- Koll, Matthew B. "WEIRD: An approach to concept-based information retrieval." 1979. Latent Semantic Analysis
- Korenius, Tuomo, Jorma Laurikkala, and Martti Juhola. "On principal component analysis, cosine and Euclidean measures in information retrieval." 2007. [12] Principal Component Analysis, Latent Semantic Analysis, Distance Functions, Cosine Similarity, Euclidean Distance
- Kristianto, et al. "Extracting definitions of mathematical expressions in scientific papers." 2012. [13] Mathematical Definition Extraction, Math-Aware POS Tagging
- Kristianto, et al. "Extracting Textual Descriptions of Mathematical Expressions in Scientific Papers." 2014. [14] Mathematical Definition Extraction
L
- Landauer, T. et al. "An introduction to latent semantic analysis." 1998. [15] Latent Semantic Analysis
- Larsen, Bjornar et al. "Fast and effective text mining using linear-time document clustering." 1999. [16] Document Clustering
- Lee, K., Lee, Y., et al. "Parallel data processing with MapReduce: a survey." 2012. [17] Hadoop, MapReduce, Hadoop MapReduce
- Lee, K., Jalali, A., and Dasdan, A. "Real time bid optimization with smooth budget delivery in online advertising." 2013. [18] Budget Pacing
- Li, Yong H., et al. "Classification of text documents." 1998. [19] Term Clustering
- Liu, Tao, et al. "An evaluation on feature selection for text clustering." 2003. [20] Term Contribution
M
- Manning, C., and Schütze, H. "Foundations of statistical natural language processing." 1999. Collocation Extraction
N
O
P
Q
R
S
- Salton, et al. "A vector space model for automatic indexing." 1975. [27] Vector Space Model
- Salton, Buckley. "Term-weighting approaches in automatic text retrieval." 1988. [28] TF-IDF
- Schelter, Sebastian, et al. "Efficient Sample Generation for Scalable Meta Learning." 2014. [29] Meta Learning
- Schöneberg et al. "POS Tagging and its Applications for Mathematics." 2014. Math-Aware POS Tagging
- Sculley, David. "Web-scale k-means clustering." 2010. [30] K-Means
- Sebastiani, Fabrizio. "Machine learning in automated text categorization." 2002. [31] Document Classification, Term Clustering
- Slaney, Malcolm, and Michael Casey. "Locality-sensitive hashing for finding nearest neighbors [lecture notes]." 2008. [32] Locality Sensitive Hashing, Euclidean LSH
- Steinbach, Michael, et al. "A comparison of document clustering techniques." 2000. Document Clustering, K-Means
- Strang, Gilbert. "The fundamental theorem of linear algebra." 1993. [33] SVD
T
U
V
W
X
Y
Z