17. Jain, A. K. (2010). Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 31(8), 651-
666.
18. Sculley, D. (2010). Web-scale k-means clustering. In Proceedings of the 19th International Conference
on World Wide Web (pp. 1177-1178).
19. Zaharia, M., Chowdhury, M., Franklin, M. J., Shenker, S., & Stoica, I. (2010). Spark: Cluster
computing with working sets. In Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing.
20. Everitt, B. S., Landau, S., Leese, M., & Stahl, D. (2011). Cluster Analysis. John Wiley & Sons.
21. Zhang, T., Ramakrishnan, R., & Livny, M. (1996). BIRCH: An efficient data clustering method for very
large databases. ACM SIGMOD Record, 25(2), 103-114.
22. McInnes, L., Healy, J., & Astels, S. (2017). HDBSCAN: Hierarchical density-based spatial clustering
of applications with noise. The Journal of Open Source Software, 2(11), 205.
23. Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters
in large spatial databases with noise. In Proceedings of the Second International Conference on
Knowledge Discovery and Data Mining (pp. 226-231).
24. Wang, W., Yang, J., & Muntz, R. (1997). STING: A statistical information grid approach to spatial data
mining. In Proceedings of the 23rd International Conference on Very Large Data Bases (pp. 186-195).
25. Agrawal, R., Gehrke, J., Gunopulos, D., & Raghavan, P. (1998). Automatic subspace clustering of high
dimensional data for data mining applications. ACM SIGMOD Record, 27(2), 94-105.
26. McLachlan, G., & Peel, D. (2000). Finite Mixture Models. John Wiley & Sons.
27. Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate generation. ACM
SIGMOD Record, 29(2), 1-12.
28. Wagstaff, K., Cardie, C., Rogers, S., & Schroedl, S. (2001). Constrained k-means clustering with
background knowledge. In Proceedings of the Eighteenth International Conference on Machine
Learning (pp. 577-584).
29. Johnson, S. C. (1967). Hierarchical clustering schemes. Psychometrika, 32(3), 241-254.
30. Hartigan, J. A. (1975). Clustering Algorithms. John Wiley & Sons.
31. Pelleg, D., & Moore, A. W. (2000). X-means: Extending K-means with efficient estimation of the
number of clusters. In Proceedings of the 17th International Conference on Machine Learning (pp. 727-
734).
32. Arthur, D., & Vassilvitskii, S. (2007). k-means++: The advantages of careful seeding. In Proceedings
of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 1027-1035).
33. Paparrizos, J., Das, A., & Lee, C. (2024). Contrastive learning for unsupervised clustering. Journal of
Machine Learning Research, 25, 1-28.
34. MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In
Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (Vol. 1, pp.
281-297).
35. Ankerst, M., Breunig, M. M., Kriegel, H. P., & Sander, J. (1999). OPTICS: Ordering points to identify
the clustering structure. In Proceedings of the 1999 ACM SIGMOD International Conference on
Management of Data (pp. 49-60).
36. Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern Classification. Wiley.
37. Jolliffe, I., & Cadima, J. (2016). Principal component analysis: A review and recent developments.
Philosophical Transactions of the Royal Society A, 374(2065), 20150202.
38. Catanzaro, B., Cantin, J., & Keutzer, K. (2008). Fast, parallel k-means using GPU hardware. In
Proceedings of the 2008 Joint Conference on Learning and Intelligent Optimization (pp. 177-185).