Document Type : Review Article

Authors

Department of Electrical Engineering, Majlesi Branch, Islamic Azad University, Isfahan, Iran.

Abstract

Ensemble clustering (EC) methods have become more popular in recent years. In these methods, the outputs of several primary clustering algorithms are taken as inputs and combined into a single clustering in order to achieve better results. In this paper, we consider three hierarchical methods (single-link, average-link, and complete-link) as the primary clustering algorithms and combine their results. The combination is based on a correlation matrix. The basic algorithms were combined in pairs and as a triple, and each combination was evaluated. The IMDB film dataset was clustered based on its existing features. The Calinski-Harabasz (CH), Silhouette, and Dunn index criteria were used to evaluate the results. These criteria evaluate clustering quality by calculating intra-cluster and inter-cluster distances. The CH index had its highest value when all three basic clusterings were combined. Our results show that EC can achieve better results and produce clusters with higher robustness and accuracy.
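The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the "correlation matrix" behaves like a co-association matrix (the fraction of base clusterings that place two objects in the same cluster), that the consensus is obtained by average-link clustering on one minus that matrix, and that a handful of toy 2-D points stand in for the IMDB features. The function names `agglomerative`, `co_association`, and `dunn_index` are hypothetical helpers introduced only for this illustration.

```python
from itertools import combinations
from math import dist


def agglomerative(D, k, linkage):
    """Bottom-up clustering on a distance matrix D until k clusters remain."""
    clusters = [[i] for i in range(len(D))]
    agg = {"single": min, "complete": max,
           "average": lambda v: sum(v) / len(v)}[linkage]
    while len(clusters) > k:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: agg([D[i][j] for i in clusters[p[0]]
                                      for j in clusters[p[1]]]))
        clusters[a] += clusters.pop(b)  # a < b, so index a stays valid
    labels = [0] * len(D)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def co_association(partitions):
    """Fraction of partitions in which objects i and j share a cluster."""
    n = len(partitions[0])
    return [[sum(p[i] == p[j] for p in partitions) / len(partitions)
             for j in range(n)] for i in range(n)]


def dunn_index(D, labels):
    """Smallest inter-cluster distance divided by largest intra-cluster one."""
    pairs = list(combinations(range(len(labels)), 2))
    inter = min(D[i][j] for i, j in pairs if labels[i] != labels[j])
    intra = max(D[i][j] for i, j in pairs if labels[i] == labels[j])
    return inter / intra


# Toy data (assumption): two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9)]
D = [[dist(p, q) for q in points] for p in points]

# Three base clusterings, one per linkage criterion.
base = [agglomerative(D, 2, m) for m in ("single", "average", "complete")]

# Combine them: co-association similarity -> distance -> consensus clustering.
C = co_association(base)
consensus = agglomerative([[1 - c for c in row] for row in C], 2, "average")

print(consensus, dunn_index(D, consensus))
```

Pairwise combinations can be tried in the same way by passing only two of the three base partitions to `co_association`. A higher Dunn value indicates more compact and better separated clusters.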

Keywords
