Attribute Selection Algorithm with Clustering based Optimization Approach based on Mean and Similarity Distance


Rajasekhar Kaseebhotla
K. Raghava Rao
Mallikarjuna Rao

Abstract

High-dimensional data with hundreds or thousands of attributes imposes a heavy computational workload. Attributes that have no meaningful influence on class prediction add to that load during classification. The goal of this article is to reduce the size of high-dimensional data through attribute selection, and thereby lessen the computational load, by choosing attribute subsets that still represent all of the attributes. The process therefore has two stages: filtering out superfluous attributes, and choosing a single attribute to stand in for each group of similar, otherwise redundant attributes. Numerous studies on attribute selection, including backward and forward selection, have been undertaken. Based on our experiments and the resulting classification accuracy, we recommend attribute selection by k-means-based PSO clustering. Related attributes are likely to fall in the same cluster, while irrelevant attributes are not identified in any cluster. The Credit Approval, Ionosphere, Annealing, Madelon, Isolet, and Multiple Attributes datasets are employed alongside two other high-dimensional datasets. Both of the latter include a class label for each data point. Our tests demonstrate that attribute selection using k-means clustering can produce a usable subset of attributes, and that doing so yields classification accuracy above 80%.
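The representative-selection stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain k-means with squared Euclidean distance on standardized attributes, omits the PSO refinement (which the abstract names but does not specify), and assigns every attribute to some cluster, so it does not perform the filtering stage. The function names `select_attributes`, `_sqdist`, and `_farthest_first`, and the farthest-first initialization, are assumptions made here for reproducibility.

```python
import numpy as np

def _sqdist(F, C):
    # Squared Euclidean distance from every row of F to every row of C.
    return ((F[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)

def _farthest_first(F, k):
    # Deterministic initialization (an assumption of this sketch): start at
    # attribute 0, then repeatedly add the attribute farthest from those chosen.
    idx = [0]
    d = ((F - F[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(d.argmax())
        idx.append(nxt)
        d = np.minimum(d, ((F - F[nxt]) ** 2).sum(axis=1))
    return F[idx]

def select_attributes(X, k, n_iter=100):
    """Cluster the columns (attributes) of X with k-means and return one
    representative column index per non-empty cluster, plus cluster labels."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # constant attributes: avoid division by zero
    F = ((X - X.mean(axis=0)) / std).T       # each row is one standardized attribute
    centroids = _farthest_first(F, k)
    for _ in range(n_iter):
        labels = _sqdist(F, centroids).argmin(axis=1)
        updated = centroids.copy()
        for j in range(k):
            if np.any(labels == j):
                updated[j] = F[labels == j].mean(axis=0)   # cluster mean
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment against the final centroids.
    d = _sqdist(F, centroids)
    labels = d.argmin(axis=1)
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            # Representative = the attribute closest to its cluster mean.
            selected.append(int(members[d[members, j].argmin()]))
    return sorted(selected), labels
```

Calling `select_attributes(X, k)` on a samples-by-attributes array returns the chosen column indices, which can then be used as `X[:, selected]` before training a classifier. Because the distance is Euclidean on standardized columns, strongly negatively correlated attributes land far apart; a correlation-based similarity would treat them as redundant instead.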

Article Details

How to Cite
Kaseebhotla, R., Rao, K. R., & Rao, M. (2023). Attribute Selection Algorithm with Clustering based Optimization Approach based on Mean and Similarity Distance. International Journal on Recent and Innovation Trends in Computing and Communication, 11(8s), 585–594. https://doi.org/10.17762/ijritcc.v11i8s.7241
