Effective Feature Selection Methods for User Sentiment Analysis using Machine Learning


Sofiya S. Mujawar
Pawan R. Bhaladhare


Text classification assigns a piece of text to one or more predetermined categories or labels. A machine learning model is trained on a labeled dataset of texts and their corresponding labels, and then learns to predict the labels of new, unseen texts. Feature selection is a significant step in text classification because it identifies the features or words in the text that are most relevant for predicting the label, such as specific keywords or phrases, or the frequency or placement of certain words. Focusing the model on the features most likely to be useful for classification can improve its performance. Feature selection also reduces the dimensionality of the dataset, making the model more efficient and easier to interpret. This paper presents a method for extracting aspect terms from product reviews that combines feature selection based on the Gini index and information gain with machine learning classifiers. In the proposed method, referred to as wRMR, the Gini index and information gain are used for feature selection, and machine learning classifiers are then used to extract aspect terms from the product reviews. The proposed method is evaluated on a set of customer testimonials, and the findings indicate that it outperforms the traditional method at extracting aspect terms. It is also compared with current state-of-the-art methods, and the comparison shows that the proposed method achieves superior performance.
Overall, the proposed method provides a promising solution for aspect term extraction and can also be applied to other natural language processing tasks.
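
The two selection criteria named in the abstract can be illustrated with a minimal sketch. It scores each term by how much splitting the documents on that term's presence reduces label impurity, using Shannon entropy (information gain) or Gini impurity. This is an assumption-laden illustration, not the wRMR method: the abstract does not give the exact Gini index formulation or how the two criteria are combined, and the function names and toy reviews below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(term, docs, labels, impurity):
    """Drop in impurity when documents are split on presence of `term`.

    With `impurity=entropy` this is information gain; with `impurity=gini`
    it is the Gini impurity reduction (an assumed stand-in for the paper's
    Gini index criterion).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    weighted = (len(with_term) / n) * impurity(with_term) \
        + (len(without) / n) * impurity(without)
    return impurity(labels) - weighted


def select_top_k(docs, labels, k, impurity):
    """Rank every vocabulary term by split gain and keep the k best."""
    vocab = sorted(set().union(*docs))
    scored = [(split_gain(t, docs, labels, impurity), t) for t in vocab]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [t for _, t in scored[:k]]


# Toy reviews (invented for illustration), tokenized into sets of words.
docs = [
    {"battery", "great"},
    {"screen", "great"},
    {"battery", "poor"},
    {"screen", "poor"},
]
labels = ["pos", "pos", "neg", "neg"]

top_by_info_gain = select_top_k(docs, labels, 2, entropy)
top_by_gini = select_top_k(docs, labels, 2, gini)
```

In a full pipeline, the selected terms would define the reduced feature space on which a classifier is trained; the choice of classifier and of `k` is left open here because the abstract does not specify either.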


How to Cite
Mujawar, S. S., and P. R. Bhaladhare. “Effective Feature Selection Methods for User Sentiment Analysis Using Machine Learning”. International Journal on Recent and Innovation Trends in Computing and Communication, vol. 11, no. 3s, Mar. 2023, pp. 37-45, https://ijritcc.org/index.php/ijritcc/article/view/6153.

