A Comparative Study of Text Classification Methods: An Experimental Approach


Rupali P. Patil, R. P. Bhavsar, B. V. Pawar

Abstract

Text classification is the process of assigning a text document to one or more predefined categories based on its contents. This paper reports experiments with our implementations of three popular machine learning algorithms and compares their performance on the categorization of sample English text documents. Three well-known classifiers, namely Naïve Bayes (NB), Centroid Based (CB) and K-Nearest Neighbor (KNN), were implemented and tested on the same dataset, R-52, chosen from the Reuters-21578 corpus. For performance evaluation, classical metrics such as precision, recall, and micro and macro F1-measures were used. For statistical comparison of the three classifiers, the Randomized Block Design method with a t-test was applied. The experimental results showed that the Centroid Based classifier outperformed the other two, with a Micro F1 measure of 97%. NB and KNN also performed satisfactorily on the test dataset, with Micro F1 measures of 91% and 89%, respectively.
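
To illustrate the difference between the micro and macro F1-measures mentioned above, the following minimal Python sketch computes both for single-label, multi-class predictions. It is not the authors' implementation: the function name f1_scores, the helper _f1, and the toy labels are assumptions made only for illustration. Micro F1 pools true positives, false positives and false negatives over all classes before computing F1, whereas macro F1 averages the per-class F1 values so that every class counts equally.

```python
from collections import Counter


def _f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions.

    Hypothetical helper for illustration; not taken from the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    # Micro: pool the counts over all classes, then compute one F1.
    micro = _f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(_f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy example with made-up predictions (illustrative only, not R-52 results).
y_true = ["earn", "earn", "earn", "acq", "grain"]
y_pred = ["earn", "earn", "acq", "acq", "earn"]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro_f1:.3f}, macro F1 = {macro_f1:.3f}")
```

On a skewed category distribution, the two measures can diverge noticeably, because micro F1 is dominated by the frequent classes while macro F1 is pulled down by poorly classified rare ones.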

Article Details

How to Cite
Patil, R. P., Bhavsar, R. P., & Pawar, B. V. (2016). A Comparative Study of Text Classification Methods: An Experimental Approach. International Journal on Recent and Innovation Trends in Computing and Communication, 4(3), 517–523. https://doi.org/10.17762/ijritcc.v4i3.1930