Efficient Text Classification of 20 Newsgroup Dataset using Classification Algorithm

Karishma Borkar, Prof. Nutan Dhande


Text classification is the task of automatically sorting a set of documents into categories from a predefined set. Classification is a data mining technique used to predict group membership for data instances within a given dataset. It assigns data to different classes subject to a set of constraints. Instead of the traditional feature selection techniques used for text document classification, we present a new model based on probability and the overall class frequency of a term. The Naive Bayesian classifier is based on Bayes' theorem with independence assumptions between predictors. A Naive Bayesian model is easy to build and requires no complicated iterative parameter estimation, which makes it particularly useful for large datasets. The paper shows that the new probabilistic interpretation of tf×idf term weighting may lead to a better understanding of statistical ranking mechanisms.
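To make the Naive Bayesian approach concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It is not the model proposed in the paper: it uses ordinary per-class term counts with add-one (Laplace) smoothing, which is an assumption made here for illustration. The function names (`tokenize`, `train_naive_bayes`, `predict`) and the four toy documents are hypothetical; only the two category names are taken from the 20 Newsgroups label set.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Estimate class priors and per-class term counts in a single pass."""
    class_doc_counts = Counter(labels)
    term_counts = defaultdict(Counter)  # class -> term -> occurrence count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "priors": {c: n / len(docs) for c, n in class_doc_counts.items()},
        "term_counts": term_counts,
        "totals": {c: sum(term_counts[c].values()) for c in class_doc_counts},
        "vocab": vocab,
    }


def predict(model, text):
    """Return the class maximizing log P(c) + sum over terms of log P(t | c)."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, prior in model["priors"].items():
        score = math.log(prior)
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # terms never seen in training carry no evidence
            count = model["term_counts"][label][token]
            # Add-one smoothing keeps unseen (term, class) pairs above zero.
            score += math.log((count + 1) / (model["totals"][label] + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus; the real dataset has about 20,000 posts in 20 groups.
docs = [
    "the team won the hockey game",
    "great goal in the hockey match",
    "the shuttle reached orbit around the moon",
    "nasa launched a new space shuttle",
]
labels = ["rec.sport.hockey", "rec.sport.hockey", "sci.space", "sci.space"]

model = train_naive_bayes(docs, labels)
print(predict(model, "hockey goal"))    # rec.sport.hockey
print(predict(model, "shuttle orbit"))  # sci.space
```

Training is one counting pass over the corpus with no iterative optimization, which illustrates why the abstract describes the model as easy to build for large datasets. Working in log space avoids numerical underflow when many small term probabilities are multiplied.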

Borkar, K., & Dhande, N. (2017). Efficient Text Classification of 20 Newsgroup Dataset using Classification Algorithm. International Journal on Recent and Innovation Trends in Computing and Communication, 5(6), 1236–. https://doi.org/10.17762/ijritcc.v5i6.934