Comparative Study of Improving Classifiers Accuracies


Lakshmi Sreenivasa Reddy. D

Abstract

Outlier analysis is an essential task in data science: inconsistencies must be removed from the data in order to build a good model. Finding outliers in categorical data is a difficult task. To model a good classifier, it is necessary to eliminate outliers from the data. When modeling categorical data, the most infrequent records are treated as outliers, and such outliers disturb the entire dataset when building a good classifier. This paper presents a comparison of the accuracies of classifiers built using the normally distributed Outlier Factor by Infrequency (NOFI) and using OFI with different inputs. In modeling a classifier for categorical data, highly frequent records are the most useful and the most infrequent records are the least useful, so infrequent records are obstacles to modeling the classifiers. The experiments for this comparison are conducted on the Bank dataset with approximately 45,000 records and the Nursery dataset with approximately 14,000 records, both taken from the UCI ML Repository. NOFI requires no input parameters; it determines the number of outliers automatically. OFI requires the inputs to be supplied. However, a threshold value is needed to generate infrequent itemsets for both methods.
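To make the distinction between the two approaches concrete, the sketch below illustrates a frequency-based outlier factor for categorical records, an OFI-style selection where the caller supplies the number of outliers, and a NOFI-style selection where the count is derived automatically. The per-value infrequency score, the mean + 2σ cut-off, and the function names (`outlier_factors`, `ofi_outliers`, `nofi_outliers`) are assumptions for illustration, not the exact formulas from the paper.

```python
# Illustrative sketch only: assumes a simple per-attribute-value infrequency
# score and a mean + 2*sigma cut-off as the "normally distributed" criterion.
from collections import Counter
import statistics

def outlier_factors(records, threshold):
    """Score each categorical record by how infrequent its attribute values are.

    records   -- list of tuples of categorical values (one tuple per record)
    threshold -- support count below which a value is considered infrequent
    """
    n_attrs = len(records[0])
    # Frequency of each value within each attribute column.
    counts = [Counter(rec[i] for rec in records) for i in range(n_attrs)]
    factors = []
    for rec in records:
        score = 0.0
        for i, value in enumerate(rec):
            freq = counts[i][value]
            if freq < threshold:            # only infrequent values contribute
                score += (threshold - freq) / threshold
        factors.append(score)
    return factors

def ofi_outliers(records, threshold, n_outliers):
    """OFI-style selection: the caller must supply how many outliers to remove."""
    factors = outlier_factors(records, threshold)
    ranked = sorted(range(len(records)), key=lambda i: factors[i], reverse=True)
    return set(ranked[:n_outliers])

def nofi_outliers(records, threshold):
    """NOFI-style selection: the number of outliers is derived automatically,
    here by flagging scores beyond mean + 2 standard deviations (an assumption)."""
    factors = outlier_factors(records, threshold)
    cutoff = statistics.mean(factors) + 2 * statistics.pstdev(factors)
    return {i for i, f in enumerate(factors) if f > cutoff}
```

In both variants the threshold is still needed to decide which values count as infrequent; the difference is only in how many of the highest-scoring records are ultimately flagged and removed before training the classifier.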

Article Details

How to Cite
Lakshmi Sreenivasa Reddy, D. (2016). Comparative Study of Improving Classifiers Accuracies. International Journal on Recent and Innovation Trends in Computing and Communication, 4(1), 140–144. https://doi.org/10.17762/ijritcc.v4i1.1722
Section
Articles