Handwritten Devanagari Text Recognition using Single Classifier Approach with VSPCA Scheme

Main Article Content

Vijay More
Prashant Yawalkar
Madan Kharat

Abstract

In this research paper we used individual classifier approach for Handwritten Devanagari text recognition. We experimented different categorical classifiers namely   Random Forest Classifier (RFC), Support Vector Machine (SVM), K Nearest Neighbor Classifier (KNN), Logistic Regression Classifier (LogRegr), Decision Tree Classifier (DTree). Seven different feature sets are used namely Eccentricity, Euler Number, Horizontal Histogram, Vertical Histogram, HOG Features, LBP Features, and Statistical Features. The experimentation is carried out on 9434 different characters whose features are extracted from 220 handwritten image documents from PHDIndic_11 dataset. We deduced and implemented a unique scheme namely VSPCA scheme. VSPCA is Vectorization, Scaling, and Principal Component Analysis carried out on all feature sets before being given for model training. We obtained varied accuracies using all these five classifiers on all these six feature sets in which 99.52% highest accuracy is observed.

Article Details

How to Cite
More, V. ., Yawalkar, P. ., & Kharat, M. . (2023). Handwritten Devanagari Text Recognition using Single Classifier Approach with VSPCA Scheme. International Journal on Recent and Innovation Trends in Computing and Communication, 11(11s), 18–31. https://doi.org/10.17762/ijritcc.v11i11s.8066
Section
Articles

References

Sk Md Obaidullah, Supratik Kundu Das, Kaushik Roy, "A System for Handwritten Script Identification from Indian Document", Researchgate: Journal of Pattern Recognition Research 8 (2013) 1-12, Researchgate, 2013, pp.1-12.

Mallikarjun Hangarge, Santosh K.C., Rajmohan Pardeshi, "Directional Discrete Cosine Transform for Handwritten Script Identification", 12th International Conference on Document Analysis and Recognition (ICDAR), 2013, IEEE, 2013, pp.344-348.

Deepti Khanduja, Neeta Nain, and Subhash Panwar, "A Hybrid Feature Extraction Algorithm for Devanagari Script", ACM Trans. Asian Low-Resour. Lang. Inf. Process., Vol. 15, No. 1, Article 2, ACM, 2015, pp.1-10.

K. Roy, S. Kundu Das, Sk Md Obaidullah, "Script Identification from Handwritten Document", Third National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, 2011, Researchgate, 2011, pp.1-5.

Hiremath P. S., Shivashankar S., Jagdeesh D. Pujari, V. Mouneswara, "Script identification in a handwritten document image using texture features", 2010 IEEE 2nd International Advance Computing Conference, IEEE Explore, IEEE, 2010, pp.110-114.

Dr. Antino Marelino. (2014). Customer Satisfaction Analysis based on Customer Relationship Management. International Journal of New Practices in Management and Engineering, 3(01), 07 - 12. Retrieved from http://ijnpme.org/index.php/IJNPME/article/view/26

G. G. Rajput, Anita H. B, "Handwritten Script Recognition using DCT and Wavelet Features at Block Level", RTIPPR 2010, IJCA - ResearchGate, 2010, pp.158-163.

Mallikarjun Hangarge, B.V.Dhandra, "Offline handwritten script identification in document images", IJCA 2010 Vol.4,No.6, IJCA, 2010, pp.6-10.

Sukalpa Chanda, Srikanta Pal, Katrin Franke, Umapada Pal, "Two-stage Approach for Word-wise Script Identification", 2009 10th ICDAR, IEEE, 2009, pp.926-930.

Akram Baig, M. M. . (2023). An Evaluation of Major Fault Tolerance Techniques Used on High Performance Computing (HPC) Applications. International Journal of Intelligent Systems and Applications in Engineering, 11(3s), 320–328. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2696

K. Roy, K. Majumder, "Trilingual Script Separation of Handwritten Postal Document", Sixth Indian Conference on Computer Vision, Graphics & Image Processing, 2008. ICVGIP '08. , IEEE, 2008, pp.693-700.

Sandhya Arora, Debotosh Bhattacharjee, Mita Nasipuri, Dipak Kumar Basu, Mahantapas Kundu, "Combining Multiple Feature Extraction Techniques for Handwritten Devnagari Character Recognition", 2008 IEEE Region 10 Colloquium and the Third ICIIS, Kharagpur, INDIA December 8-10., IEEE, 2008, pp.342-1-342-6.

B.V.Dhandra, Mallikarjun Hangarge, "Global and Local Features Based Handwritten Text Words and Numerals Script Identification", International Conference on Computational Intelligence and Multimedia Applications 2007, IEEE, 2007, pp.471-475.

Gopal Datt Joshi, Saurabh Garg, Jayanthi Sivaswamy, "A generalised framework for script identification", IJDAR (2007), Springer, 2007, pp.55-68.

More Vijay, Kharat M U, Gumaste S V, "Study of different features and classification techniques for recognition of handwritten Devanagari text", International Journal of Engineering & Technology (IJET-UAE), ISSN-1055-1059, Vol 7, No. 4, Issue 19, DOI:10.14419/ijet.v7i4.19.28285, 2018, pp.1055-1059.

Vijay More, Madan Kharat, Shyamrao Gumaste, "Segmentation Of Devanagari Handwritten Text Using Thresholding Approach", International Journal of Scientific & Technology Research (IJSTR), IJSTR, 2020, pp.6594-6605.

Basaligheh, P. (2021). A Novel Multi-Class Technique for Suicide Detection in Twitter Dataset. Machine Learning Applications in Engineering Education and Management, 1(2), 13–20. Retrieved from http://yashikajournals.com/index.php/mlaeem/article/view/14

Vijay More, Madan Kharat, "Segmentation of Lines and Words of Handwritten Devanagari Text using Connected Components with Statistics Method", Journal of Scientific Research, Institute of Science, Banaras Hindu University, Varanashi, UP, JSR, 2022, DOI:10.37398/JSR.2022.660224, pp.179-188.

Sk Md Obaidullah, Chayan Halder, K. C. Santosh, Nibaran Das, Kaushik Roy, “PHDIndic_11:Page-level handwritten document image dataset of 11 official Indic scripts for script identification”, in Multimedia Tools and Applications (MTAP), Springer, Volume 77, Issue 2, 2017, doi:10.1007/s11042-017-4373-y, pp.1643–1678.

Juan Garcia, Guðmundsdóttir Anna, Johansson Anna, Maria Jansen, Anna Wagner. Optimal Decision Making in Supply Chain Management using Machine Learning. Kuwait Journal of Machine Learning, 2(4). Retrieved from http://kuwaitjournals.com/index.php/kjml/article/view/209

Gholamy, Afshin; Kreinovich, Vladik; and Kosheleva, Olga, "Why 70/30 or 80/20 Relation Between Training and Testing Sets: A Pedagogical Explanation" (2018). Departmental Technical Reports (CS). 1209. https://scholarworks.utep.edu/cs_techrep/1209.

Heutte, Laurent; Paquet, Thierry; Moreau, Jean Vincent; Lecourtier, Yves; Olivier, Christian, "A structural / statistical feature based vector for handwritten character recognition" Pattern Recognition Letters 19, 1998, Elsevir, pp.629–641.

A. Jain and D. Zongker, "Feature selection: evaluation, application, and small sample performance," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 2, pp. 153-158, Feb. 1997, doi: 10.1109/34.574797.