A Comprehensive Review of Sentiment Analysis on Indian Regional Languages: Techniques, Challenges, and Trends

Sunil D. Kale
Rajesh Prasad
Girish P. Potdar
Parikshit N. Mahalle
Deepak T. Mane
Gopal D. Upadhye

Abstract

Sentiment analysis (SA) is the process of identifying the emotion expressed in a text. It helps determine the opinion, attitude, and tone of a text, categorizing it as positive, negative, or neutral. SA is now widely used because social media has given more and more people a platform to share their thoughts. It benefits industries around the globe, such as finance, advertising, marketing, travel, and hospitality. Although most work in this field has focused on global languages like English, the importance of SA in local languages has been widely recognized in recent years. This has led to considerable research on Indian regional languages. This paper comprehensively reviews SA in the following major Indian regional languages: Marathi, Hindi, Tamil, Telugu, Malayalam, Bengali, Gujarati, and Urdu. Furthermore, it presents techniques, challenges, findings, recent research trends, and future scope for improving the accuracy of results.

How to Cite
Kale, S. D., Prasad, R., Potdar, G. P., Mahalle, P. N., Mane, D. T., & Upadhye, G. D. (2023). A Comprehensive Review of Sentiment Analysis on Indian Regional Languages: Techniques, Challenges, and Trends. International Journal on Recent and Innovation Trends in Computing and Communication, 11(9s), 93–110. https://doi.org/10.17762/ijritcc.v11i9s.7401
