Survey on Hinglish to English Translation and Classification Techniques

Nicole D’Souza
Devarsh Patel
Jigyashu Saravta
Ashwini Rao

Abstract

Code-mixing is the use of two or more languages within a single sentence and is widespread in multilingual communities. It is particularly prevalent in social media text, where the popularity of social networking sites produces a substantial amount of unstructured text. Hinglish, i.e. code-mixed Hindi and English, occurs frequently in everyday language use in India. Hence, a translation process is required to help monolingual users and to help language processing models comprehend such text. In this paper, we study effective techniques for the classification and translation tasks and identify gaps and challenges in current research. After comparing a few existing methodologies for machine translation, we propose a framework that showed an improvement in the translation task over previous methods.

How to Cite
D’Souza, N., Patel, D., Saravta, J., & Rao, A. (2023). Survey on Hinglish to English Translation and Classification Techniques. International Journal on Recent and Innovation Trends in Computing and Communication, 11(7), 149–155. https://doi.org/10.17762/ijritcc.v11i7.7840
