Protecting Children from Harmful Audio Content: Automated Profanity Detection From English Audio in Songs and Social-Media

T Senthil Murugan; V Sai Pavan Kalyan

doi:10.17762/ijritcc.v11i6.6770


Published: Jul 10, 2023


Keywords:

Machine Learning, Classification, TF-IDF, BERT, DOC2VEC, Profanity Detection

T Senthil Murugan

Dept. of Information Technology, Kakatiya Institute of Technology & Science, Warangal, India

V Sai Pavan Kalyan

Dept. of Information Technology, Kakatiya Institute of Technology & Science, Warangal, India

Abstract

This paper presents a novel approach for the automated detection of profanity in English audio songs using machine learning techniques. A primary drawback of existing systems is that they are confined to textual data. The proposed method combines feature extraction techniques with machine learning algorithms to identify profanity in audio songs. Specifically, it employs three popular feature extraction techniques: term frequency–inverse document frequency (TF-IDF), Bidirectional Encoder Representations from Transformers (BERT), and Doc2Vec. TF-IDF captures the frequency and importance of each word in the song, while BERT extracts contextualized word representations that can capture more nuanced meanings. To capture the semantic meaning of words in audio songs, the study also explores the Doc2Vec model, a neural network-based approach to feature extraction. The study uses Open Whisper, an open-source machine learning library, to develop and implement the approach. A dataset of English audio songs was used to evaluate the proposed method. The results showed that both the TF-IDF and BERT models outperformed the Doc2Vec model in accuracy when identifying profanity in English audio songs. The proposed approach has potential applications in identifying profanity in various forms of audio content, including songs, audio clips, social media, reels, and shorts.
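To make the TF-IDF step concrete, the following is a minimal sketch of how transcribed lyrics might be turned into TF-IDF vectors and classified. It is not the authors' implementation. It assumes the audio has already been transcribed to text by an upstream speech-to-text step (not shown), uses a smoothed IDF formula chosen for illustration, and substitutes a simple nearest-neighbour rule for the unnamed classifier. The training sentences, the mild placeholder words, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical labelled transcripts (1 = profane, 0 = clean). Mild placeholder
# words stand in for real profanity; this is not data from the study.
TRAIN = [
    ("you darn fool get the heck out", 1),
    ("what the heck is this darn mess", 1),
    ("the sun is shining on the river", 0),
    ("i love walking home in the rain", 0),
]


def tokenize(text):
    # Lowercase and keep runs of letters and apostrophes; deliberately simple.
    return re.findall(r"[a-z']+", text.lower())


def fit_idf(documents):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    # Relative term frequency times IDF; terms unseen at fit time are dropped.
    counts = Counter(t for t in tokenize(text) if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(labelled):
    # Fit IDF weights on the training texts and vectorize each example.
    idf = fit_idf([text for text, _ in labelled])
    vectors = [(tfidf_vector(text, idf), label) for text, label in labelled]
    return idf, vectors


def predict(text, idf, vectors):
    # Stand-in classifier (an assumption): label of the most similar example.
    query = tfidf_vector(text, idf)
    best = max(vectors, key=lambda pair: cosine(query, pair[0]))
    return best[1]
```

Calling `train(TRAIN)` and then `predict(transcript, idf, vectors)` returns 1 or 0 for a new transcript. In a fuller pipeline the same interface could be backed by BERT or Doc2Vec embeddings in place of `tfidf_vector`, which is the comparison the abstract describes.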

How to Cite

Murugan, T. S., & Kalyan, V. S. P. (2023). Protecting Children from Harmful Audio Content: Automated Profanity Detection From English Audio in Songs and Social-Media. International Journal on Recent and Innovation Trends in Computing and Communication, 11(6), 39–44. https://doi.org/10.17762/ijritcc.v11i6.6770

Issue

Vol. 11 No. 6 (2023): June (2023) Issue

Section

Articles


Contact Us:

Auricle Global Society of Education and Research

Y-18-A, Near Sanskar Play School, Sudarshana Nagar,

Bikaner, Rajasthan (India). Pin 334003

Email: editor@ijritcc.org
