Integration of MFCC Extraction and LSTM Algorithm on PYNQ-Z2 for Enhanced Audio Analysis

Sheetal U. Bhandari, Deepti Khurge, Rajani PK, Varsha Bendre, Ashwini S. Shinde

Abstract

The need for Speech Emotion Recognition (SER) is growing, since researchers have found it difficult to interpret human emotions from speech data. SER is an interesting yet challenging task in human-computer interaction (HCI). How much an SER application benefits depends on the feature extraction technique and the classification model it uses. Deep Learning has made a great impact on audio, image, video, EEG and ECG classification. The characteristics of the speech signal and the classification model together affect how well the SER application performs. This paper describes the deployment of a Deep Learning algorithm on an FPGA-based board, the PYNQ-Z2. The MFCC feature extraction technique and an LSTM model for classifying human emotion are implemented on the board. Emotion prediction is carried out using the LEDs and buttons on the board.
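To make the MFCC step named in the abstract concrete, the following is a minimal NumPy sketch of a textbook MFCC pipeline (pre-emphasis, framing, windowing, power spectrum, mel filterbank, log, DCT). It is not the authors' implementation. The abstract gives no parameters, so the 16 kHz sample rate, 25 ms frames, 10 ms hop, 26 mel filters and 13 coefficients are assumed values, and the function names `mel_filterbank` and `mfcc` are illustrative.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix. All defaults are assumptions."""
    # Pre-emphasis boosts high frequencies before analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Log mel filterbank energies; the small constant avoids log(0).
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # DCT-II (unnormalized) decorrelates the log energies; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T

# Usage: one second of a synthetic 440 Hz tone stands in for a speech clip.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone)
print(features.shape)  # (98, 13)
```

Each row of `features` is one time frame, so the matrix is a sequence of 13-dimensional vectors, which is the kind of input a recurrent classifier such as an LSTM consumes one step at a time.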

Article Details

How to Cite
Sheetal U. Bhandari, et al. (2023). Integration of MFCC Extraction and LSTM Algorithm on PYNQ-Z2 for Enhanced Audio Analysis. International Journal on Recent and Innovation Trends in Computing and Communication, 11(10), 1177–1185. https://doi.org/10.17762/ijritcc.v11i10.8659
Section
Articles
Author Biography

Sheetal U. Bhandari¹, Deepti Khurge¹, Rajani PK¹, Varsha Bendre¹, Ashwini S. Shinde¹

¹Department of Electronics and Telecommunication Engineering, Pimpri Chinchwad College of Engineering, Pune, India.

sheetal.bhandari@pccoepune.org, dipti.khurge@pccoepune.org, rajani.pk@pccoepune.org, varsha.bendre@pccoepune.org, ashwinik09@gmail.com