Recurrent Neural Networks for End-to-End Speech Recognition: A Comparative Analysis


Gauri Dhande, Prof. Zaheed Shaikh


Speech recognition is the task of correctly transcribing spoken utterances by machine. Deep learning is an emerging approach for representing sequential data such as speech. Deep learning frameworks such as Recurrent Neural Networks (RNNs) have successfully replaced traditional speech models such as Hidden Markov Models and Gaussian mixtures, boosting recognition performance considerably. RNNs, used for sequence-to-sequence modeling, are a powerful tool for sequence labeling. End-to-end methods such as Connectionist Temporal Classification (CTC) are used with RNNs for speech recognition. This paper presents a comparative analysis of RNNs for end-to-end speech recognition. Models are trained with different RNN architectures, namely simple RNN cells (SRNNs), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRUs), as well as bidirectional variants of each, and compared on the LibriSpeech corpus.
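The abstract names the three recurrent cell types being compared but does not give their update equations. As a rough illustration of the recurrences involved, the sketch below implements scalar (hidden-size-1) versions of the SRNN, GRU, and LSTM steps in plain Python; the weights are hypothetical placeholders standing in for the learned weight matrices of a real model, and the shared weight per gate is a simplification for readability.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Scalar (hidden-size-1) cell updates illustrating the recurrences
# the compared architectures use. The weights are hypothetical
# placeholders; real models learn separate weight matrices per gate.

def srnn_step(x, h, w_x=0.5, w_h=0.5, b=0.0):
    # Simple RNN: h_t = tanh(W_x x_t + W_h h_{t-1} + b)
    return math.tanh(w_x * x + w_h * h + b)

def gru_step(x, h, w=0.5):
    # GRU: update gate z and reset gate r control how much of the
    # previous hidden state is kept versus overwritten.
    z = sigmoid(w * x + w * h)              # update gate
    r = sigmoid(w * x + w * h)              # reset gate
    h_cand = math.tanh(w * x + w * (r * h)) # candidate state
    return (1.0 - z) * h + z * h_cand

def lstm_step(x, h, c, w=0.5):
    # LSTM: input, forget, and output gates plus a separate cell
    # state c, which carries long-range information.
    i = sigmoid(w * x + w * h)  # input gate
    f = sigmoid(w * x + w * h)  # forget gate
    o = sigmoid(w * x + w * h)  # output gate
    c_new = f * c + i * math.tanh(w * x + w * h)
    return o * math.tanh(c_new), c_new

# Run each cell over a toy input sequence.
seq = [0.5, -0.3, 0.8]
h_srnn = h_gru = h_lstm = c_lstm = 0.0
for x in seq:
    h_srnn = srnn_step(x, h_srnn)
    h_gru = gru_step(x, h_gru)
    h_lstm, c_lstm = lstm_step(x, h_lstm, c_lstm)
```

A bidirectional variant simply runs a second copy of the chosen cell over the reversed sequence and concatenates the two hidden states at each timestep, which is what the bidirectional models in the comparison do at the layer level.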

Article Details

How to Cite
Dhande, G., and Z. Shaikh. “Recurrent Neural Networks for End-to-End Speech Recognition: A Comparative Analysis”. International Journal on Recent and Innovation Trends in Computing and Communication, vol. 6, no. 4, Apr. 2018, pp. 88-93, doi:10.17762/ijritcc.v6i4.1523.