
On Architectural Issues of Neural Networks in Speech Recognition

Jul 31, 2016 · 1526 views

Recently, artificial neural networks (ANNs) have dramatically improved the performance of speech recognition systems. Despite more than 25 years of extensive research on neural networks in speech recognition, a number of issues concerning the architecture of ANN-based systems remain open. Examples of such issues are: 1) Unlike the hybrid approach, which replaces the emission probability function by an ANN, a direct approach can model the posterior probability of the (phonetic) label sequence directly, without using the generative concepts of classical hidden Markov models (HMMs). 2) In the CTC (connectionist temporal classification) approach, the HMM is simplified by using only a single label per phoneme (or per character in handwriting recognition). The CTC training criterion sums the posterior probabilities over all possible alignments of the label sequence. 3) Recently, so-called attention-based approaches have replaced the conventional HMM formalism by a recurrent neural network. In all three cases, we face the question of how these ANN-based approaches compare with the conventional discriminative framework of hybrid HMMs. We will discuss the advantages and disadvantages of these approaches in more detail and compare them with conventional hybrid HMMs.
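To make the CTC criterion mentioned in issue 2 concrete: the sum over all alignments can be computed exactly with the standard forward (alpha) recursion over a blank-extended label sequence. The sketch below is a minimal pure-Python illustration, not the speaker's implementation; the frame posteriors `probs`, the target labels, and the blank index 0 are all made-up example values. It cross-checks the recursion against brute-force enumeration of every frame-level alignment.

```python
import itertools

def ctc_prob(probs, target, blank=0):
    """Total probability of `target` under CTC: sum over all alignments.

    probs[t][k] is the (illustrative) posterior of label k at frame t.
    Uses the forward recursion over the blank-extended sequence
    [blank, l1, blank, l2, ..., blank].
    """
    ext = [blank]
    for c in target:
        ext += [c, blank]
    S, T = len(ext), len(probs)
    alpha = [[0.0] * S for _ in range(T)]
    # Initialization: a path may start with a blank or the first label.
    alpha[0][0] = probs[0][blank]
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]                      # stay on same state
            if s >= 1:
                a += alpha[t - 1][s - 1]             # advance one state
            # Skip the blank between two *different* labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * probs[t][ext[s]]
    # Valid paths end on the last label or the trailing blank.
    return alpha[T - 1][S - 1] + (alpha[T - 1][S - 2] if S > 1 else 0.0)

def collapse(path, blank=0):
    """CTC collapse: merge repeats, then drop blanks."""
    out, prev = [], None
    for p in path:
        if p != prev and p != blank:
            out.append(p)
        prev = p
    return out

def ctc_prob_brute_force(probs, target, blank=0):
    """Reference: enumerate every alignment and sum matching ones."""
    K, T = len(probs[0]), len(probs)
    total = 0.0
    for path in itertools.product(range(K), repeat=T):
        if collapse(path, blank) == list(target):
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return total

# Tiny made-up example: 3 frames, labels {0: blank, 1, 2}, target "1 2".
probs = [[0.6, 0.3, 0.1],
         [0.2, 0.5, 0.3],
         [0.1, 0.2, 0.7]]
p_dp = ctc_prob(probs, [1, 2])
p_bf = ctc_prob_brute_force(probs, [1, 2])
```

Note that, in contrast to a classical HMM, there is no per-phoneme state topology here: the only "states" are the labels themselves plus the blank, which is exactly the simplification the abstract refers to.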

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.