2013
Cite Score
55
AI summary
This paper introduces CNNs to LVCSR tasks, achieving a 13-30% relative improvement over GMMs and a 4-12% relative improvement over DNNs on the Broadcast News and Switchboard tasks. It explores different CNN architectures, including the number of convolutional layers, hidden units, pooling strategy, and input feature types.
Abstract
Convolutional Neural Networks (CNNs) are an alternative type of neural network that can be used to reduce spectral variations and model spectral correlations which exist in signals. Since speech signals exhibit both of these properties, CNNs are a more effective model for speech compared to Deep Neural Networks (DNNs). In this paper, we explore applying CNNs to large vocabulary speech tasks. First, we determine the appropriate architecture to make CNNs effective compared to DNNs for LVCSR tasks. Specifically, we focus on how many convolutional layers are needed, what is the optimal number of hidden units, what is the best pooling strategy, and the best input feature type for CNNs. We then explore the behavior of neural network features extracted from CNNs on a variety of LVCSR tasks, comparing CNNs to DNNs and GMMs. We find that CNNs offer between a 13-30% relative improvement over GMMs, and a 4-12% relative improvement over DNNs, on a 400-hr Broadcast News and 300-hr Switchboard task.
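The "relative improvement" figures quoted in the abstract can be read as a fractional reduction of an error rate with respect to a baseline system. The sketch below shows that arithmetic, assuming the metric is word error rate (WER), the usual LVCSR measure; the abstract itself does not name the metric. The function name `relative_improvement` and the example WER values are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative error reduction as a percentage of the baseline.

    Both arguments are error rates in the same unit (e.g. percent WER).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers for illustration only (not results from the paper):
    # a baseline at 20.0% WER and a new system at 18.0% WER.
    print(round(relative_improvement(20.0, 18.0), 1))  # 10.0
```

Under this reading, a 4-12% relative improvement over a DNN means the CNN removes between 4% and 12% of the DNN's errors, which is a much smaller absolute change when the baseline error rate is already low.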