2012
Cite Score
12
AI summary
This paper introduces a sub-word language model using neural networks, combining the advantages of character-level and word-level models. It demonstrates that neural network models can be significantly smaller than compressed n-gram models while maintaining performance on the Broadcast news RT04 task, with further size reductions possible through sub-word units and quantization.
Abstract
We explore the performance of several types of language models on word-level and character-level language modeling tasks. These include two recently proposed recurrent neural network architectures, a feedforward neural network model, a maximum entropy model, and the usual smoothed n-gram models. We then propose a simple technique for learning sub-word level units from the data, and show that it combines the advantages of both character-level and word-level models. Finally, we show that neural network based language models can be an order of magnitude smaller than compressed n-gram models at the same level of performance when applied to a Broadcast news RT04 speech recognition task. By using sub-word units, the size can be reduced even further.