Year: 2016
Cite Score: 47
AI summary
This paper extends Recurrent Neural Network (RNN) language models to large-scale language modeling, studying techniques such as character-level Convolutional Neural Networks and Long Short-Term Memory on the One Billion Word Benchmark. The best single model reaches a perplexity of 30.0, and an ensemble of models reaches 23.7.
Abstract
In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and complex, long term structure of language. We perform an exhaustive study on techniques such as character Convolutional Neural Networks or Long-Short Term Memory, on the One Billion Word Benchmark. Our best single model significantly improves state-of-the-art perplexity from 51.3 down to 30.0 (whilst reducing the number of parameters by a factor of 20), while an ensemble of models sets a new record by improving perplexity from 41.0 down to 23.7. We also release these models for the NLP and ML community to study and improve upon.
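The perplexity figures quoted in the abstract can be put in context with a small sketch. It uses the standard definition of perplexity (the exponential of the mean negative log-probability per token) and works out the relative reductions implied by the reported numbers. The function names `perplexity` and `relative_reduction` are hypothetical helpers chosen for illustration; this is not the authors' evaluation code.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def relative_reduction(before, after):
    """Fractional drop in perplexity when moving from `before` to `after`."""
    return (before - after) / before


# Toy illustration (not real data): a model that assigns probability 1/30
# to every token has perplexity 30, the same as choosing uniformly among
# 30 equally likely candidates at each step.
uniform = perplexity([1 / 30] * 5)

# Relative reductions implied by the figures reported in the abstract.
single = relative_reduction(51.3, 30.0)    # best single model
ensemble = relative_reduction(41.0, 23.7)  # ensemble of models

print(f"uniform-30 perplexity: {uniform:.1f}")
print(f"single model reduction: {single:.1%}")
print(f"ensemble reduction: {ensemble:.1%}")
```

Read this way, the reported single-model result corresponds to roughly a 41.5% drop in perplexity and the ensemble result to roughly a 42.2% drop, each relative to the previous best figure the abstract cites.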
Citation Graph
Cited by: 20 papers in your library
Cites: 32 papers in your library
Read on August 18, 2025