2016

Exploring the Limits of Language Modeling

R. Jozefowicz, Oriol Vinyals, M. Schuster, Noam Shazeer, Yonghui Wu

AI summary

This paper extends Recurrent Neural Network (RNN) language models to large-scale language modeling, studying techniques such as character Convolutional Neural Networks (CNNs) and Long-Short Term Memory (LSTM) on the One Billion Word Benchmark. Its best single model reaches a perplexity of 30.0, and an ensemble of models reaches 23.7.

Main Contributions

  • Explored and extended current research on large-scale language modeling.
  • Designed a Softmax loss based on character-level CNNs that is efficient and precise.
  • Improved the state of the art on a well-known, large-scale LM task, reducing single-model perplexity from 51.3 to 30.0.
  • Demonstrated that an ensemble of different models can reduce perplexity further, from 41.0 to 23.7.
  • Shared the models and recipes to help and motivate further research.
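
The idea of a Softmax built on character-level CNNs can be illustrated with a small sketch. This page does not describe the layer in detail, so everything below is an assumption made for illustration: each output word embedding is computed from the word's characters by a one-layer convolution with max-over-time pooling, and the logit for a word is the dot product of that embedding with the recurrent hidden state. The sizes, the alphabet, the random weights, and the helper names (`word_embedding`, `cnn_softmax`) are hypothetical and are not taken from the authors' released models.

```python
import math
import random

random.seed(0)

# Hypothetical toy sizes; real models are far larger.
CHAR_DIM, WIDTH, NUM_FILTERS = 4, 3, 8
ALPHABET = "abcdefghijklmnopqrstuvwxyz^$"  # ^ and $ mark word start and end


def rand_vec(n):
    """Random stand-in for trained weights."""
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


char_emb = {c: rand_vec(CHAR_DIM) for c in ALPHABET}
filters = [rand_vec(WIDTH * CHAR_DIM) for _ in range(NUM_FILTERS)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def word_embedding(word):
    """Embed a word from its characters: convolution + max-over-time pooling."""
    chars = ("^" + word + "$").ljust(WIDTH, "$")  # pad so one window always exists
    windows = [
        [x for c in chars[i:i + WIDTH] for x in char_emb[c]]
        for i in range(len(chars) - WIDTH + 1)
    ]
    return [max(math.tanh(dot(f, w)) for w in windows) for f in filters]


def cnn_softmax(hidden, vocab):
    """Softmax over vocab with logits = hidden . word_embedding(word)."""
    logits = [dot(hidden, word_embedding(w)) for w in vocab]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return {w: e / total for w, e in zip(vocab, exps)}


vocab = ["cat", "cats", "dog", "zebra"]
hidden = rand_vec(NUM_FILTERS)  # stand-in for an LSTM output state
probs = cnn_softmax(hidden, vocab)
```

In this sketch the number of parameters depends on the alphabet and the filters rather than on the vocabulary size, and a word that never appeared in `vocab` (for example `word_embedding("cab")`) still receives an embedding.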

Abstract

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and complex, long term structure of language. We perform an exhaustive study on techniques such as character Convolutional Neural Networks or Long-Short Term Memory, on the One Billion Word Benchmark. Our best single model significantly improves state-of-the-art perplexity from 51.3 down to 30.0 (whilst reducing the number of parameters by a factor of 20), while an ensemble of models sets a new record by improving perplexity from 41.0 down to 23.7. We also release these models for the NLP and ML community to study and improve upon.
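
The perplexity figures quoted above follow the standard definition: the exponential of the average negative log-probability a model assigns to each target token. The sketch below computes it and shows one common way to ensemble language models, linear interpolation of their per-token probabilities. The abstract does not state how the ensemble was combined, so the interpolation scheme, the equal weights, and the toy probabilities here are assumptions for illustration only.

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


def interpolate(prob_lists, weights):
    """Weighted average, token by token, of the probabilities from each model."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return [
        sum(w * p for w, p in zip(weights, probs))
        for probs in zip(*prob_lists)
    ]


# Made-up probabilities that two hypothetical models assign
# to the same four target tokens.
model_a = [0.10, 0.02, 0.30, 0.05]
model_b = [0.02, 0.20, 0.10, 0.08]
ensemble = interpolate([model_a, model_b], [0.5, 0.5])

ppl_a = perplexity(model_a)
ppl_b = perplexity(model_b)
ppl_ens = perplexity(ensemble)
```

On this toy data the interpolated model has lower perplexity than either component because the two models err on different tokens. That outcome is not guaranteed in general: interpolation only ensures the ensemble's average log-loss is no worse than the weighted average of the components' log-losses.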
