2016

Improving Neural Language Models With a Continuous Cache

E. Grave, Armand Joulin, Nicolas Usunier


AI summary

This paper introduces the neural cache model, an extension to neural language models that stores recent hidden activations and retrieves them through a dot product with the current hidden activation. It requires no additional training, scales to thousands of memory cells, and improves performance on several language modeling tasks and on the LAMBADA dataset.

Main Contributions

  • Proposes a continuous version of the cache model, called the Neural Cache Model, that can be adapted to any neural network language model.
  • Stores recent hidden activations and uses them as a representation of the context, accessed through a simple dot product with the current hidden activation.
  • Requires no training and can be applied to any pre-trained neural network language model.
  • Scales to thousands of memory cells.
  • Demonstrates the quality of the Neural Cache Model on several language modeling tasks and on the LAMBADA dataset.
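
The mechanism in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the scalar `theta` (how sharply the cache favors similar states), the mixing weight `lam`, and the toy 2-d hidden states are all assumptions made for the example. It pairs each stored hidden state with the word that followed it, scores each pair by its dot product with the current hidden state, and mixes the resulting cache distribution with the base model's distribution.

```python
import math
from collections import defaultdict


def cache_distribution(history, h_t, theta=1.0):
    """Distribution over words seen in the cache.

    history: list of (h_i, next_word) pairs; h_i is a past hidden state
             (list of floats), next_word is the word that followed it.
    h_t:     current hidden state (list of floats).
    theta:   assumed scalar controlling how peaked the distribution is.
    """
    if not history:
        return {}
    scores = defaultdict(float)
    for h_i, word in history:
        dot = sum(a * b for a, b in zip(h_t, h_i))
        # Words that occurred several times accumulate several scores.
        scores[word] += math.exp(theta * dot)
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}


def interpolate(p_vocab, p_cache, lam=0.2):
    """Linearly mix the base model's distribution with the cache distribution."""
    words = set(p_vocab) | set(p_cache)
    return {
        w: (1 - lam) * p_vocab.get(w, 0.0) + lam * p_cache.get(w, 0.0)
        for w in words
    }


# Toy usage with made-up hidden states and a made-up base distribution.
history = [([1.0, 0.0], "cat"), ([0.0, 1.0], "dog"), ([1.0, 0.0], "cat")]
h_t = [1.0, 0.0]
p_cache = cache_distribution(history, h_t)
p_vocab = {"cat": 0.1, "dog": 0.1, "the": 0.8}
p_final = interpolate(p_vocab, p_cache, lam=0.5)
```

Nothing here has learned parameters beyond the base model, which is why the cache can be attached to an already trained network; `theta` and `lam` would typically be tuned on held-out data. A real implementation would also guard against overflow in `exp` for large dot products.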

Abstract

We propose an extension to neural network language models to adapt their prediction to the recent history. Our model is a simplified version of memory-augmented networks, which stores past hidden activations as memory and accesses them through a dot product with the current hidden activation. This mechanism is very efficient and scales to very large memory sizes. We also draw a link between the use of external memory in neural networks and cache models used with count-based language models. We demonstrate on several language model datasets that our approach performs significantly better than recent memory-augmented networks.
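
The link to count-based cache models mentioned in the abstract can be made concrete with a short derivation. The notation is an assumption of this sketch, since this page does not define any symbols: $h_i$ is the hidden activation at step $i$, $x_{i+1}$ is the word observed right after it, and $\theta$ is a scalar that controls how peaked the cache distribution is.

```latex
% Cache distribution: each stored position i votes for the word that
% followed it, weighted by the dot product with the current state h_t.
\[
p_{\text{cache}}(w \mid h_{1..t}, x_{1..t}) \;\propto\;
\sum_{i=1}^{t-1} \mathbb{1}\{w = x_{i+1}\}\, \exp\!\left(\theta\, h_t^{\top} h_i\right)
\]

% Special case theta = 0: every stored position gets weight exp(0) = 1,
% so the distribution reduces to relative word counts over the history.
\[
p_{\text{cache}}(w) \;=\; \frac{\#\{\, i : x_{i+1} = w \,\}}{t-1}
\]
```

Under these assumptions, setting $\theta = 0$ recovers a unigram cache built from counts over the recent history, while larger $\theta$ concentrates the mass on positions whose hidden state resembles the current one.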


