AI summary
This paper analyzes LSTMs using character-level language models as an interpretable testbed. The analysis reveals interpretable cells that track long-range dependencies such as line lengths, quotes and brackets; a comparison with finite-horizon n-gram models traces the LSTM's improvements to long-range structural dependencies; and an error analysis identifies areas for further study.
Abstract
Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing an analysis of their representations, predictions and error types. In particular, our experiments reveal the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets. Moreover, our comparative analysis with finite-horizon n-gram models traces the source of the LSTM improvements to long-range structural dependencies. Finally, we provide an analysis of the remaining errors and suggest areas for further study.
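To make "finite horizon" concrete, here is a minimal sketch of a character-level n-gram model. It is an illustration only, not the authors' setup: the function names `train_char_ngram` and `next_char_prob` are hypothetical, and add-one smoothing is assumed purely for simplicity. The point it shows is that the prediction depends only on the last n-1 characters, so the model cannot tell whether a quote opened earlier in the line is still open.

```python
from collections import Counter, defaultdict


def train_char_ngram(text, n):
    """Count next-character occurrences for each context of n-1 characters."""
    counts = defaultdict(Counter)
    for i in range(n - 1, len(text)):
        context = text[i - (n - 1):i]  # the n-1 characters before position i
        counts[context][text[i]] += 1
    return counts


def next_char_prob(counts, context, char, vocab_size, n):
    """Add-one smoothed P(char | last n-1 characters of context).

    Add-one smoothing is an assumption made for brevity here.
    """
    # Everything older than n-1 characters is discarded: the finite horizon.
    ctx = context[-(n - 1):] if n > 1 else ""
    seen = counts.get(ctx, Counter())
    return (seen[char] + 1) / (sum(seen.values()) + vocab_size)


if __name__ == "__main__":
    corpus = 'he said "hi" and "bye"'
    vocab = sorted(set(corpus))
    model = train_char_ngram(corpus, n=3)

    # Both histories end in "hi", so a 3-gram model scores a closing quote
    # identically whether or not a quote was opened earlier.
    inside = next_char_prob(model, 'he said "hi', '"', len(vocab), 3)
    outside = next_char_prob(model, 'he said hi', '"', len(vocab), 3)
    print(inside == outside)
```

A model with an unbounded memory, such as an LSTM, can in principle distinguish these two histories, which is the kind of long-range structural dependency the abstract refers to.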