2015

Visualizing and Understanding Recurrent Networks

Andrej Karpathy, Justin Johnson, Li Fei-Fei

Cite Score: 47

AI summary

This paper introduces an analysis of LSTMs using character-level language models as an interpretable testbed. The analysis reveals interpretable cells tracking long-range dependencies, quantifies LSTM predictions with comparisons to n-gram models, and provides an error analysis to identify areas for further study.

Main Contributions

  • Reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.
  • Quantifies LSTM predictions with comprehensive comparison to n-gram models, showing LSTMs perform significantly better on characters that require long-range reasoning.
  • Conducts an error analysis using a sequence of oracles to quantify the extent of remaining errors and suggest specific areas for further study.
  • Demonstrates that LSTMs can effectively utilize information beyond 20 characters through error analysis and comparisons with n-gram models.
  • Shows that the LSTM "grows" its competence over increasingly longer dependencies during training.
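
The n-gram comparison in the contributions above rests on the idea of a finite-horizon model: one that predicts the next character from only the previous n-1 characters. The following is a minimal sketch of such a baseline, not the paper's setup. The add-one smoothing, the toy corpus, and the names `train_ngram` and `cross_entropy` are assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def train_ngram(text, n):
    """Count how often each character follows each (n-1)-character context."""
    counts = defaultdict(Counter)
    padded = "\x00" * (n - 1) + text  # pad so the first characters have a context
    for i in range(n - 1, len(padded)):
        context = padded[i - (n - 1):i]
        counts[context][padded[i]] += 1
    return counts


def cross_entropy(counts, text, n, vocab):
    """Mean negative log2-probability per character, with add-one smoothing."""
    padded = "\x00" * (n - 1) + text
    total = 0.0
    for i in range(n - 1, len(padded)):
        context = padded[i - (n - 1):i]
        seen = counts.get(context, Counter())
        # Add-one smoothing (an assumption of this sketch): unseen characters
        # still receive a small non-zero probability.
        prob = (seen[padded[i]] + 1) / (sum(seen.values()) + len(vocab))
        total -= math.log2(prob)
    return total / len(text)


# Hypothetical toy corpus with bracket and quote structure.
train_text = 'say("hi") ' * 40
test_text = 'say("hi") ' * 5
vocab = set(train_text)

h1 = cross_entropy(train_ngram(train_text, 1), test_text, 1, vocab)  # no context
h4 = cross_entropy(train_ngram(train_text, 4), test_text, 4, vocab)  # 3-char horizon
print(f"1-gram: {h1:.3f} bits/char, 4-gram: {h4:.3f} bits/char")
```

On this toy corpus a longer horizon lowers the per-character cross-entropy. A dependency that spans more than n-1 characters, such as a closing bracket far from its opening one, stays out of reach for any fixed n, which is the kind of case where the summary reports the LSTM doing better.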

Abstract

Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing an analysis of their representations, predictions and error types. In particular, our experiments reveal the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets. Moreover, our comparative analysis with finite horizon n-gram models traces the source of the LSTM improvements to long-range structural dependencies. Finally, we provide analysis of the remaining errors and suggest areas for further study.
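
To make the notion of an interpretable cell concrete, here is a toy sketch of a single memory cell that is wired by hand to switch on inside double quotes and off outside them. It is not taken from the paper. The function name `quote_cell_trace` and the hard 0/1 gates are assumptions of this illustration, whereas a trained LSTM would learn soft sigmoid gates and would read its previous state through the hidden vector.

```python
def quote_cell_trace(text):
    """Return the cell value after each character for a hand-wired memory cell."""
    cell = 0.0
    trace = []
    for ch in text:
        is_quote = 1.0 if ch == '"' else 0.0
        # Hard 0/1 gates (assumption): a trained LSTM would use learned sigmoids.
        forget = 1.0 - is_quote   # keep the stored value on non-quote characters
        write = is_quote          # overwrite the stored value on a quote
        candidate = 1.0 - cell    # toggled value: 0 becomes 1, 1 becomes 0
        cell = forget * cell + write * candidate
        trace.append(cell)
    return trace


sample = 'a "b" c'
for ch, value in zip(sample, quote_cell_trace(sample)):
    print(repr(ch), value)
```

The printed trace shows the cell staying at 1.0 from the opening quote until the closing quote resets it, regardless of how many characters lie in between. That persistence is what a fixed-horizon model cannot represent once the quoted span grows longer than its context.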
