2000
AI summary
This paper introduces a forget gate mechanism for LSTM networks to address the issue of unbounded growth in cell states when processing continual input streams, achieving improved performance on a continual version of the embedded Reber grammar benchmark.
Abstract
Long Short-Term Memory (LSTM, [5]) can solve many tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends. Without resets, the internal state values may grow indefinitely and eventually cause the network to break down. Our remedy is an adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review an illustrative benchmark problem on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve a continual version of that problem. LSTM with forget gates, however, easily solves it in an elegant way.
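The effect described in the abstract can be illustrated with a minimal sketch of a single scalar memory cell. The original LSTM update adds the gated input to the previous state; the forget gate multiplies the previous state by a factor between 0 and 1 first. The helper names (`cell_step`, `run_stream`), the fixed gate pre-activations, and the use of `tanh` as the input squashing function are assumptions made for illustration only; in a trained network the gate activations are learned functions of the input and recurrent signals.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def cell_step(state: float, cell_input: float,
              input_gate: float, forget_gate: float = 1.0) -> float:
    """One update of a scalar memory cell state.

    forget_gate=1.0 gives the update without forgetting:
        s(t) = s(t-1) + y_in(t) * g(net_c(t))
    forget_gate<1.0 gives the update with a forget gate:
        s(t) = y_phi(t) * s(t-1) + y_in(t) * g(net_c(t))
    """
    return forget_gate * state + input_gate * cell_input


def run_stream(inputs, forget_net=None):
    """Feed a continual stream into one cell and record the state.

    forget_net is a hypothetical, fixed pre-activation of the forget gate;
    None means "no forget gate" (the gate is pinned to 1.0).
    """
    state = 0.0
    trace = []
    input_gate = sigmoid(1.0)  # fixed for illustration, not learned
    for x in inputs:
        forget_gate = 1.0 if forget_net is None else sigmoid(forget_net)
        state = cell_step(state, math.tanh(x), input_gate, forget_gate)
        trace.append(state)
    return trace


# A continual stream with no sequence-end markers and no external resets.
stream = [1.0] * 1000

no_forget = run_stream(stream)                     # state grows without bound
with_forget = run_stream(stream, forget_net=2.0)   # state settles to a fixed point

print("without forget gate:", no_forget[-1])
print("with forget gate:   ", with_forget[-1])
```

Without the forget gate the state grows linearly with the length of the stream, which pushes the cell output into saturation. With a gate value below 1 the state converges to a bounded value, which is the "releasing internal resources" behavior the abstract refers to.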
References [7]
Sepp Hochreiter, Jürgen Schmidhuber - 1997
Yoshua Bengio, Patrice Simard, Paolo Frasconi - 1994
Ronald J. Williams, David Zipser - 1992
A. J. Robinson, F. Fallside - 1987
A. Cleeremans, D. Servan-Schreiber, J. L. McClelland - 1989
A. W. Smith, David Zipser - 1989
S. E. Fahlman - 1991