2011
Cite Score: 33
AI summary
This paper trains RNNs with Hessian-free optimization combined with a structural damping scheme. The method succeeds on pathological synthetic datasets with very long-term dependencies and outperforms LSTMs on three real-world sequence tasks: motion video prediction, music modeling, and speech modeling.
Abstract
In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems. First, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, on three natural and highly complex real-world sequence datasets where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models: the Long Short-term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002) which is used within the HF approach of Martens.
References [17]
Sepp Hochreiter, Jürgen Schmidhuber - 1997
94 papers in library cite
D. E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams - 1986
34 papers in library cite
Geoffrey Hinton, Ruslan Salakhutdinov - 2006
37 papers in library cite
Yoshua Bengio, Patrice Simard, Paolo Frasconi - 1994
31 papers in library cite
Felix A. Gers, Jürgen Schmidhuber, Fred Cummins - 2000
13 papers in library cite
Herbert Jaeger, Harald Haas - 2004
4 papers in library cite
Alex Graves, Jürgen Schmidhuber - 2009
5 papers in library cite
James Martens - 2010
12 papers in library cite
Alex Graves, Jürgen Schmidhuber - 2005
14 papers in library cite
Sepp Hochreiter - 1991
18 papers in library cite
N. N. Schraudolph - 2002
4 papers in library cite
B. Pearlmutter - 1994
4 papers in library cite
Sepp Hochreiter, Jürgen Schmidhuber - 1996
3 papers in library cite
Kevin P. Murphy - 2002
2 papers in library cite
J. Nocedal, S. Wright - 1999
2 papers in library cite
Sepp Hochreiter, Yoshua Bengio, Paolo Frasconi, Jürgen Schmidhuber - 2001
1 paper in library cites
H. Mayer, Faustino Gomez, Daan Wierstra, I. Nagy, A. Knoll, Jürgen Schmidhuber - 2007
1 paper in library cites
Cited by 13 papers in your library
Cites 9 papers in your library
Read on July 11, 2025