2013
Cite Score
29
AI summary
This paper introduces various techniques, such as clipping gradients, leaky integration, advanced momentum, powerful output probability models, and sparse gradients, to improve the training and performance of recurrent neural networks (RNNs). The techniques are evaluated on text and music datasets, demonstrating improved training and test error.
Abstract
After a more than decade-long period of relatively little research activity in the area of recurrent neural networks, several new developments will be reviewed here that have allowed substantial progress both in understanding and in technical solutions towards more efficient training of recurrent networks. These advances have been motivated by and related to the optimization issues surrounding deep learning. Although recurrent networks are extremely powerful in what they can in principle represent in terms of modeling sequences, their training is plagued by two aspects of the same issue regarding the learning of long-term dependencies. Experiments reported here evaluate the use of clipping gradients, spanning longer time ranges with leaky integration, advanced momentum techniques, using more powerful output probability models, and encouraging sparser gradients to help symmetry breaking and credit assignment. The experiments are performed on text and music data and show off the combined effects of these techniques in generally improving both training and test error.
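Two of the techniques named in the abstract, gradient clipping and leaky integration, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the norm-based form of clipping, the `threshold` value, and the leak coefficient `alpha` are all assumptions chosen for illustration.

```python
import math


def clip_gradient_norm(grads, threshold):
    """Rescale a gradient vector so its L2 norm does not exceed `threshold`.

    Assumption: norm-based clipping; element-wise clipping is another variant.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= threshold:
        # Small gradients pass through unchanged.
        return list(grads)
    scale = threshold / norm
    return [g * scale for g in grads]


def leaky_update(prev_state, candidate, alpha):
    """Blend the previous hidden state with a new candidate value.

    Assumption: a generic leaky unit, alpha * prev + (1 - alpha) * candidate.
    An alpha close to 1 makes the unit change slowly, so it retains
    information over longer time spans.
    """
    return alpha * prev_state + (1.0 - alpha) * candidate


if __name__ == "__main__":
    # A gradient of norm 5 is rescaled to norm 1, keeping its direction.
    print(clip_gradient_norm([3.0, 4.0], threshold=1.0))
    # With alpha = 0.9 the state moves only 10% of the way to the candidate.
    print(leaky_update(prev_state=1.0, candidate=0.0, alpha=0.9))
```

Clipping bounds the size of any single update when gradients explode, while the leak coefficient controls how quickly a hidden unit forgets its past value.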