AI summary
This paper analyzes the convergence of back-propagation learning and presents practical tricks together with explanations of why they work. It finds that most classical second-order methods are impractical for large neural networks and proposes alternatives that avoid those limitations. To improve convergence, it recommends techniques such as stochastic learning, input normalization, and an appropriately chosen sigmoid function.
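Two of the techniques named in the summary, input normalization and stochastic learning, can be illustrated with a small sketch. This is not code from the paper: the helper names `normalize_columns` and `sgd_linear`, the toy data, the learning rate, and the choice of a plain linear unit instead of a full network are all assumptions made for illustration only.

```python
import random

def normalize_columns(rows):
    """Shift each input column to zero mean and scale it to unit variance."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        stds.append(var ** 0.5 or 1.0)  # fall back to 1.0 for a constant column
    return [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in rows]

def sgd_linear(rows, targets, lr=0.1, epochs=200, seed=0):
    """Stochastic learning: update the weights after every single example."""
    rng = random.Random(seed)
    w = [0.0] * len(rows[0])
    b = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # present the examples in a new random order each epoch
        for i in order:
            x, t = rows[i], targets[i]
            err = sum(wj * xj for wj, xj in zip(w, x)) + b - t
            # Gradient step on the squared error of this one example.
            w = [wj - lr * err * xj for wj, xj in zip(w, x)]
            b -= lr * err
    return w, b

# Hypothetical toy data: two inputs on very different scales.
raw = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0], [4.0, 4000.0]]
targets = [2.0 * a + 0.001 * c for a, c in raw]

inputs = normalize_columns(raw)
w, b = sgd_linear(inputs, targets)
predictions = [sum(wj * xj for wj, xj in zip(w, x)) + b for x in inputs]
```

Without the normalization step, the second input column would be roughly a thousand times larger than the first, and the same learning rate would make the per-example updates diverge. After normalization a single moderate learning rate works for both weights.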
Abstract
The convergence of back-propagation learning is analyzed so as to explain common phenomena observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers explanations of why they work. Many authors have suggested that second-order optimization methods are advantageous for neural net training. It is shown that most "classical" second-order methods are impractical for large neural networks. A few methods are proposed that do not have these limitations.