2012
Cite Score
46
AI summary
This paper introduces transformations for multi-layer perceptrons that give each hidden neuron's output zero mean and zero slope, and adds shortcut connections to model the linear dependencies. With these changes, basic stochastic gradient learning converges faster and generalizes better on MNIST classification and auto-encoder tasks.
Abstract
We transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero mean and zero slope, and use separate shortcut connections to model the linear dependencies instead. These transformations aim to separate the problems of learning the linear and nonlinear parts of the whole input-output mapping, which has many benefits. We study the theoretical properties of the transformations by noting that they make the Fisher information matrix closer to a diagonal matrix, and thus the standard gradient closer to the natural gradient. We experimentally confirm the usefulness of the transformations by noting that they make basic stochastic gradient learning competitive with state-of-the-art learning algorithms in speed, and that they also seem to help find solutions that generalize better. The experiments include both classification of handwritten digits with a 3-layer network and learning a low-dimensional representation for images by using a 6-layer auto-encoder network. The transformations were beneficial in all cases, with and without regularization.
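The zero-mean, zero-slope idea in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption rather than the authors' stated method: the abstract gives no formulas, so the sketch assumes a tanh nonlinearity and adds a per-neuron linear term `alpha * z` and offset `beta`, both estimated from one batch, so that the transformed output and its derivative average to zero. The names `transformed_tanh`, `forward`, `A`, `B`, and `C` are hypothetical, with `C` standing in for the shortcut connection that carries the linear part of the mapping.

```python
import numpy as np


def transformed_tanh(z):
    """Apply tanh plus a per-neuron linear correction (assumed form).

    z has shape (n_samples, n_hidden). alpha cancels the batch-average
    slope of tanh, beta cancels the batch-average output.
    """
    t = np.tanh(z)
    alpha = -np.mean(1.0 - t ** 2, axis=0)   # d/dz tanh(z) = 1 - tanh(z)^2
    beta = -np.mean(t + alpha * z, axis=0)
    return t + alpha * z + beta, alpha, beta


def forward(x, B, A, C):
    """One hidden layer plus a linear shortcut from input to output."""
    z = x @ B.T                      # hidden pre-activations
    h, alpha, _ = transformed_tanh(z)
    y = h @ A.T + x @ C.T            # nonlinear path + shortcut path
    return y, h, z, alpha


# Hypothetical toy data and weights, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 5))
B = rng.normal(size=(4, 5))
A = rng.normal(size=(3, 4))
C = rng.normal(size=(3, 5))

y, h, z, alpha = forward(x, B, A, C)

# Batch averages of each hidden neuron's output and of its slope.
mean_output = h.mean(axis=0)
mean_slope = (1.0 - np.tanh(z) ** 2 + alpha).mean(axis=0)
```

In this sketch both `mean_output` and `mean_slope` are zero up to floating-point error, so the hidden units carry no average linear signal and the shortcut weights `C` are left to model it. A full training procedure would also need to adjust the other weights whenever `alpha` and `beta` change so that the overall mapping stays the same; that step is not shown here.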