AI summary
This paper introduces layer normalization, a new normalization method for neural networks, which computes normalization statistics from summed inputs within a layer on a single training case, improving training speed and generalization performance for RNN models.
Abstract
Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques.
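The computation the abstract describes can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names (`layer_norm`, `batch_norm_train`, `rnn_forward`), the weight names, the array shapes, and the small `eps` added for numerical stability are all assumptions made for the example. It shows the two properties the abstract emphasizes: the statistics come from all summed inputs in a layer for one training case, so the result for a case does not depend on the rest of the mini-batch, and in a recurrent network the statistics are recomputed at every time step before the non-linearity.

```python
import numpy as np

def layer_norm(a, gain, bias, eps=1e-5):
    """Normalize summed inputs `a` over the units of a layer (last axis).

    Each row of `a` is one training case; its mean and variance are
    computed from that row alone. `eps` is an assumed stability constant.
    """
    mu = a.mean(axis=-1, keepdims=True)
    sigma = np.sqrt(a.var(axis=-1, keepdims=True) + eps)
    # Per-neuron adaptive gain and bias, applied after normalization.
    return gain * (a - mu) / sigma + bias

def batch_norm_train(a, gain, bias, eps=1e-5):
    """Training-time batch normalization, for contrast only.

    Statistics are computed per neuron over the mini-batch (first axis).
    """
    mu = a.mean(axis=0, keepdims=True)
    sigma = np.sqrt(a.var(axis=0, keepdims=True) + eps)
    return gain * (a - mu) / sigma + bias

def rnn_forward(xs, h0, W_xh, W_hh, gain, bias):
    """Hypothetical vanilla RNN with layer normalization at each time step.

    xs has shape (time, batch, input_dim); returns the final hidden state.
    """
    h = h0
    for x_t in xs:
        a_t = x_t @ W_xh + h @ W_hh                # summed inputs at step t
        h = np.tanh(layer_norm(a_t, gain, bias))   # normalize, then non-linearity
    return h

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 8))          # 4 training cases, 8 hidden units
gain, bias = np.ones(8), np.zeros(8)

ln_full = layer_norm(a, gain, bias)
ln_single = layer_norm(a[:1], gain, bias)        # first case on its own
bn_full = batch_norm_train(a, gain, bias)
bn_half = batch_norm_train(a[:2], gain, bias)    # same case, smaller batch

xs = rng.normal(size=(5, 4, 3))      # 5 time steps, batch of 4, 3 inputs
h_final = rnn_forward(xs, np.zeros((4, 8)),
                      rng.normal(size=(3, 8)), rng.normal(size=(8, 8)),
                      gain, bias)
```

Here `ln_full[:1]` and `ln_single` agree, while `bn_full[:1]` and `bn_half[:1]` generally do not, which is the mini-batch dependence the abstract refers to.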