AI summary
This paper introduces SGDR, a simple warm restart technique for stochastic gradient descent that improves anytime performance when training deep neural networks. It reports error rates of 3.14% on CIFAR-10 and 16.21% on CIFAR-100, which were state of the art at the time of publication.
Abstract
Restart techniques are common in gradient-free optimization to deal with multi-modal functions. Partial warm restarts are also gaining popularity in gradient-based optimization to improve the rate of convergence in accelerated gradient schemes to deal with ill-conditioned functions. In this paper, we propose a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks. We empirically study its performance on the CIFAR-10 and CIFAR-100 datasets, where we demonstrate new state-of-the-art results at 3.14% and 16.21%, respectively. We also demonstrate its advantages on a dataset of EEG recordings and on a downsampled version of the ImageNet dataset. Our source code is available at https://github.com/loshchil/SGDR
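The abstract does not spell out the restart schedule itself. As a minimal sketch, assuming the cosine annealing form that SGDR is generally known for, the learning rate decays from a maximum to a minimum within each cycle and then jumps back to the maximum (the "warm restart"), with each cycle optionally longer than the last. The function name, parameter names, and default values below are illustrative assumptions, not taken from the authors' repository.

```python
import math


def sgdr_learning_rate(epoch, eta_min=0.0, eta_max=0.05, t_0=10, t_mult=2):
    """Learning rate at a (possibly fractional) epoch under cosine annealing
    with warm restarts.

    All names and defaults here are hypothetical, chosen for illustration.
    eta_min / eta_max: lower and upper bounds of the learning rate.
    t_0: length of the first cycle, in epochs.
    t_mult: factor by which each cycle is longer than the previous one.
    """
    if epoch < 0 or t_0 <= 0 or t_mult < 1:
        raise ValueError("need epoch >= 0, t_0 > 0 and t_mult >= 1")

    t_i = t_0      # length of the current cycle
    t_cur = epoch  # epochs elapsed since the most recent restart
    # Walk past every completed cycle to find the current position.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult

    # Cosine decay from eta_max (t_cur == 0) towards eta_min (t_cur -> t_i).
    cosine = 1.0 + math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (eta_max - eta_min) * cosine


if __name__ == "__main__":
    for e in (0, 5, 9.9, 10, 20, 30):
        print(e, round(sgdr_learning_rate(e), 5))
```

With `t_0=10` and `t_mult=2` the cycles last 10, 20, and 40 epochs, so the rate returns to `eta_max` at epochs 10, 30, and 70. In practice the schedule is usually evaluated per batch by passing a fractional epoch.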