2014

Dropout: A Simple Way to Prevent Neural Networks From Overfitting

Nitish Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov

Cite Score

98

AI summary

This paper introduces dropout, a regularization technique for neural networks that randomly drops units during training to prevent co-adaptation. This reduces overfitting and improves generalization performance, achieving state-of-the-art results on various benchmark datasets for vision, speech recognition, document classification, and computational biology.

Main Contributions

  • Introduces dropout, a novel regularization technique for neural networks that prevents overfitting by randomly dropping units during training.
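  • Interprets dropout as approximate model averaging: training samples from exponentially many "thinned" networks that share weights, and a single unthinned network with scaled-down weights approximates their average at test time.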
  • Shows that dropout improves the performance of neural networks on a variety of supervised learning tasks, including vision, speech recognition, document classification, and computational biology.
  • Achieves state-of-the-art results on many benchmark datasets using dropout.
  • Introduces the dropout Restricted Boltzmann Machine (RBM) model and demonstrates its improved performance compared to standard RBMs.
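
The model-averaging interpretation in the list above can be checked by brute force on a toy case. The following is a minimal sketch, not the authors' code: it assumes a single linear unit with four inputs, and the names `thinned_output`, `average_over_thinned`, `scaled_full_output`, and `p_keep` (the probability of retaining an input) are illustrative. For a linear unit the probability-weighted average over all 2^n thinned networks equals the output of the full unit with weights multiplied by `p_keep`; for networks with nonlinearities the match is only approximate.

```python
from itertools import product


def thinned_output(weights, inputs, mask):
    """Output of one linear unit when only inputs with mask == 1 are kept."""
    return sum(w * x * m for w, x, m in zip(weights, inputs, mask))


def average_over_thinned(weights, inputs, p_keep):
    """Expected output over all 2**n masks, each weighted by its probability."""
    n = len(inputs)
    total = 0.0
    for mask in product((0, 1), repeat=n):
        kept = sum(mask)
        prob = p_keep ** kept * (1.0 - p_keep) ** (n - kept)
        total += prob * thinned_output(weights, inputs, mask)
    return total


def scaled_full_output(weights, inputs, p_keep):
    """Single unthinned unit whose weights are multiplied by p_keep."""
    return sum(p_keep * w * x for w, x in zip(weights, inputs))


# Illustrative values only (assumptions, not taken from the paper).
weights = [0.5, -1.0, 2.0, 0.25]
inputs = [1.0, 2.0, -0.5, 4.0]
p_keep = 0.8

avg = average_over_thinned(weights, inputs, p_keep)
scaled = scaled_full_output(weights, inputs, p_keep)
print(avg, scaled)
```

Enumerating every mask is feasible only for tiny layers, which is why the weight-scaling shortcut matters in practice.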

Abstract

Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
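
The train/test asymmetry described in the abstract can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the paper's implementation: `dropout_train`, `dropout_test`, and `p_keep` (the retention probability) are hypothetical names, and activations are plain Python lists. Scaling the activations by `p_keep` at test time has the same effect as shrinking the outgoing weights by that factor.

```python
import random


def dropout_train(activations, p_keep, rng):
    """Training: keep each unit with probability p_keep, otherwise zero it."""
    return [a if rng.random() < p_keep else 0.0 for a in activations]


def dropout_test(activations, p_keep):
    """Test: keep every unit and scale by p_keep, which is equivalent to
    multiplying the unit's outgoing weights by p_keep."""
    return [a * p_keep for a in activations]


# Illustrative values only (assumptions, not taken from the paper).
rng = random.Random(0)
hidden = [0.2, 1.5, 0.0, 3.0, 0.7]
p_keep = 0.5

train_out = dropout_train(hidden, p_keep, rng)  # one random thinned network
test_out = dropout_test(hidden, p_keep)         # deterministic, unthinned
print(train_out)
print(test_out)
```

Many current libraries instead use "inverted" dropout, dividing the surviving activations by the retention probability during training so that nothing needs rescaling at test time; the two conventions give the same expected activations.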

Cited by

20

papers in your library

Cites

14

papers in your library

Read

on October 13, 2025
