2011

Deep Sparse Rectifier Neural Networks

Xavier Glorot, Antoine Bordes, Yoshua Bengio


AI summary

This paper introduces deep rectifier neural networks, which use the piecewise linear activation function max(0, x). These networks match or outperform hyperbolic tangent networks and produce sparse representations well suited to naturally sparse data. Experiments on image and text data show that they train well and reach their best performance on large labeled datasets without unsupervised pre-training.

Main Contributions

  • Introduces deep rectifier networks with the max(0,x) activation function.
  • Demonstrates that rectifier networks achieve comparable or superior performance to hyperbolic tangent networks.
  • Shows that rectifier networks create sparse representations with true zeros, suitable for sparse data.
  • Finds that rectifier networks can achieve best performance without unsupervised pre-training on purely supervised tasks with large labeled datasets.
  • Presents empirical results on image recognition and sentiment analysis tasks, highlighting the effectiveness of rectifier networks.
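The sparsity claim above can be illustrated with a small sketch. It is not the authors' experimental setup: the Gaussian pre-activations, the seed, and the helper name `fraction_exact_zeros` are assumptions made for illustration. It only shows that max(0, x) maps every negative input to an exact zero, while tanh produces small but nonzero values.

```python
import math
import random

def rectifier(x):
    """Rectifier activation: max(0, x)."""
    return max(0.0, x)

def fraction_exact_zeros(values):
    """Share of entries that are exactly 0.0 (hypothetical helper)."""
    return sum(1 for v in values if v == 0.0) / len(values)

# Assumed toy input: zero-mean Gaussian pre-activations with a fixed seed.
rng = random.Random(0)
pre_activations = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

rect_out = [rectifier(x) for x in pre_activations]
tanh_out = [math.tanh(x) for x in pre_activations]

rect_sparsity = fraction_exact_zeros(rect_out)
tanh_sparsity = fraction_exact_zeros(tanh_out)
print(f"rectifier exact zeros: {rect_sparsity:.2%}")
print(f"tanh exact zeros:      {tanh_sparsity:.2%}")
```

With symmetric inputs like these, roughly half of the rectifier outputs should be exact zeros. The sparsity reported in the paper comes from trained networks, so this toy figure should not be read as a result from the paper.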

Abstract

While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra-unlabeled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labeled datasets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and closing the performance gap between neural networks learnt with and without unsupervised pre-training.
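The non-differentiability at zero mentioned in the abstract can be seen in a minimal sketch. The choice of slope 0 at x = 0 is an assumed convention, not something stated in this record, and `one_sided_slopes` is a hypothetical helper. The sketch compares left and right finite-difference slopes of max(0, x) at zero.

```python
def rectifier(x):
    """Rectifier activation: max(0, x)."""
    return max(0.0, x)

def rectifier_grad(x):
    """Slope of the rectifier; using 0 at x == 0 is an assumed convention."""
    return 1.0 if x > 0 else 0.0

def one_sided_slopes(f, x, eps=1e-6):
    """Left and right finite-difference slopes of f at x (hypothetical helper)."""
    left = (f(x) - f(x - eps)) / eps
    right = (f(x + eps) - f(x)) / eps
    return left, right

# At zero the two one-sided slopes disagree, so no single derivative exists there.
left_at_zero, right_at_zero = one_sided_slopes(rectifier, 0.0)
print("slopes at 0:", left_at_zero, right_at_zero)

# Away from zero the function is linear and both slopes agree with rectifier_grad.
left_at_two, right_at_two = one_sided_slopes(rectifier, 2.0)
print("slopes at 2:", left_at_two, right_at_two, "grad:", rectifier_grad(2.0))
```

The kink affects only the single point x = 0. Everywhere else the slope is either 0 or 1, which is why a fixed convention at zero is usually enough for gradient-based training.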
