2007

Scaling Learning Algorithms Towards AI

Yoshua Bengio, Yann LeCun

Cite Score: 54

AI summary

This paper argues that popular non-parametric learning methods like kernel machines are limited in their ability to learn complex high-dimensional functions due to shallow architectures and local kernels, and proposes deep architectures for greater efficiency in AI-related tasks.

Main Contributions

  • Argues that kernel methods and shallow architectures are fundamentally limited in learning complex high-dimensional functions efficiently.
  • Highlights the inefficiency of shallow architectures in terms of the number of computational elements and training examples they require.
  • Analyzes the limitation of kernel machines with local kernels due to the curse of dimensionality.
  • Proposes deep architectures as a solution to overcome these limitations for complex AI tasks.
  • Presents empirical results on invariant image recognition, comparing kernel methods with deep architectures and showing the latter's efficiency.
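The locality point in the contributions above can be made concrete with a small sketch. It assumes a one-dimensional Gaussian (local) kernel machine of the form f(x) = b + sum_i alpha_i K(x, x_i); the centers, coefficients, bias, and bandwidth are hypothetical values chosen only for illustration, not taken from the paper. It shows only that such a predictor is driven by nearby training points and falls back to its bias far from all of them; it does not reproduce the paper's curse-of-dimensionality analysis.

```python
import math


def gaussian_kernel(x, center, sigma):
    """Local kernel: close to 1 near `center`, decays rapidly with distance."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def kernel_machine(x, centers, alphas, bias, sigma):
    """f(x) = bias + sum_i alpha_i * K(x, center_i)."""
    return bias + sum(
        a * gaussian_kernel(x, c, sigma) for a, c in zip(alphas, centers)
    )


# Hypothetical training locations and coefficients (illustrative only).
centers = [0.0, 1.0, 2.0, 3.0]
alphas = [1.0, -1.0, 1.0, -1.0]
bias = 0.5
sigma = 0.3

# At a training point, the output is dominated by that point's coefficient.
near = kernel_machine(1.0, centers, alphas, bias, sigma)

# Far from every training point, all kernel terms vanish and only the bias remains.
far = kernel_machine(50.0, centers, alphas, bias, sigma)

print("f(1.0)  =", near)
print("f(50.0) =", far)
```

With a local kernel, what the model says at a query point depends almost entirely on the training examples in its immediate neighborhood, which is the property the paper links to the curse of dimensionality.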

Abstract

One long-term goal of machine learning research is to produce methods that are applicable to highly complex tasks, such as perception (vision, audition), reasoning, intelligent control, and other artificially intelligent behaviors. We argue that in order to progress toward this goal, the Machine Learning community must endeavor to discover algorithms that can learn highly complex functions, with minimal need for prior knowledge, and with minimal human intervention. We present mathematical and empirical evidence suggesting that many popular approaches to non-parametric learning, particularly kernel methods, are fundamentally limited in their ability to learn complex high-dimensional functions. Our analysis focuses on two problems. First, kernel machines are shallow architectures, in which one large layer of simple template matchers is followed by a single layer of trainable coefficients. We argue that shallow architectures can be very inefficient in terms of required number of computational elements and examples. Second, we analyze a limitation of kernel machines with a local kernel, linked to the curse of dimensionality, that applies to supervised, unsupervised (manifold learning) and semi-supervised kernel machines. Using empirical results on invariant image recognition tasks, kernel methods are compared with deep architectures, in which lower-level features or concepts are progressively combined into more abstract and higher-level representations. We argue that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.
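The abstract's description of a kernel machine as a shallow architecture (a layer of fixed template matchers followed by one layer of trainable coefficients) can be sketched as follows. This is a minimal illustration under stated assumptions: the templates are the training examples themselves, the kernel is Gaussian, and the coefficients are trained with a kernel perceptron, which is swapped in here for brevity and is not the training procedure analyzed in the paper. The XOR data set, the bandwidth `SIGMA`, and all function names are hypothetical.

```python
import math

SIGMA = 0.5  # assumed kernel bandwidth


def gaussian_kernel(u, v, sigma=SIGMA):
    """Template matcher: similarity between input u and stored template v."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def first_layer(x, templates):
    """Layer 1 (fixed): match x against every stored template."""
    return [gaussian_kernel(x, t) for t in templates]


def second_layer(features, alphas, labels):
    """Layer 2 (trainable): weighted sum of the template responses."""
    return sum(a * y * k for a, y, k in zip(alphas, labels, features))


def train_kernel_perceptron(X, Y, max_epochs=100):
    """Only the coefficients are learned; the templates stay fixed."""
    alphas = [0.0] * len(X)
    for _ in range(max_epochs):
        mistakes = 0
        for j, (x, y) in enumerate(zip(X, Y)):
            score = second_layer(first_layer(x, X), alphas, Y)
            if y * score <= 0:  # misclassified: strengthen this template
                alphas[j] += 1.0
                mistakes += 1
        if mistakes == 0:
            break
    return alphas


def predict(x, X, Y, alphas):
    score = second_layer(first_layer(x, X), alphas, Y)
    return 1 if score > 0 else -1


# Toy XOR problem: not linearly separable in the input space.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [-1, 1, 1, -1]

alphas = train_kernel_perceptron(X, Y)
predictions = [predict(x, X, Y, alphas) for x in X]
print("coefficients:", alphas)
print("predictions: ", predictions)
```

Note that the model needs one template per training example here, and nothing in the first layer is learned. That is the structural property the abstract contrasts with deep architectures, where lower-level features are themselves trained and composed into higher-level ones.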


