1989

Generalization and Network Design Strategies

Yann LeCun

AI summary

This paper shows how constraining a network's architecture and weights, in particular through weight sharing, improves generalization. On a small handwritten digit recognition problem, hierarchical networks with shift-invariant feature detectors generalize well, while single-layer networks generalize poorly even though the problem is linearly separable.

Main Contributions

  • Demonstrates the importance of incorporating a priori knowledge into network architecture for improved generalization.
  • Introduces weight space transformation (WST) as a generalization of weight sharing to reduce the parameter space.
  • Presents a small handwritten digit recognition problem and shows the effectiveness of constrained networks.
  • Achieves 98.4% generalization performance on the digit recognition task using a network with hierarchical feature extractors.
  • Highlights the trade-off between speed, generality, and generalization in network design.
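
The weight sharing idea in the list above can be sketched as a linear map from a small vector of free parameters to the full vector of connection weights. This is a minimal illustration, not the paper's code: the matrix `K`, the vectors `u` and `w`, the helper names, and the tying pattern are all hypothetical. The only fact relied on is the chain rule, which says the gradient for a shared parameter is the sum of the gradients of the connections tied to it.

```python
import numpy as np

def expand_weights(K, u):
    """Map the free parameters u to the full connection weight vector w = K @ u."""
    return K @ u

def grad_free_params(K, grad_w):
    """Chain rule: dE/du = K^T dE/dw, so tied connections sum their gradients."""
    return K.T @ grad_w

# Hypothetical tying pattern: 6 connections controlled by 2 free parameters.
# Row i of K has a 1 in the column of the parameter that connection i copies.
K = np.array([
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
], dtype=float)

u = np.array([0.5, -0.25])   # free parameters (illustrative values)
w = expand_weights(K, u)     # six connection weights, only two distinct values

# Pretend back-propagation produced these per-connection gradients.
grad_w = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
grad_u = grad_free_params(K, grad_w)  # one gradient entry per free parameter
```

With a 0/1 matrix like this one, the map reduces to plain weight sharing; a more general `K` would be one way to read "weight space transformation", under the assumption that the transformation is linear.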

Abstract

An interesting property of connectionist systems is their ability to learn from examples. Although most recent work in the field concentrates on reducing learning times, the most important feature of a learning machine is its generalization performance. It is usually accepted that good generalization performance on real-world problems cannot be achieved unless some a priori knowledge about the task is built into the system. Back-propagation networks provide a way of specifying such knowledge by imposing constraints both on the architecture of the network and on its weights. In general, such constraints can be considered as particular transformations of the parameter space. Building a constrained network for image recognition appears to be a feasible task. We describe a small handwritten digit recognition problem and show that, even though the problem is linearly separable, single layer networks exhibit poor generalization performance. Multilayer constrained networks perform very well on this task when organized in a hierarchical structure with shift invariant feature detectors. These results confirm the idea that minimizing the number of free parameters in the network enhances generalization.
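
To make the abstract's two claims concrete (shift-invariant feature detectors, and fewer free parameters), here is a small sketch of a single feature map whose units all share one kernel. The 16x16 input size, the 3x3 kernel, the random kernel values, and the function name `shared_feature_map` are assumptions chosen for illustration and are not taken from the paper. The sketch checks that shifting the input shifts the feature map by the same amount, and compares parameter counts with and without sharing (biases ignored).

```python
import numpy as np

def shared_feature_map(image, kernel):
    """Apply the same kernel at every position (valid cross-correlation)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3))   # one shared feature detector (assumed 3x3)

image = np.zeros((16, 16))         # assumed 16x16 input
image[4:9, 5:8] = 1.0              # a small blob standing in for a pen stroke

# Move the blob 2 pixels down and 3 pixels right; it stays clear of the borders.
shifted = np.roll(image, shift=(2, 3), axis=(0, 1))

fmap = shared_feature_map(image, kernel)
fmap_shifted = shared_feature_map(shifted, kernel)

# The response pattern moves with the input instead of changing shape.
is_equivariant = np.allclose(fmap_shifted, np.roll(fmap, shift=(2, 3), axis=(0, 1)))

# Free parameters needed for one 14x14 feature map (biases ignored).
n_units = fmap.size                       # 196 units
params_shared = kernel.size               # 9: every unit reuses the same kernel
params_untied = n_units * kernel.size     # 1764: local 3x3 windows, no sharing
params_dense = n_units * image.size       # 50176: every unit sees every pixel
```

The counts show why sharing is attractive under these assumptions: the same 196 units need 9 parameters instead of 1764 or 50176, and the detector responds identically wherever the stroke appears.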
