1995

The Wake-Sleep Algorithm for Unsupervised Neural Networks

Geoffrey Hinton, Peter Dayan, B. Frey, R. Neal

citations

Cite Score

49

AI summary

This paper introduces the wake-sleep algorithm, an unsupervised learning method for multilayer networks of stochastic neurons. Bottom-up “recognition” connections and top-down “generative” connections are trained to learn economical representations by minimizing a description-length objective. The algorithm is demonstrated on toy problems and on handwritten digits.

Main Contributions

  • Introduces the wake-sleep algorithm for unsupervised learning in multilayer networks.
  • Uses bottom-up “recognition” and top-down “generative” connections.
  • Minimizes a description length objective to learn economical representations.
  • Demonstrates the algorithm's ability to learn generative models for simple toy problems.
  • Applies the algorithm to handwritten digit recognition, achieving competitive results.

Abstract

An unsupervised learning algorithm for a multilayer network of stochastic neurons is described. Bottom-up “recognition” connections convert the input into representations in successive hidden layers and top-down “generative” connections reconstruct the representation in one layer from the representation in the layer above. In the “wake” phase, neurons are driven by recognition connections, and generative connections are adapted to increase the probability that they would reconstruct the correct activity vector in the layer below. In the “sleep” phase, neurons are driven by generative connections and recognition connections are adapted to increase the probability that they would produce the correct activity vector in the layer above.

Supervised learning algorithms for multilayer neural networks face two problems: They require a teacher to specify the desired output of the network and they require some method of communicating error information to all of the connections. The wake-sleep algorithm avoids both these problems.

When there is no external teaching signal to be matched, some other goal is required to force the hidden units to extract underlying structure. In the wake-sleep algorithm the goal is to learn representations that are economical to describe but allow the input to be reconstructed accurately. We can quantify this goal by imagining a communication game in which each vector of raw sensory inputs is communicated to a receiver by first sending its hidden representation and then sending the difference between the input vector and its top-down reconstruction from the hidden representation. The aim of learning is to minimize the “description length” which is the total number of bits that would be required to communicate the input vectors in this way [1]. No communication actually takes place, but minimizing the description length that would be required forces the network to learn economical representations that capture the underlying regularities in the data [2].
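The two phases described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single hidden layer, binary stochastic units with logistic activation probabilities, and a simple delta-rule update with a fixed learning rate. The names `WakeSleepNet`, `wake`, `sleep`, and `description_length` are hypothetical, and the description-length estimate uses a single sampled hidden vector rather than an exact expected cost.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p, rng):
    """Draw binary states from independent Bernoulli probabilities p."""
    return (rng.random(p.shape) < p).astype(float)

class WakeSleepNet:
    """Hypothetical one-hidden-layer network (an assumption for illustration)."""

    def __init__(self, n_visible, n_hidden, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.lr = lr
        self.R = np.zeros((n_visible, n_hidden))  # bottom-up recognition weights
        self.r_bias = np.zeros(n_hidden)
        self.G = np.zeros((n_hidden, n_visible))  # top-down generative weights
        self.g_bias = np.zeros(n_visible)
        self.h_bias = np.zeros(n_hidden)          # generative biases of the top layer

    def wake(self, v):
        """Drive units bottom-up, then adapt the generative connections."""
        h = sample(sigmoid(v @ self.R + self.r_bias), self.rng)
        v_pred = sigmoid(h @ self.G + self.g_bias)
        # Delta rule: make the generative model more likely to reconstruct v from h.
        self.G += self.lr * np.outer(h, v - v_pred)
        self.g_bias += self.lr * (v - v_pred)
        self.h_bias += self.lr * (h - sigmoid(self.h_bias))

    def sleep(self):
        """Drive units top-down, then adapt the recognition connections."""
        h = sample(sigmoid(self.h_bias), self.rng)
        v = sample(sigmoid(h @ self.G + self.g_bias), self.rng)
        h_pred = sigmoid(v @ self.R + self.r_bias)
        # Delta rule: make recognition more likely to recover h from the fantasy v.
        self.R += self.lr * np.outer(v, h - h_pred)
        self.r_bias += self.lr * (h - h_pred)

    def description_length(self, v, eps=1e-9):
        """Bits to send one sampled hidden code plus v given that code."""
        h = sample(sigmoid(v @ self.R + self.r_bias), self.rng)
        p_h = np.clip(sigmoid(self.h_bias), eps, 1 - eps)
        p_v = np.clip(sigmoid(h @ self.G + self.g_bias), eps, 1 - eps)
        bits_h = -np.sum(h * np.log2(p_h) + (1 - h) * np.log2(1 - p_h))
        bits_v = -np.sum(v * np.log2(p_v) + (1 - v) * np.log2(1 - p_v))
        return bits_h + bits_v
```

A training loop under these assumptions would alternate `net.wake(v)` for each input vector `v` with a call to `net.sleep()`. Both updates use only locally available quantities (the states of the two units a weight connects and the predicted probability), which is the sense in which no separate error signal has to be communicated through the network.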


References [9]


J. Rissanen - 1989

M. Jordan, D. Rumelhart - 1992

G. Carpenter, S. Grossberg - 1987

Geoffrey Hinton, R. Zemel, J. Cowan, Gerald Tesauro, J. Alspector - 1994

S. Ullman, C. Koch, J. Davis - 1994

M. Kawato, H. Hayakawa, T. Inui - 1993

Peter Dayan, Geoffrey Hinton, R. Neal, R. Zemel - year missing

M. Hasselmo, J. Bower - 1993
