1995

The Wake-Sleep Algorithm for Unsupervised Neural Networks

Geoffrey Hinton, Peter Dayan, B. Frey, R. Neal

AI summary

This paper introduces the wake-sleep algorithm, an unsupervised learning method for multilayer networks using stochastic neurons; it uses “recognition” and “generative” connections to learn economical representations by minimizing a description length objective, demonstrating capabilities on toy problems and handwritten digit recognition.

Main Contributions

Introduces the wake-sleep algorithm for unsupervised learning in multilayer networks.
Uses bottom-up "recognition" and top-down "generative" connections.
Minimizes a description length objective to learn economical representations.
Demonstrates the algorithm's ability to learn generative models for simple toy problems.
Applies the algorithm to handwritten digit recognition, achieving competitive results.
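
The description length objective named in the list above can be written out under the standard Shannon-coding assumption that an event of probability p costs -log2(p) bits. The symbols below (d for an input vector, α for its hidden representation, p for probabilities under the generative model) are notation assumed here for illustration; this is a sketch of the two-part cost, not a formula quoted from the page.

```latex
\[
  C(\alpha, d)
  \;=\;
  \underbrace{-\log_2 p(\alpha)}_{\text{bits to send the hidden representation}}
  \;+\;
  \underbrace{-\log_2 p(d \mid \alpha)}_{\text{bits to send the input given its top-down reconstruction}}
\]
```

A representation is "economical" in this sense when both terms are small: the hidden code is cheap to send, and it predicts the input well enough that little remains to be sent afterwards.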

Abstract

An unsupervised learning algorithm for a multilayer network of stochastic neurons is described. Bottom-up “recognition” connections convert the input into representations in successive hidden layers and top-down “generative” connections reconstruct the representation in one layer from the representation in the layer above. In the “wake” phase, neurons are driven by recognition connections, and generative connections are adapted to increase the probability that they would reconstruct the correct activity vector in the layer below. In the “sleep” phase, neurons are driven by generative connections and recognition connections are adapted to increase the probability that they would produce the correct activity vector in the layer above.

Supervised learning algorithms for multilayer neural networks face two problems: They require a teacher to specify the desired output of the network and they require some method of communicating error information to all of the connections. The wake-sleep algorithm avoids both these problems.

When there is no external teaching signal to be matched, some other goal is required to force the hidden units to extract underlying structure. In the wake-sleep algorithm the goal is to learn representations that are economical to describe but allow the input to be reconstructed accurately. We can quantify this goal by imagining a communication game in which each vector of raw sensory inputs is communicated to a receiver by first sending its hidden representation and then sending the difference between the input vector and its top-down reconstruction from the hidden representation. The aim of learning is to minimize the “description length” which is the total number of bits that would be required to communicate the input vectors in this way [1]. No communication actually takes place, but minimizing the description length that would be required forces the network to learn economical representations that capture the underlying regularities in the data [2].
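
The wake and sleep phases described above can be sketched for a network with a single hidden layer. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the zero initialization, the learning rate, the toy data, and the use of a simple delta rule on logistic units are all choices made here. The `description_length` helper measures the naive coding cost of one sampled hidden representation plus the cost of the input given its top-down reconstruction.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample(p, rng):
    """Draw binary states: each unit is 1 with probability p."""
    return (rng.random(p.shape) < p).astype(float)


def init_params(n_visible, n_hidden):
    # All names here are hypothetical; zero init keeps the example deterministic.
    return {
        "R": np.zeros((n_visible, n_hidden)),  # recognition weights (bottom-up)
        "r_bias": np.zeros(n_hidden),
        "G": np.zeros((n_hidden, n_visible)),  # generative weights (top-down)
        "g_bias": np.zeros(n_visible),
        "h_bias": np.zeros(n_hidden),          # generative biases of the hidden layer
    }


def wake_sleep_step(v, p, rng, lr=0.05):
    # Wake: recognition connections drive the hidden units; the generative
    # side is nudged toward reconstructing the observed activity below.
    h = sample(sigmoid(v @ p["R"] + p["r_bias"]), rng)
    v_pred = sigmoid(h @ p["G"] + p["g_bias"])
    p["G"] += lr * np.outer(h, v - v_pred)
    p["g_bias"] += lr * (v - v_pred)
    p["h_bias"] += lr * (h - sigmoid(p["h_bias"]))

    # Sleep: generative connections produce a "dream"; the recognition side
    # is nudged toward recovering the hidden states that caused it.
    h_dream = sample(sigmoid(p["h_bias"]), rng)
    v_dream = sample(sigmoid(h_dream @ p["G"] + p["g_bias"]), rng)
    h_pred = sigmoid(v_dream @ p["R"] + p["r_bias"])
    p["R"] += lr * np.outer(v_dream, h_dream - h_pred)
    p["r_bias"] += lr * (h_dream - h_pred)


def bernoulli_bits(s, prob):
    """Bits needed to code binary states s under independent probabilities prob."""
    prob = np.clip(prob, 1e-12, 1 - 1e-12)
    return float(-(s * np.log2(prob) + (1 - s) * np.log2(1 - prob)).sum())


def description_length(v, p, rng):
    """Cost of one sampled hidden code plus the cost of v given that code."""
    h = sample(sigmoid(v @ p["R"] + p["r_bias"]), rng)
    hidden_bits = bernoulli_bits(h, sigmoid(p["h_bias"]))
    input_bits = bernoulli_bits(v, sigmoid(h @ p["G"] + p["g_bias"]))
    return hidden_bits + input_bits


# Toy data (an assumption for illustration): two 6-bit patterns, 10 copies each.
rng = np.random.default_rng(0)
data = np.array([[1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 0, 0]] * 10, dtype=float)
params = init_params(n_visible=6, n_hidden=4)

# Before training every unit has probability 0.5, so each costs exactly 1 bit.
untrained_bits = description_length(data[0], params, rng)  # 10.0

for _ in range(200):
    for v in data[rng.permutation(len(data))]:
        wake_sleep_step(v, params, rng)

trained_bits = np.mean([description_length(v, params, rng) for v in data])
```

Both phases use only locally available quantities (the activity of the two units a connection joins and the prediction for the receiving unit), which is how the sketch reflects the claim that no separate channel for error information is needed. Comparing `trained_bits` with `untrained_bits` gives a rough check that training shortens the description of the toy data.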

References

[1] Stochastic Complexity in Statistical Inquiry
J. Rissanen - 1989

[2] Cognitive Science
M. Jordan, D. Rumelhart - 1992

[3] Computer Vision, Graphics and Image Processing
G. Carpenter, S. Grossberg - 1987

[4] In Advances in Neural Information Processing Systems
Geoffrey Hinton, R. Zemel, J. Cowan, Gerald Tesauro, J. Alspector - 1994

[5] In Large-Scale Theories of the Cortex
S. Ullman, C. Koch, J. Davis - 1994

[6] Lectures in Pattern Theory I, II and III: Pattern Analysis, Pattern Synthesis and Regular Structures
U. Grenander

[7] Network
M. Kawato, H. Hayakawa, T. Inui - 1993

[8] Neural Computation
Peter Dayan, Geoffrey Hinton, R. Neal, R. Zemel

[9] Trends in Neurosciences
M. Hasselmo, J. Bower - 1993

Read on August 8, 2025

It's okay... I get the feeling that this is early autoencoder work, from before the term existed. I don't think it adds anything new, though.