Papperoni

1999

Products of Experts

Geoffrey E. Hinton

AI summary

This paper introduces Products of Experts (PoE), a method that combines multiple probabilistic models by multiplying their probabilities and renormalizing. The product can model high-dimensional data that satisfies many low-dimensional constraints at once, and it can be much sharper than any individual expert. Training uses Gibbs sampling to estimate the otherwise intractable derivative of the log probability of the data.

Main Contributions

Introduces Products of Experts (PoE), a method for combining multiple probabilistic models.
Demonstrates that PoE can produce much sharper distributions than individual expert models.
Proposes an efficient way to train a product of models using Gibbs sampling.
Shows that PoE can effectively model data distributions that can be factorized into a product of lower dimensional distributions.
Explains why one Gibbs iteration works by showing that the PoE learning algorithm can start from a logarithmic opinion pool of sensible experts.
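
For reference, the combination rule and the quantity that Gibbs sampling estimates can be written out as follows. This is a sketch in notation assumed here rather than copied from the page: d is a data vector, c ranges over all possible data vectors, and θ_m are the parameters of expert m.

```latex
% Product of experts: multiply the experts' probabilities, then renormalize.
p(d \mid \theta_1, \dots, \theta_n)
  = \frac{\prod_m p_m(d \mid \theta_m)}{\sum_c \prod_m p_m(c \mid \theta_m)}

% Gradient of the log probability of a data vector with respect to expert m.
\frac{\partial \log p(d \mid \theta_1, \dots, \theta_n)}{\partial \theta_m}
  = \frac{\partial \log p_m(d \mid \theta_m)}{\partial \theta_m}
  - \sum_c p(c \mid \theta_1, \dots, \theta_n)\,
    \frac{\partial \log p_m(c \mid \theta_m)}{\partial \theta_m}
```

The first term depends only on the individual expert and is easy to compute when the expert is tractable. The second term is an expectation under the product distribution itself, which is the part that has to be approximated with samples.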

Abstract

It is possible to combine multiple probabilistic models of the same data by multiplying the probabilities together and then renormalizing. This is a very efficient way to model high-dimensional data which simultaneously satisfies many different low-dimensional constraints. Each individual expert model can focus on giving high probability to data vectors that satisfy just one of the constraints. Data vectors that satisfy this one constraint but violate other constraints will be ruled out by their low probability under the other expert models. Training a product of models appears difficult because, in addition to maximizing the probabilities that the individual models assign to the observed data, it is necessary to make the models disagree on unobserved regions of the data space: It is fine for one model to assign a high probability to an unobserved region as long as some other model assigns it a very low probability. Fortunately, if the individual models are tractable there is a fairly efficient way to train a product of models. This training algorithm suggests a biologically plausible way of learning neural population codes.
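
The multiply-and-renormalize step described in the abstract can be illustrated on a tiny discrete example. The sketch below is not from the paper: the two experts, their probabilities, and the helper names (`product_of_experts`, `entropy`) are hypothetical and chosen only to show that the product concentrates mass on the state both experts favor.

```python
import math


def product_of_experts(experts):
    """Combine distributions over the same discrete states by
    multiplying their probabilities state by state and renormalizing."""
    n_states = len(experts[0])
    unnormalized = [math.prod(e[i] for e in experts) for i in range(n_states)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def entropy(p):
    """Shannon entropy in nats; lower means a sharper distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Hypothetical toy experts over four states (illustrative values only).
expert_a = [0.4, 0.4, 0.1, 0.1]  # favors states 0 and 1
expert_b = [0.4, 0.1, 0.4, 0.1]  # favors states 0 and 2

poe = product_of_experts([expert_a, expert_b])

print("product:", [round(x, 2) for x in poe])
print("entropy of expert A:", round(entropy(expert_a), 3))
print("entropy of product: ", round(entropy(poe), 3))
```

State 0 satisfies both experts and ends up with most of the mass, while states 1 and 2, each favored by only one expert, are suppressed by the other. Summing over all states to renormalize is only feasible in a toy setting like this one; in high dimensions that sum is what makes training hard.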

References [6]

[1]The Wake-Sleep Algorithm for Unsupervised Neural Networks

Geoffrey Hinton, Peter Dayan, B. Frey, R. Neal - 1995

9 papers in library cite

It's okay... I get the feeling that this is early autoencoder work, from before the term existed. I don't think it adds nothing new though.

[2]Learning and Relearning in Boltzmann Machines

Geoffrey E. Hinton, T. J. Sejnowski - 1986

9 papers in library cite

37 pages; Introduced Boltzmann machines

[3]Learning Continuous Attractors in Recurrent Networks

S. H. Seung - 1998

5 papers in library cite

[4]Combining Probability Distributions: A Critique and an Annotated Bibliography

C. Genest, J. V. Zidek - 1986

3 papers in library cite

[5]Learning Representations by Recirculation

Geoffrey E. Hinton, J. L. McClelland - 1988

3 papers in library cite

[6]Mean Field Theory for Sigmoid Belief Networks

L. Saul, T. Jaakkola, M. Jordan - 1996

3 papers in library cite

Read on June 23, 2025

Very interesting concept. This is what later paved the way for RBMs and DBNs.