2003

Quick Training of Probabilistic Neural Nets by Importance Sampling

Yoshua Bengio, Jean-Sébastien Senécal

AI summary

This paper introduces an approach for training probabilistic neural networks for statistical language modeling. The networks themselves address the curse of dimensionality; importance sampling is used to reduce the number of network passes needed per training example. This speeds up training considerably while achieving performance comparable to the full-gradient model on the Brown corpus.

Main Contributions

  • Introduces and evaluates sampling-based methods, inspired by contrastive divergence, for training probabilistic neural networks; they require network passes only for the observed "positive example" and a few sampled negative example words.
  • Proposes and evaluates an adaptive importance sampling technique that dynamically adjusts the sample size based on a diagnostic, optimizing the trade-off between bias and variance.
  • Achieves a 15-fold speed-up in training time compared to ordinary gradient computation on the Brown corpus.
  • Demonstrates comparable training and test performance to the full-gradient model, surpassing the interpolated trigram test performance.
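
The adaptive sample size mentioned in the contributions above can be illustrated with a small sketch. The summary only says that "a diagnostic" drives the sample size; the sketch below assumes the diagnostic is the effective sample size (ESS) of the importance weights, which is a common choice but is not confirmed by this page. The names `effective_sample_size`, `adaptive_sample`, `target_ess`, `block_size`, and `max_samples` are hypothetical, and `score` stands in for one network pass returning an unnormalized log-score.

```python
import math
import random


def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum w^2; equals len(weights) when all weights are equal."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def adaptive_sample(score, proposal, target_ess, block_size, max_samples, rng):
    """Draw negative samples in blocks until the ESS diagnostic is satisfied.

    score:    callable word_id -> unnormalized log-score (assumed: one network pass)
    proposal: list of proposal probabilities Q(w), all strictly positive
    Returns (samples, weights). The loop may overshoot max_samples by
    at most one block, since the cap is checked before each block.
    """
    vocab = range(len(proposal))
    samples, weights = [], []
    while len(samples) < max_samples:
        block = rng.choices(vocab, weights=proposal, k=block_size)
        for w in block:
            samples.append(w)
            # Importance weight: unnormalized model probability over proposal probability.
            weights.append(math.exp(score(w)) / proposal[w])
        if effective_sample_size(weights) >= target_ess:
            break
    return samples, weights
```

When the proposal matches the model well, the weights are nearly equal and the loop stops after few blocks; a poor proposal yields skewed weights, a low ESS, and therefore more samples.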

Abstract

Our previous work on statistical language modeling introduced the use of probabilistic feedforward neural networks to help deal with the curse of dimensionality. Training this model by maximum likelihood, however, requires performing, for each example, as many network passes as there are words in the vocabulary. Inspired by the contrastive divergence model, we propose and evaluate sampling-based methods which require network passes only for the observed "positive example" and a few sampled negative example words. A very significant speed-up is obtained with adaptive importance sampling.
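
To make the idea in the abstract concrete, here is a minimal sketch of a sampled gradient estimate. It assumes a softmax output, so that the gradient of log P(target) with respect to the score of word w is 1[w = target] − P(w), and it approximates the P(w) term with self-normalized importance sampling from a proposal Q. This is an illustration under those assumptions, not the authors' exact estimator; `sampled_score_gradient`, `score`, and `proposal` are hypothetical names, and `score` stands in for one network pass.

```python
import math
import random


def sampled_score_gradient(score, target, proposal, num_samples, rng):
    """Estimate d log P(target) / d score(w) without scoring the whole vocabulary.

    score:    callable word_id -> unnormalized log-score (assumed: one network pass)
    proposal: list of proposal probabilities Q(w), all strictly positive
    Returns a dict word_id -> gradient estimate; only the target and the
    sampled words appear, so only those words need a network pass.
    """
    vocab = range(len(proposal))
    samples = rng.choices(vocab, weights=proposal, k=num_samples)

    # Cache scores so each distinct sampled word costs a single pass.
    cache = {}
    for w in samples:
        if w not in cache:
            cache[w] = score(w)

    # Log importance weights, shifted by their maximum for numerical stability.
    log_w = [cache[w] - math.log(proposal[w]) for w in samples]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)

    grad = {target: 1.0}  # positive phase: the observed word
    for w, r in zip(samples, weights):
        # Negative phase: normalized weights approximate P(w).
        grad[w] = grad.get(w, 0.0) - r / total
    return grad
```

Because the weights are normalized by their own sum, the estimate is biased for small sample sizes but needs no partition function, which is the quantity that would otherwise require a pass for every vocabulary word.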
