2008

Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model

Yoshua Bengio, Jean-Sébastien Senécal


AI summary

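This paper introduces adaptive importance sampling to accelerate the training of neural probabilistic language models, whose maximum-likelihood training otherwise requires computation proportional to the vocabulary size. An adaptive n-gram model tracks the conditional distributions produced by the neural network and serves as the sampling distribution. The authors report a very significant speedup on standard problems.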

Main Contributions

  • Introduced adaptive importance sampling to accelerate training of a neural probabilistic language model.
  • Used an adaptive n-gram model to track the conditional distributions produced by the neural network.
  • Achieved a very significant speedup on standard problems.

Abstract

Previous work on statistical language modeling has shown that it is possible to train a feedforward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum-likelihood criterion requires computations proportional to the number of words in the vocabulary. In this paper, we introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speedup can be obtained on standard problems.
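The cost argument in the abstract can be illustrated with a small sketch. The gradient of a softmax log-likelihood contains an expectation over the whole vocabulary, and importance sampling replaces that sum with a weighted average over a few words drawn from a cheaper proposal distribution. The code below is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the proposal is a fixed distribution standing in for the adaptive n-gram model, and the estimator shown is the generic self-normalized (biased) form.

```python
import math
import random


def softmax(scores):
    """Normalize scores into probabilities, shifting by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def exact_expectation(scores, values):
    """E_P[values] under the full softmax P; touches every vocabulary entry."""
    return sum(p * v for p, v in zip(softmax(scores), values))


def sampled_expectation(score_fn, value_fn, proposal, n_samples, rng):
    """Self-normalized importance-sampling estimate of E_P[values].

    Only the sampled words are scored, so the cost scales with n_samples
    rather than with the vocabulary size. `proposal` is an assumed stand-in
    for the adaptive n-gram model described in the abstract.
    """
    words = rng.choices(range(len(proposal)), weights=proposal, k=n_samples)
    # Unnormalized importance weights exp(score) / q, kept in log space.
    log_w = [score_fn(w) - math.log(proposal[w]) for w in words]
    m = max(log_w)
    weights = [math.exp(lw - m) for lw in log_w]
    total = sum(weights)
    return sum(wt * value_fn(w) for wt, w in zip(weights, words)) / total


if __name__ == "__main__":
    rng = random.Random(0)
    vocab_size = 20  # toy vocabulary; real vocabularies are far larger
    scores = [rng.gauss(0, 1) for _ in range(vocab_size)]
    values = [rng.gauss(0, 1) for _ in range(vocab_size)]  # one gradient coordinate per word
    uniform = [1.0 / vocab_size] * vocab_size
    print("exact  :", exact_expectation(scores, values))
    print("sampled:", sampled_expectation(
        lambda w: scores[w], lambda w: values[w], uniform, 5000, rng))
```

The closer the proposal is to the network's own distribution, the more even the importance weights and the fewer samples are needed, which is presumably why the abstract has the n-gram model adapt to track the network during training.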


