2008
Cite Score
14
AI summary
This paper introduces adaptive importance sampling using an n-gram model to track conditional distributions, in order to accelerate the training of neural probabilistic language models. The approach achieved a significant speedup on standard problems.
Abstract
Previous work on statistical language modeling has shown that it is possible to train a feedforward neural network to approximate probabilities over sequences of words, resulting in significant error reduction when compared to standard baseline models based on n-grams. However, training the neural network model with the maximum-likelihood criterion requires computations proportional to the number of words in the vocabulary. In this paper, we introduce adaptive importance sampling as a way to accelerate training of the model. The idea is to use an adaptive n-gram model to track the conditional distributions produced by the neural network. We show that a very significant speedup can be obtained on standard problems.
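The cost issue described in the abstract can be illustrated with a small sketch. A softmax output layer needs a sum over every word in the vocabulary, while an importance-sampling estimate only touches a handful of words drawn from a cheaper proposal distribution. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the names `scores`, `proposal`, and `g` are hypothetical, the proposal is a fixed list of probabilities rather than an adaptive n-gram model, and the self-normalized estimator shown is one common form of importance sampling.

```python
import math
import random


def exact_softmax_expectation(scores, g):
    """Expectation of g(j) under softmax(scores); cost grows with vocabulary size."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return sum(e / z * g(j) for j, e in enumerate(exps))


def sampled_softmax_expectation(scores, proposal, g, n_samples, rng):
    """Self-normalized importance-sampling estimate of the same expectation.

    Words are drawn from `proposal` (assumed to be a list of probabilities
    that is positive wherever the softmax is positive), and only the sampled
    words' scores are used.
    """
    vocab = list(range(len(scores)))
    samples = rng.choices(vocab, weights=proposal, k=n_samples)
    m = max(scores[j] for j in samples)
    # Importance weight: unnormalized model probability over proposal probability.
    weights = [math.exp(scores[j] - m) / proposal[j] for j in samples]
    total = sum(weights)
    return sum(w * g(j) for w, j in zip(weights, samples)) / total


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.5, 0.0, -1.0]  # toy network outputs for 5 words
    uniform = [0.2] * 5                  # toy stand-in for a proposal model
    rng = random.Random(0)
    print(exact_softmax_expectation(scores, float))
    print(sampled_softmax_expectation(scores, uniform, float, 2000, rng))
```

In this sketch the quality of the estimate depends on how closely the proposal matches the softmax distribution, which is the motivation the abstract gives for letting the n-gram proposal adapt to track the network during training.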