2005
Abstract
In recent years, variants of a neural network architecture for statistical language modeling have been proposed and successfully applied, e.g. in the language modeling component of speech recognizers. The main advantage of these architectures is that they learn an embedding for words (or other symbols) in a continuous space, which helps to smooth the language model and provides good generalization even when the number of training examples is insufficient. However, these models are extremely slow in comparison to the more commonly used n-gram models, both for training and for recognition. As an alternative to an importance sampling method proposed to speed up training, we introduce a hierarchical decomposition of the conditional probabilities that yields a speed-up of roughly a factor of 200 during both training and recognition. The hierarchical decomposition is a binary hierarchical clustering constrained by prior knowledge extracted from the WordNet semantic hierarchy.
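The hierarchical decomposition described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a complete binary tree whose leaves are word ids (the paper instead derives the tree from WordNet), a fixed context vector standing in for the neural network's hidden representation, and one logistic unit per internal node. All names (`word_path`, `word_probability`, `node_weights`) are hypothetical. The probability of a word is the product of the binary decisions along its root-to-leaf path, so scoring one word touches about log2(V) nodes rather than all V outputs of a flat softmax.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def word_path(word_id, depth):
    """Return [(internal_node, bit), ...] from the root down to the word's leaf.

    Assumption: a complete binary tree with heap-style numbering
    (root = 1, children of node n are 2n and 2n + 1). The bits of
    word_id, most significant first, select the branch at each level.
    """
    path = []
    node = 1
    for level in reversed(range(depth)):
        bit = (word_id >> level) & 1
        path.append((node, bit))
        node = 2 * node + bit
    return path


def word_probability(word_id, context, node_weights, depth):
    """P(word | context) as a product of binary decisions along the path."""
    prob = 1.0
    for node, bit in word_path(word_id, depth):
        score = sum(w * c for w, c in zip(node_weights[node], context))
        p_right = sigmoid(score)  # probability of taking the "1" branch
        prob *= p_right if bit else 1.0 - p_right
    return prob


random.seed(0)
depth = 4                 # toy vocabulary of 2**4 = 16 words
dim = 3                   # size of the (hypothetical) context vector
vocab_size = 2 ** depth

# One weight vector per internal node (ids 1 .. vocab_size - 1).
node_weights = {
    n: [random.gauss(0.0, 1.0) for _ in range(dim)]
    for n in range(1, vocab_size)
}
context = [0.5, -1.0, 0.25]

probs = [word_probability(w, context, node_weights, depth)
         for w in range(vocab_size)]
total = sum(probs)  # equals 1 up to floating-point rounding
print(f"nodes evaluated per word: {depth}, total probability: {total:.6f}")
```

Because each internal node splits its probability mass between its two children, the leaf probabilities sum to one without an explicit normalization over the vocabulary, which is where the saving over a flat output layer comes from.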