2012

LSTM Neural Networks for Language Modeling

M. Sundermeyer, R. Schlüter, Hermann Ney

citations

Cite Score

63

AI summary

This paper analyzes Long Short-Term Memory (LSTM) neural networks for language modeling on an English and a large French task, reporting a relative perplexity improvement of about 8% over standard recurrent neural network LMs and considerable word error rate (WER) improvements on top of a state-of-the-art speech recognition system.

Main Contributions

  • Introduced LSTMs to the field of language modeling.
  • Analyzed the effectiveness of LSTMs on an English and a large French corpus in terms of perplexity and word error rate.
  • Investigated techniques for reducing the training time of LSTMs.
  • Compared different neural network LM architectures.
  • Demonstrated that LSTM networks can be combined with existing clustering techniques to gain large speedups in training and testing at a small loss in performance.
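
The summary does not say which clustering scheme the paper uses. One common form of the idea, shown here only as a minimal sketch, is to factor the output layer through word classes: P(w | h) = P(c(w) | h) · P(w | c(w), h). Normalization then runs over the classes plus the words of a single class instead of the whole vocabulary. All names, scores, and the class assignment below are invented for illustration and are not taken from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def factorized_word_prob(word, word_to_class, class_members,
                         class_scores, word_scores):
    """Return P(word | h) = P(class | h) * P(word | class, h).

    class_scores and word_scores stand in for the output-layer
    activations of a network; here they are hypothetical constants.
    """
    c = word_to_class[word]
    p_class = softmax(class_scores)[c]
    members = class_members[c]
    # Normalize only over the words that share the class of `word`.
    in_class = softmax([word_scores[w] for w in members])
    return p_class * in_class[members.index(word)]

# Toy vocabulary with a made-up assignment of words to two classes.
class_members = [["the", "a"], ["cat", "dog", "bird"]]
word_to_class = {w: c for c, ws in enumerate(class_members) for w in ws}
class_scores = [0.5, 1.5]
word_scores = {"the": 2.0, "a": 1.0, "cat": 0.3, "dog": 0.1, "bird": -0.4}

probs = {
    w: factorized_word_prob(w, word_to_class, class_members,
                            class_scores, word_scores)
    for w in word_to_class
}
total = sum(probs.values())
```

Because each factor is a proper distribution, the word probabilities still sum to one over the vocabulary, so perplexities computed with the factorized layer remain comparable to those of a full softmax.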

Abstract

Neural networks have become increasingly popular for the task of language modeling. Whereas feed-forward networks only exploit a fixed context length to predict the next word of a sequence, conceptually, standard recurrent neural networks can take into account all of the predecessor words. On the other hand, it is well known that recurrent networks are difficult to train and therefore are unlikely to show the full potential of recurrent models. These problems are addressed by the Long Short-Term Memory neural network architecture. In this work, we analyze this type of network on an English and a large French language modeling task. Experiments show improvements of about 8% relative in perplexity over standard recurrent neural network LMs. In addition, we gain considerable improvements in WER on top of a state-of-the-art speech recognition system.
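
To make the "8% relative in perplexity" figure concrete, the following minimal sketch computes perplexity from per-word probabilities and the relative improvement between two perplexities. The function names and the numbers 100 and 92 are illustrative assumptions, not values reported in the paper.

```python
import math

def perplexity(word_probs):
    """Perplexity: exp of the average negative log-probability per word."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)

def relative_improvement(baseline_ppl, new_ppl):
    """Fraction by which new_ppl is lower than baseline_ppl."""
    return (baseline_ppl - new_ppl) / baseline_ppl

# A model that assigns probability 1/4 to every word is as uncertain
# as a uniform choice among four words.
ppl_uniform = perplexity([0.25] * 4)

# Hypothetical perplexities chosen only to illustrate an 8% relative gain.
gain = relative_improvement(100.0, 92.0)
```

A relative improvement is measured against the baseline perplexity, so the same absolute reduction counts for more when the baseline is already low.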


