Papperoni

1995

Improved Backing-Off for M-Gram Language Modeling

R. Kneser, Hermann Ney

AI summary

This paper introduces backing-off distributions optimized specifically for M-gram language modeling. Two different theoretical derivations yield these distributions, and experiments on the Verbmobil and Wall-Street-Journal corpora show improvements of about 10% in perplexity and 5% in word error rate.

Main Contributions

Proposes novel backing-off distributions optimized for language modeling.
Presents two theoretical derivations leading to distinct distributions.
Demonstrates improvements of about 10% in perplexity and 5% in word error rate compared to the distributions usually used for backing-off.
Evaluates the approach on the Verbmobil and Wall-Street-Journal corpora.
Introduces singleton and marginal constraint distributions for backing-off.
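
As a rough illustration of the last item, the sketch below computes three candidate backing-off distributions for a bigram model from raw counts: the usual relative frequencies, a distribution based on the number of distinct predecessor words (the form commonly associated with the marginal constraint derivation), and one based on singleton counts. The function name `backoff_distributions`, the toy sentence, and the exact count definitions are assumptions made for illustration, not formulas quoted from the paper.

```python
from collections import Counter


def backoff_distributions(tokens):
    """Return three candidate unigram backing-off distributions (illustrative)."""
    bigrams = Counter(zip(tokens, tokens[1:]))

    freq = Counter()          # how often w occurs as the second word of a bigram
    predecessors = Counter()  # in how many distinct contexts v the pair (v, w) was seen
    singletons = Counter()    # in how many contexts v the pair (v, w) was seen exactly once
    for (v, w), count in bigrams.items():
        freq[w] += count
        predecessors[w] += 1
        if count == 1:
            singletons[w] += 1

    def normalize(counts):
        # Assumes at least one non-zero count; a real implementation would guard this.
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}

    return normalize(freq), normalize(predecessors), normalize(singletons)


# Hypothetical toy text: "york" is frequent but only ever follows "new".
tokens = "new york is big and new york is old and san francisco is new".split()
rel_freq, distinct_pred, singleton = backoff_distributions(tokens)
```

In this toy data "york" is frequent but only ever follows "new", so the predecessor-based distribution gives it less backing-off mass than the relative frequency does, and the singleton-based one gives it none.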

Abstract

In stochastic language modeling, backing-off is a widely used method to cope with the sparse data problem. In case of unseen events this method backs off to a less specific distribution. In this paper we propose to use distributions which are especially optimized for the task of backing-off. Two different theoretical derivations lead to distributions which are quite different from the probability distributions that are usually used for backing off. Experiments show an improvement of about 10% in terms of perplexity and 5% in terms of word error rate.
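
To make the backing-off mechanism concrete, here is a minimal sketch of a bigram model with absolute discounting: seen bigrams keep a discounted relative frequency, and the freed probability mass is spread over unseen words in proportion to a backing-off distribution `beta`. The discount `d = 0.75`, the uniform `beta`, the toy text, and the function names are all assumptions for illustration. The paper's contribution concerns how `beta` is chosen, which this sketch leaves as a parameter.

```python
import math
from collections import Counter, defaultdict


def train_backoff_bigram(tokens, beta, d=0.75):
    """Return prob(w, v) = p(w | v) using absolute discounting and backing-off.

    `beta` maps each vocabulary word to its backing-off probability.
    `d` is an assumed discount value, not one taken from the paper.
    """
    bigram = defaultdict(Counter)
    for v, w in zip(tokens, tokens[1:]):
        bigram[v][w] += 1

    def prob(w, v):
        seen = bigram.get(v)
        if not seen:
            return beta.get(w, 0.0)  # unseen context: use beta directly
        n_v = sum(seen.values())
        if w in seen:
            return (seen[w] - d) / n_v  # discounted relative frequency
        freed = d * len(seen) / n_v  # mass removed from the seen bigrams
        unseen_mass = sum(p for u, p in beta.items() if u not in seen)
        if unseen_mass <= 0:
            return 0.0
        return freed * beta.get(w, 0.0) / unseen_mass  # renormalized over unseen words

    return prob


def perplexity(prob, tokens):
    """Perplexity of the bigram model `prob` on a token sequence."""
    log_sum = sum(math.log(prob(w, v)) for v, w in zip(tokens, tokens[1:]))
    return math.exp(-log_sum / (len(tokens) - 1))


# Hypothetical toy data with a uniform backing-off distribution as a placeholder.
text = "new york is big and new york is old and san francisco is new".split()
vocab = sorted(set(text))
uniform_beta = {w: 1.0 / len(vocab) for w in vocab}
model = train_backoff_bigram(text, uniform_beta)
train_ppl = perplexity(model, text)
```

Swapping `uniform_beta` for a different distribution changes only how the freed mass is shared among unseen words, which is the part of the model the paper sets out to optimize.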

References [7]

[1] Estimation of Probabilities From Sparse Data for the Language Model Component of a Speech Recognizer

S. Katz - 1987

11 papers in library cite

[2] Pattern Classification and Scene Analysis

R. O. Duda, P. E. Hart - 1973

9 papers in library cite

[3] The Population Frequencies of Species and the Estimation of Population Parameters

I. J. Good - 1953

2 papers in library cite

[4] Large Vocabulary Continuous Speech Recognition of Wall Street Journal Data

X. Aubert, C. Dugast, Hermann Ney, V. Steinbiss - 1994

1 paper in library cites

[5] On Structuring Probabilistic Dependences in Stochastic Language Modelling

Hermann Ney, U. Essen, R. Kneser - 1994

1 paper in library cites

[6] Self-Organized Language Modeling for Speech Recognition

Frederick Jelinek - 1991

1 paper in library cites

[7] The Design for the Wall Street Journal-based CSR Corpus

D. B. Paul, J. M. Baker - 1992

1 paper in library cites

Read on June 13, 2025

It's nice and simple, but it doesn't use neural networks and seems very incremental on top of existing backoff methods.