2007

Large Language Models in Machine Translation

Jeffrey Dean

Cite Score

33

AI summary

The paper proposes a distributed infrastructure for training language models on up to 2 trillion tokens and introduces a new smoothing method called Stupid Backoff. It reports improvements in machine translation quality as measured by BLEU score.

Main Contributions

  • Proposes a distributed language model training and deployment infrastructure.
  • Introduces a new smoothing method called Stupid Backoff.
  • Demonstrates that translation quality keeps improving as the language model grows, even at the largest training-set size considered (2 trillion tokens).
  • Builds 5-gram language models containing up to 300 billion n-grams.
  • Shows that Stupid Backoff approaches the quality of Kneser-Ney smoothing as the amount of training data increases.

Abstract

This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 trillion tokens, resulting in language models having up to 300 billion n-grams. It is capable of providing smoothed probabilities for fast, single-pass decoding. We introduce a new smoothing method, dubbed Stupid Backoff, that is inexpensive to train on large data sets and approaches the quality of Kneser-Ney Smoothing as the amount of training data increases.
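
To make the idea behind Stupid Backoff concrete, here is a minimal single-machine sketch in Python. It assumes the usual formulation: use the relative frequency of the longest matching n-gram, and each time the context is shortened, multiply the result by a fixed backoff factor. The function names `count_ngrams` and `stupid_backoff` are hypothetical, the factor 0.4 is an assumption not stated in the abstract, and nothing here reflects the distributed infrastructure the paper describes. Note that the values returned are scores rather than normalized probabilities.

```python
from collections import Counter

ALPHA = 0.4  # backoff factor; an assumed value, not given in the abstract


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff(word, context, counts, total_tokens, alpha=ALPHA):
    """Return an unnormalized score S(word | context).

    Tries the longest context first. Each time the n-gram is unseen,
    the leftmost context word is dropped and the score is scaled by alpha.
    """
    context = tuple(context)
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Seen n-gram: plain relative frequency, no discounting.
            return penalty * counts[ngram] / counts[context]
        context = context[1:]
        penalty *= alpha
    # Base case: unigram relative frequency over the corpus size.
    return penalty * counts[(word,)] / total_tokens


if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()
    counts = count_ngrams(tokens, max_order=3)
    print(stupid_backoff("cat", ["the"], counts, len(tokens)))
    print(stupid_backoff("sat", ["on", "the"], counts, len(tokens)))
```

Because no discounts or normalization constants have to be estimated, training reduces to counting n-grams, which is what makes this kind of scheme cheap on very large corpora.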
