2014

Fast and Robust Neural Network Joint Models for Statistical Machine Translation

Jacob Devlin, Rabih Zbib, Zhongqiang Huang, Thomas Lamar, Richard Schwartz, John Makhoul

Cite Score

28

AI summary

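This paper introduces a neural network joint model (NNJM) that augments a neural network language model (NNLM) with a source context window. It reports gains of +3.0 BLEU over a strong baseline and +6.3 BLEU over a simpler Hiero-style baseline on the NIST OpenMT12 Arabic-English condition, with further improvements on Chinese-English, and describes techniques that speed up NNJM computation by a factor of 10,000.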

Main Contributions

  • Introduced a novel formulation for a neural network joint model (NNJM) that augments an n-gram target language model with an m-word source window.
  • Proposed a novel technique for training the neural network to be self-normalized, which avoids the costly normalization over the entire output vocabulary during decoding.
  • Presented techniques that speed up NNJM computation by a factor of 10,000, making it as fast as a standard back-off LM.
  • Achieved strong empirical results on the NIST OpenMT12 Arabic-English condition: +3.0 BLEU over a strong, feature-rich baseline and +6.3 BLEU over a simpler Hiero-style baseline.
  • Demonstrated strong improvements on the NIST OpenMT12 Chinese-English task and DARPA BOLT Arabic-English and Chinese-English conditions.
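
The first contribution, conditioning on a target n-gram history plus an m-word source window, can be illustrated with a minimal sketch of how such a context might be assembled. Everything named here is hypothetical and not taken from the paper's code: the function `nnjm_context`, the padding symbols, the default sizes, and the `affiliated_index` argument, which assumes that each target word has already been mapped to a single source position (for example via word alignment).

```python
from typing import List, Sequence


def nnjm_context(
    source: Sequence[str],
    target_history: Sequence[str],
    affiliated_index: int,
    n: int = 4,   # target n-gram order (illustrative default)
    m: int = 11,  # source window size (illustrative default)
    bos: str = "<s>",
    src_pad_left: str = "<src>",
    src_pad_right: str = "</src>",
) -> List[str]:
    """Return the n-1 target history words followed by an m-word source window."""
    if m % 2 == 0:
        raise ValueError("m must be odd so the window can be centered")

    # Keep the last n-1 target words, left-padding with a start symbol.
    hist = list(target_history)[-(n - 1):] if n > 1 else []
    hist = [bos] * (n - 1 - len(hist)) + hist

    # Take m source words centered on the assumed affiliated position,
    # padding wherever the window runs past the sentence boundaries.
    half = m // 2
    window = []
    for j in range(affiliated_index - half, affiliated_index + half + 1):
        if j < 0:
            window.append(src_pad_left)
        elif j >= len(source):
            window.append(src_pad_right)
        else:
            window.append(source[j])

    return hist + window
```

In a setup like this, the resulting list of n-1+m tokens would be mapped to embeddings and fed to a feed-forward network that predicts the next target word. Because the input is purely lexical, such a feature could in principle be computed inside any decoder that exposes the source sentence and the partial target hypothesis.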

Abstract

Recent work has shown success in using neural network language models (NNLMs) as features in MT systems. Here, we present a novel formulation for a neural network joint model (NNJM), which augments the NNLM with a source context window. Our model is purely lexicalized and can be integrated into any MT decoder. We also present several variations of the NNJM which provide significant additive improvements. Although the model is quite simple, it yields strong empirical results. On the NIST OpenMT12 Arabic-English condition, the NNJM features produce a gain of +3.0 BLEU on top of a powerful, feature-rich baseline which already includes a target-only NNLM. The NNJM features also produce a gain of +6.3 BLEU on top of a simpler baseline equivalent to Chiang’s (2007) original Hiero implementation. Additionally, we describe two novel techniques for overcoming the historically high cost of using NNLM-style models in MT decoding. These techniques speed up NNJM computation by a factor of 10,000x, making the model as fast as a standard back-off LM.
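
The self-normalized training listed among the contributions can be sketched as follows. The page does not spell out the objective, so this sketch assumes one common form: the usual negative log-likelihood plus a penalty on the squared log-normalizer, which pushes the softmax denominator toward 1. The function names and the weight `alpha` are hypothetical, chosen only for illustration.

```python
import math
from typing import Sequence


def log_normalizer(scores: Sequence[float]) -> float:
    """Numerically stable log of the softmax denominator, log(sum(exp(scores)))."""
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))


def self_normalized_loss(scores: Sequence[float], target: int, alpha: float = 0.1) -> float:
    """Negative log-likelihood plus an assumed penalty alpha * (log Z)^2."""
    log_z = log_normalizer(scores)
    nll = -(scores[target] - log_z)
    return nll + alpha * log_z ** 2


def unnormalized_log_prob(scores: Sequence[float], target: int) -> float:
    """Decode-time approximation: use the raw output score and skip computing Z."""
    return scores[target]
```

If training drives log Z close to 0, the raw output score approximates the log-probability, so a decoder would only need the single output row for the word being scored rather than a sum over the full vocabulary.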

References [29]

Yoshua Bengio, R. Ducharme, Pascal Vincent - 2001

62 papers in library cite

Razvan Pascanu, Tomas Mikolov, Yoshua Bengio - 2013

21 papers in library cite

Tomas Mikolov, M. Karafiat, Lukas Burget, Jan Cernocky, Sanjeev Khudanpur - 2010

36 papers in library cite

Yann Lecun, Leon Bottou, G. B. Orr, Klaus Robert Muller - 1998

20 papers in library cite

Tomas Mikolov, W. T. Yih, Geoffrey Zweig - 2013

8 papers in library cite

R. Kneser, Hermann Ney - 1995

11 papers in library cite

N. Kalchbrenner, Phil Blunsom - 2013

27 papers in library cite

Holger Schwenk - 2012

5 papers in library cite

Holger Schwenk, D. Déchelotte, Jean-Luc Gauvain - 2006

5 papers in library cite

D. M. Blei, Andrew Y. Ng, Michael I. Jordan - 2003

10 papers in library cite

W. Zou, Richard Socher, D. Cer, C. Manning - 2013

4 papers in library cite

R. Rosenfeld - 1996

6 papers in library cite

Ashish Vaswani, Y. Zhao, V. Fossum, D. Chiang - 2013

5 papers in library cite

L. H. Son, A. Allauzen, F. Yvon - 2012

4 papers in library cite

F. J. Och, Hermann Ney - 2003

3 papers in library cite

Michael Auli, M. Galley, C. Quirk, Geoffrey Zweig - 2013

3 papers in library cite

D. Chiang - 2007

2 papers in library cite

D. Chiang, K. Knight, Wei Wang - 2009

1 paper in library cites

M. Tanaka, Y. Toru, J. Y. Yamamoto, M. Norimatsu - 2013

1 paper in library cites

A. Rosti, B. Zhang, S. Matsoukas, Richard Schwartz - 2010

1 paper in library cites

J. M. Crego, F. Yvon - 2010

1 paper in library cites

Zhongqiang Huang, Jacob Devlin, Rabih Zbib - 2013

1 paper in library cites

J. Riesa, A. Irvine, D. Marcu - 2011

1 paper in library cites

M. Snover, B. Dorr, Richard Schwartz - 2008

1 paper in library cites

Jacob Devlin - 2009

1 paper in library cites

N. Habash, R. Roth, O. Rambow, R. Eskander, N. Tomeh - 2013

1 paper in library cites

J. B. Marino, R. E. Banchs, J. M. Crego, A. D. Gispert, P. Lambert, J. A. Fonollosa, M. R. C. Jussa - 2006

1 paper in library cites

L. Shen, Jinxi Xu, R. Weischedel - 2010

1 paper in library cites

Jacob Devlin, S. Matsoukas - 2012

1 paper in library cites

Cited by

9

papers in your library

Cites

11

papers in your library

Read

on February 19, 2026
