2014

Neural Machine Translation by Jointly Learning to Align and Translate

D. Bahdanau, Kyunghyun Cho, Yoshua Bengio

Cite Score: 96

AI summary

This paper extends the encoder-decoder model so that it learns to align and translate jointly, achieving translation performance comparable to the existing state-of-the-art phrase-based system on English-to-French translation. A qualitative analysis shows that the (soft-)alignments found by the model agree well with intuition.

Main Contributions

  • Proposes an extension to the encoder-decoder model that learns to align and translate jointly.
  • Introduces a new architecture consisting of a bidirectional RNN encoder and a decoder that emulates searching through the source sentence while decoding a translation.
  • Achieves translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation.
  • Demonstrates that the alignments found by the model agree well with intuition.
  • Shows that the proposed model (RNNsearch) significantly outperforms the conventional encoder-decoder model (RNNencdec).
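
The architecture in the contributions above can be summarized with a few equations. This is a compact restatement in the paper's usual notation, not a full derivation: h_j is the annotation of source word j (forward and backward encoder states concatenated), s_{i-1} is the previous decoder state, a is a small learned alignment model, and T_x is the source length.

```latex
\begin{align}
h_j &= \left[\overrightarrow{h}_j^{\top};\ \overleftarrow{h}_j^{\top}\right]^{\top}
    && \text{bidirectional annotation of source word } j \\
e_{ij} &= a(s_{i-1}, h_j)
    && \text{alignment score for target position } i \\
\alpha_{ij} &= \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}
    && \text{soft alignment weights (softmax over } j\text{)} \\
c_i &= \sum_{j=1}^{T_x} \alpha_{ij}\, h_j
    && \text{context vector used to predict target word } i
\end{align}
```

Because c_i is recomputed for every target word, the decoder no longer has to rely on a single fixed-length summary of the whole source sentence.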

Abstract

Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and encode a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
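
The soft-search step described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `soft_search`, the toy dimensions, and the hand-picked parameter values are all assumptions, and the scorer is assumed to be a small feedforward network of the form v_a · tanh(W_a s + U_a h).

```python
import math


def soft_search(s_prev, annotations, W_a, U_a, v_a):
    """Return (context, weights) for one decoding step.

    s_prev:      previous decoder state, list of floats
    annotations: one vector per source word (list of lists)
    W_a, U_a:    score-network matrices (lists of rows); hypothetical values
    v_a:         score-network output vector
    """
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    Ws = matvec(W_a, s_prev)  # shared across all source positions

    # Score every source position against the decoder state.
    scores = []
    for h in annotations:
        Uh = matvec(U_a, h)
        hidden = [math.tanh(a + b) for a, b in zip(Ws, Uh)]
        scores.append(sum(v * t for v, t in zip(v_a, hidden)))

    # Softmax over source positions, shifted by the max for stability.
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted average of the annotations.
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations))
               for k in range(dim)]
    return context, weights


# Toy example: three source words, 2-dimensional states (made-up numbers).
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
s_prev = [0.5, -0.5]
W_a = [[0.1, 0.2], [0.3, -0.1]]
U_a = [[0.4, -0.2], [0.1, 0.3]]
v_a = [1.0, -1.0]

context, weights = soft_search(s_prev, annotations, W_a, U_a, v_a)
```

The weights are non-negative and sum to one, so the context vector is a convex combination of the annotations. No hard segment of the source is ever selected, which is what makes the alignment "soft" and lets it be trained jointly with the rest of the network by gradient descent.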
