2014
AI summary
This paper introduces a sequence-to-sequence learning approach using multilayered LSTMs for machine translation, achieving a BLEU score of 34.8 on the WMT'14 English to French translation task. Reversing the order of words in source sentences improves performance, and the LSTM model also learns meaningful sentence representations.
Abstract
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increased to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
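The source-reversal step described in the abstract can be illustrated with a minimal preprocessing sketch. It assumes sentences are already tokenized into lists of strings; the function name `reverse_source` and the toy sentence pair are hypothetical and are not taken from the paper's code.

```python
from typing import List, Tuple

# A sentence pair: (source tokens, target tokens).
Pair = Tuple[List[str], List[str]]


def reverse_source(pairs: List[Pair]) -> List[Pair]:
    """Reverse the word order of each source sentence only.

    Target sentences are copied unchanged, matching the abstract's
    description ("all source sentences (but not target sentences)").
    """
    return [(list(reversed(src)), list(tgt)) for src, tgt in pairs]


# Toy example (hypothetical data, not from WMT'14).
pairs = [(["the", "cat", "sat"], ["le", "chat", "s'est", "assis"])]
prepared = reverse_source(pairs)

# After reversal, the first source word ("the") is the last token the
# encoder reads, so it sits close to the first target word ("le").
print(prepared[0][0])  # ['sat', 'cat', 'the']
print(prepared[0][1])  # ['le', 'chat', "s'est", 'assis']
```

The sketch only covers data preparation. The encoder and decoder LSTMs would consume `prepared` in place of the original pairs.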