2017

Unsupervised Neural Machine Translation

M. Artetxe, G. Labaka, E. Agirre, Kyunghyun Cho

Cite Score

37

AI summary

The paper introduces a method for training NMT systems without any parallel data. It trains a slightly modified attentional encoder-decoder model on monolingual corpora alone, combining denoising with backtranslation. The system obtains 15.56 BLEU on WMT 2014 French → English and 10.21 BLEU on WMT 2014 German → English.

Main Contributions

  • Introduces a novel method to train NMT systems in a completely unsupervised manner, relying solely on monolingual corpora.
  • Employs a modified attentional encoder-decoder model with a shared encoder and fixed cross-lingual embeddings.
  • Uses a combination of denoising and on-the-fly backtranslation for training.
  • Achieves 15.56 BLEU points in WMT 2014 French → English translation and 10.21 BLEU points in German → English translation using only monolingual data.
  • Demonstrates that the model can also benefit from small parallel corpora, reaching 21.81 BLEU points (French → English) and 15.24 BLEU points (German → English) when combined with 100,000 parallel sentences.
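
The denoising objective listed above can be illustrated with a small sketch. The summary does not say what kind of noise is applied, so the choice here is an assumption made for illustration: the input is corrupted by swapping randomly chosen adjacent words, with one swap for every two tokens, and the model would be trained to recover the original sentence. The names `swap_noise` and `denoising_pair` are hypothetical and are not taken from the authors' released code.

```python
import random


def swap_noise(tokens, rng=None):
    """Return a corrupted copy of `tokens` (assumed noise model).

    Performs len(tokens) // 2 swaps, each between a randomly chosen
    pair of adjacent positions. The input list is not modified.
    """
    rng = rng or random.Random()
    noisy = list(tokens)
    for _ in range(len(noisy) // 2):
        i = rng.randrange(len(noisy) - 1)  # left index of the pair to swap
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy


def denoising_pair(sentence, rng=None):
    """Build one (noisy input, clean target) pair from a monolingual sentence."""
    tokens = sentence.split()
    return swap_noise(tokens, rng), tokens


if __name__ == "__main__":
    source, target = denoising_pair(
        "the cat sat on the mat", rng=random.Random(0)
    )
    print("input :", " ".join(source))
    print("target:", " ".join(target))
```

Because the noise only reorders words, the corrupted input always contains exactly the same tokens as the target, so the model cannot solve the task by copying word for word and has to learn something about word order in each language.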

Abstract

In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of large parallel corpora poses a major practical problem for many language pairs. There have been several proposals to alleviate this issue with, for instance, triangulation and semi-supervised learning techniques, but they still require a strong cross-lingual signal. In this work, we completely remove the need of parallel data and propose a novel method to train an NMT system in a completely unsupervised manner, relying on nothing but monolingual corpora. Our model builds upon the recent work on unsupervised embedding mappings, and consists of a slightly modified attentional encoder-decoder model that can be trained on monolingual corpora alone using a combination of denoising and backtranslation. Despite the simplicity of the approach, our system obtains 15.56 and 10.21 BLEU points in WMT 2014 French → English and German → English translation. The model can also profit from small parallel corpora, and attains 21.81 and 15.24 points when combined with 100,000 parallel sentences, respectively. Our implementation is released as an open source project.
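
As a rough sketch of how denoising and backtranslation can both be driven from monolingual text alone, the snippet below builds training examples for the two objectives. It is a simplified illustration under stated assumptions, not the authors' implementation: `noise` and `translate` are hypothetical callables supplied by the caller (in a real system `translate` would run the current model in inference mode), the language labels `"l1"` and `"l2"` are placeholders, and the per-sentence interleaving is just one possible schedule.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple

Tokens = List[str]


class Example(NamedTuple):
    objective: str    # "denoise" or "backtranslate"
    input_lang: str   # language of `source`
    output_lang: str  # language of `target`
    source: Tokens    # what the model reads
    target: Tokens    # what the model is trained to produce


def make_examples(
    corpus_l1: Iterable[Tokens],
    corpus_l2: Iterable[Tokens],
    noise: Callable[[Tokens], Tokens],
    translate: Callable[[Tokens, str, str], Tokens],
) -> Iterator[Example]:
    """Yield denoising and backtranslation examples from monolingual data.

    `translate(tokens, src_lang, tgt_lang)` is a stand-in for the current
    model's own translation; no parallel data is used at any point.
    """
    for lang, other, corpus in (("l1", "l2", corpus_l1), ("l2", "l1", corpus_l2)):
        for tokens in corpus:
            # Denoising: reconstruct the sentence from a corrupted copy.
            yield Example("denoise", lang, lang, noise(tokens), tokens)
            # Backtranslation: translate into the other language, then
            # train the reverse direction to recover the original sentence.
            synthetic = translate(tokens, lang, other)
            yield Example("backtranslate", other, lang, synthetic, tokens)


if __name__ == "__main__":
    toy_noise = lambda toks: toks[::-1]                            # placeholder noise
    toy_translate = lambda toks, s, t: [f"{t}:{w}" for w in toks]  # placeholder model
    for ex in make_examples([["hello", "world"]], [["bonjour"]], toy_noise, toy_translate):
        print(ex)
```

In both objectives the target side is always a genuine monolingual sentence; only the input side is synthetic, which is why the approach needs no parallel corpus.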
