2019

Cross-Lingual Language Model Pretraining

G. Lample, Alexis Conneau

AI summary

This paper introduces cross-lingual language models (XLMs) trained with unsupervised and supervised objectives. It reports state-of-the-art results on three tasks: a 4.9% absolute accuracy gain on XNLI cross-lingual classification, 34.3 BLEU on unsupervised WMT'16 German-English translation (more than 9 BLEU above the previous best), and 38.5 BLEU on supervised WMT'16 Romanian-English translation (more than 4 BLEU above the previous best).

Main Contributions

  • Introduces a new unsupervised method for learning cross-lingual representations using cross-lingual language modeling and investigates two monolingual pretraining objectives.
  • Introduces a new supervised learning objective that improves cross-lingual pretraining when parallel data is available.
  • Significantly outperforms the previous state of the art on cross-lingual classification, unsupervised machine translation and supervised machine translation.
  • Shows that cross-lingual language models can provide significant improvements on the perplexity of low-resource languages.
  • Makes code and pretrained models publicly available.

Abstract

Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16 Romanian-English, outperforming the previous best approach by more than 4 BLEU. Our code and pretrained models will be made publicly available.
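The supervised objective described in the abstract leverages parallel data. One plausible way to picture it is as masked language modeling over a concatenated sentence pair, so that a masked word in one language can be predicted from context in the other. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_parallel_example`, the `[MASK]` symbol, the 0.15 masking rate, the per-token language tags, and the choice to restart position indices for the second sentence are all illustrative choices that this page does not specify.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def build_parallel_example(src_tokens, tgt_tokens, src_lang, tgt_lang,
                           mask_prob=0.15, seed=0):
    """Build one masked training example from a parallel sentence pair.

    Assumptions (not taken from this page): tokens are already segmented,
    positions restart at 0 for the target sentence, and every token
    carries a tag naming its language.
    """
    rng = random.Random(seed)

    # Concatenate the two sentences into a single input stream.
    tokens = list(src_tokens) + list(tgt_tokens)

    # Position indices restart for the second sentence (assumption).
    positions = list(range(len(src_tokens))) + list(range(len(tgt_tokens)))

    # One language tag per token.
    langs = [src_lang] * len(src_tokens) + [tgt_lang] * len(tgt_tokens)

    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            # Hide the token; the model would be trained to recover it.
            inputs.append(MASK)
            labels.append(tok)
        else:
            # Visible token; no prediction target at this position.
            inputs.append(tok)
            labels.append(None)

    return {"inputs": inputs, "labels": labels,
            "positions": positions, "langs": langs}


if __name__ == "__main__":
    example = build_parallel_example(
        ["the", "cat", "sleeps"], ["le", "chat", "dort"], "en", "fr",
        mask_prob=0.5, seed=1,
    )
    for key, value in example.items():
        print(key, value)
```

Because masking is applied across both halves of the pair, a hidden English token can in principle be recovered from its French counterpart, which is the intuition behind using parallel data for pretraining. An unsupervised, monolingual variant would pass a single sentence and skip the concatenation.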
