AI summary
This paper introduces cross-lingual language models (XLMs) trained with either an unsupervised objective on monolingual data or a supervised objective on parallel data. It reports state-of-the-art results on three tasks: an absolute gain of 4.9% accuracy on XNLI cross-lingual classification, 34.3 BLEU on unsupervised WMT'16 German-English translation (more than 9 BLEU above the previous best), and 38.5 BLEU on supervised WMT'16 Romanian-English translation (more than 4 BLEU above the previous best).
Abstract
Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16 Romanian-English, outperforming the previous best approach by more than 4 BLEU. Our code and pretrained models will be made publicly available.