2019
AI summary
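The paper introduces XLNet, a generalized autoregressive pretraining method that learns bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order. Its autoregressive formulation avoids BERT's reliance on mask-corrupted input, and it integrates ideas from Transformer-XL into pretraining. Under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by a large margin.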
Abstract
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.
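The phrase "all permutations of the factorization order" can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the authors' implementation: the helper names `sample_order`, `permutation_mask`, and `visible_pairs_over_all_orders` are hypothetical, and the sketch covers only which positions may condition on which others, leaving out everything else about the model. A factorization order is treated as a permutation of the token positions, and a position may condition only on positions that come earlier in that order.

```python
import itertools
import random


def sample_order(n, rng):
    """Sample one factorization order: a random permutation of range(n)."""
    order = list(range(n))
    rng.shuffle(order)
    return order


def permutation_mask(order):
    """Return mask[i][j] == True if position i may condition on position j.

    order[t] is the position predicted at step t. Position i sees only
    the positions predicted strictly earlier in the factorization order.
    """
    n = len(order)
    rank = [0] * n  # rank[p] = step at which position p is predicted
    for step, pos in enumerate(order):
        rank[pos] = step
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


def visible_pairs_over_all_orders(n):
    """Collect every (i, j) pair that is visible under at least one order."""
    seen = set()
    for order in itertools.permutations(range(n)):
        mask = permutation_mask(list(order))
        for i in range(n):
            for j in range(n):
                if mask[i][j]:
                    seen.add((i, j))
    return seen


# The identity order recovers the usual left-to-right autoregressive mask.
causal = permutation_mask([0, 1, 2, 3])

# Across all orders, every position conditions on every other position
# at least once, which is the sense in which the context is bidirectional.
pairs = visible_pairs_over_all_orders(3)
```

With the identity order the mask is strictly lower-triangular, so standard autoregressive language modeling is the special case of a single fixed order. Taking the expectation over sampled orders is what exposes each position to context on both sides while every individual prediction stays autoregressive.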