2018

BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

Cite Score

99

AI summary

This paper introduces BERT, a language representation model built on bidirectional Transformers that achieves state-of-the-art results on the GLUE, MultiNLI, and SQuAD benchmarks. It pre-trains deep bidirectional representations from unlabeled text and can be fine-tuned for a range of tasks without substantial task-specific architecture changes.

Main Contributions

  • Demonstrates the importance of bidirectional pre-training for language representations using masked language models.
  • Shows that pre-trained representations reduce the need for heavily-engineered task-specific architectures.
  • Achieves state-of-the-art results on eleven NLP tasks, spanning both sentence-level and token-level tasks.
  • Introduces a next sentence prediction task that jointly pre-trains text-pair representations.
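
The masked language model objective named in the contributions can be illustrated with a minimal sketch. The 15% selection rate and the 80/10/10 split (replace with [MASK], replace with a random token, keep unchanged) come from the full paper rather than from this page. The function name `mask_tokens`, its parameters, and the per-position coin flip are assumptions made for illustration; a real pipeline would also operate on WordPiece ids and skip special tokens such as [CLS] and [SEP].

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Corrupt a token list for masked-LM training (illustrative sketch).

    Returns (corrupted, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    """
    rng = rng or random.Random()
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Assumption: select each position independently with mask_prob.
        if rng.random() >= mask_prob:
            continue
        labels[i] = tok  # the model is trained to recover this token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK               # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: leave the token unchanged
    return corrupted, labels

if __name__ == "__main__":
    words = "my dog is hairy".split()
    print(mask_tokens(words, vocab=words, rng=random.Random(0)))
```

Because the prediction at a selected position can draw on tokens to both its left and its right, this objective allows bidirectional conditioning, which a left-to-right language model does not.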

Abstract

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).
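
The abstract reports each result together with its absolute improvement, so the previous best score on each benchmark follows by subtraction. The short sketch below works through that arithmetic using only the numbers quoted above; the names `results` and `implied_baselines` are illustrative.

```python
# (benchmark, reported score, reported absolute improvement), as quoted in the abstract
results = [
    ("GLUE", 80.5, 7.7),
    ("MultiNLI accuracy", 86.7, 4.6),
    ("SQuAD v1.1 Test F1", 93.2, 1.5),
    ("SQuAD v2.0 Test F1", 83.1, 5.1),
]

def implied_baselines(rows):
    """Previous best = reported score minus absolute improvement."""
    # Round to one decimal to hide floating-point noise from the subtraction.
    return {name: round(score - gain, 1) for name, score, gain in rows}

baselines = implied_baselines(results)
for name, value in baselines.items():
    print(f"{name}: implied previous best = {value}")
```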

References [56]

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin - 2017

47 papers in library cite

J. Deng, W. Dong, Richard Socher, L. J. Li, K. Li, Li Fei Fei - 2009

28 papers in library cite

Tomas Mikolov, Ilya Sutskever, K. Chen, G. S. Corrado, Jeffrey Dean - 2013

32 papers in library cite

Jeffrey Pennington, Richard Socher, Christopher D. Manning - 2014

31 papers in library cite

M. E. Peters, M. Neumann, M. Iyyer, Matt Gardner, C. Clark, K. Lee, L. S. Zettlemoyer - 2018

27 papers in library cite

Quoc Le, Tomas Mikolov - 2014

13 papers in library cite

Hod Lipson - 2014

2 papers in library cite

Richard Socher, A. Perelygin, Jeffrey Wu, J. Chuang, C. Manning, A. Ng, Christopher Potts - 2013

24 papers in library cite

P. Rajpurkar, J. Zhang, K. Lopyrev, Percy Liang - 2016

37 papers in library cite

Pascal Vincent, Hugo Larochelle, Yoshua Bengio, Pierre Antoine Manzagol - 2008

25 papers in library cite

Yonghui Wu, M. Schuster, Ziru Chen, Quoc V. Le, M. Norouzi, W. Macherey, M. Krikun, Yue Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. J. Johnson, Xiaodong Liu, Lukasz Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, Wenyi Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, Oriol Vinyals, G. S. Corrado, M. Hughes, Jeffrey Dean - 2016

15 papers in library cite

A. Wang, A. Singh, J. Michael, F. Hill, Omer Levy, Samuel R. Bowman - 2018

26 papers in library cite

Dan Hendrycks, Kevin Gimpel - 2016

9 papers in library cite

Ronan Collobert, Jason Weston - 2008

32 papers in library cite

J. Howard, Sebastian Ruder - 2018

14 papers in library cite

Samuel R. Bowman, G. Angeli, Christopher Potts, Christopher D. Manning - 2015

25 papers in library cite

A. Williams, Nikita Nangia, S. Bowman - 2018

19 papers in library cite

Yuxuan Zhu, R. Kiros, R. Zemel, Ruslan Salakhutdinov, R. Urtasun, Antonio Torralba, Sanja Fidler - 2015

18 papers in library cite

M. Joshi, E. Choi, D. Weld, Luke Zettlemoyer - 2017

18 papers in library cite

R. Kiros, Yuxuan Zhu, Ruslan Salakhutdinov, Richard S. Zemel, R. Urtasun, Antonio Torralba, Sanja Fidler - 2015

23 papers in library cite

J. Turian, L. Ratinov, Yoshua Bengio - 2010

17 papers in library cite

Alexis Conneau, Douwe Kiela, Holger Schwenk, L. Barrault, Antoine Bordes - 2017

11 papers in library cite

M. Seo, A. Kembhavi, Ali Farhadi, Hananneh Hajishirzi - 2017

13 papers in library cite

W. Dolan, Chris Brockett - 2005

9 papers in library cite

Hector J. Levesque, E. Davis, Leora Morgenstern - 2011

13 papers in library cite

Rie Kubota Ando, Tong Zhang - 2005

10 papers in library cite

A. P. Parikh, O. Tackstrom, Dipanjan Das, Jakob Uszkoreit - 2016

11 papers in library cite

A. M. Dai, Quoc V. Le - 2015

27 papers in library cite

C. Chelba, Tomas Mikolov, M. Schuster, Q. Ge, T. Brants, P. Koehn, Tony Robinson - 2013

13 papers in library cite

B. Mccann, J. Bradbury, Caiming Xiong, Richard Socher - 2017

14 papers in library cite

A. Mnih, Geoffrey E. Hinton - 2009

16 papers in library cite

M. E. Peters, W. Ammar, C. Bhagavatula, Russell Power - 2017

5 papers in library cite

Yejin Choi - 2018

5 papers in library cite

F. Hill, Kyunghyun Cho, Anna Korhonen - 2016

12 papers in library cite

O. Melamud, J. Goldberger, Ido Dagan - 2016

5 papers in library cite

C. Clark, Matt Gardner - 2017

7 papers in library cite

R. A. Rfou, D. Choe, Noah Constant, M. Guo, Llion Jones - 2018

6 papers in library cite

Alex Warstadt, A. Singh, S. Bowman - 2018

8 papers in library cite

William Fedus, I. Goodfellow, A. M. Dai - 2018

2 papers in library cite

P. F. Brown, P. V. Desouza, R. L. Mercer, Vincent J. Della Pietra, J. C. Lai - 1992

12 papers in library cite

L. Bentivogli, Peter Clark, Ido Dagan, D. Giampiccolo - 2009

7 papers in library cite

D. Cer, M. Diab, E. Agirre, I. L. Gazpio, L. Specia - 2017

6 papers in library cite

W. L. Taylor - 1953

4 papers in library cite

Yacine Jernite, S. Bowman, D. Sontag - 2017

4 papers in library cite

M. E. Peters, M. Neumann, Luke Zettlemoyer, W. T. Yih - 2018

4 papers in library cite

John Blitzer, R. Mcdonald, Fernando Pereira - 2006

4 papers in library cite

Alec Radford, K. Narasimhan, T. Salimans, Ilya Sutskever - 2018

4 papers in library cite

E. F. T. K. Sang, F. D. Meulder - 2003

4 papers in library cite

L. Logeswaran, Honglak Lee - 2018

3 papers in library cite

A. W. Yu, David Dohan, M. T. Luong, R. Zhao, K. Chen, M. Norouzi, Quoc V. Le - 2018

3 papers in library cite

M. Hu, Y. Peng, Zhongqiang Huang, X. Qiu, F. Wei, M. Zhou - 2018

3 papers in library cite

Ziru Chen, Haowei Zhang, X. Zhang, L. Zhao - 2018

2 papers in library cite

A. Akbik, D. Blythe, R. Vollgraf - 2018

1 paper in library cites

Wenyi Wang, Minghao Yan, Chiyu Wu - 2018

1 paper in library cites

K. Clark, M. T. Luong, Christopher D. Manning, Quoc Le - 2018

1 paper in library cites

F. Sun, Lei Li, X. Qiu, Yibo Liu - 2018

1 paper in library cites

Cited by

39

papers in your library

Cites

39

papers in your library

Read

on October 17, 2025
