Papperoni

2019

Language Models as Knowledge Bases?

F. Petroni, Tim Rocktäschel, P. Lewis, A. Bakhtin, Yuxiang Wu, A. H. Miller, Sebastian Riedel

Cite Score: 68

AI summary

This paper introduces the LAMA (LAnguage Model Analysis) probe to test the factual and commonsense knowledge in language models, finding that BERT contains relational knowledge competitive with traditional NLP methods and achieves remarkable results for open-domain QA.

Main Contributions

Introduces the LAMA (LAnguage Model Analysis) probe to test the factual and commonsense knowledge in language models.
Finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
Shows that BERT also does remarkably well on open-domain question answering against a supervised baseline.
Demonstrates that certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches.
Shows that BERT-large achieves remarkable results for open-domain QA, reaching 57.1% precision@10 compared to 63.5% of a knowledge base constructed using a task-specific supervised relation extraction system.
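
The precision@10 figure above is a rank-based metric. A minimal sketch of how such a score can be computed, assuming the usual definition in which a query counts as 1 if the gold answer appears among the top k ranked predictions and 0 otherwise, averaged over all queries. The function name and the toy data are illustrative and are not taken from the LAMA code base:

```python
from typing import Sequence


def mean_precision_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    gold_answers: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of queries whose gold answer is in the top-k predictions."""
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("need one gold answer per query")
    if not gold_answers:
        return 0.0
    hits = sum(
        1
        for preds, gold in zip(ranked_predictions, gold_answers)
        if gold in preds[:k]
    )
    return hits / len(gold_answers)


# Toy data (hypothetical): two queries, each with a ranked candidate list.
toy_predictions = [
    ["Rome", "Florence", "Milan"],  # gold "Florence" sits at rank 2
    ["Paris", "Lyon", "Nice"],      # gold "London" is not in the list
]
toy_gold = ["Florence", "London"]

p_at_1 = mean_precision_at_k(toy_predictions, toy_gold, k=1)
p_at_2 = mean_precision_at_k(toy_predictions, toy_gold, k=2)
print(p_at_1, p_at_2)
```

With this definition, raising k can only keep the score the same or increase it, which is why P@10 is a more forgiving number than P@1.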

Abstract

Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA.
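
The "fill-in-the-blank" cloze statements mentioned in the abstract can be illustrated with a small sketch. It assumes a relation is written as a text template with a subject slot and an object slot; the `[X]`/`[Y]` slot markers, the `make_cloze` helper, and the example template are assumptions for illustration, not the authors' exact implementation:

```python
MASK_TOKEN = "[MASK]"  # BERT-style mask token; other models use different ones


def make_cloze(template: str, subject: str, mask_token: str = MASK_TOKEN) -> str:
    """Fill the subject slot and mask the object slot of a relation template."""
    if "[X]" not in template or "[Y]" not in template:
        raise ValueError("template needs both [X] and [Y] slots")
    return template.replace("[X]", subject).replace("[Y]", mask_token)


# Hypothetical template for a "place of birth" relation.
born_in_template = "[X] was born in [Y]."
cloze = make_cloze(born_in_template, "Dante")
print(cloze)  # Dante was born in [MASK].
```

A masked language model would then be asked to rank vocabulary tokens for the masked position, and the ranked list can be scored against the known object of the fact.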

References [39]

[1]Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin - 2017

47 papers in library cite

I mean... it introduced Transformers!

[2]BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, M. W. Chang, K. Lee, Kristina Toutanova - 2018

39 papers in library cite

Simply amazing. It's very impressive how they make a leap vs. existing stuff (you can see from the references, pretty much no one is doing what they are doing, other than GPT)

[3]Long Short-Term Memory

Sepp Hochreiter, Jürgen Schmidhuber - 1997

94 papers in library cite

LSTMs FTW!

[4]Language Models Are Unsupervised Multitask Learners

Alec Radford, Jeffrey Wu, Rewon Child, D. Luan, Dario Amodei, Ilya Sutskever - 2019

27 papers in library cite

Amazing! Tons of important contributions. I think they could have explained the models a bit better, and I think this is where OpenAI starts to become evil (and not open)

[5]Deep Contextualized Word Representations

M. E. Peters, M. Neumann, M. Iyyer, Matt Gardner, C. Clark, K. Lee, L. S. Zettlemoyer - 2018

27 papers in library cite

I didn't really like the approach. Seems a bit derivative TBH. BERT seems more elegant.

[6]Improving Language Understanding by Generative Pre-Training

Alec Radford, K. Narasimhan, T. Salimans, Ilya Sutskever - 2018

23 papers in library cite

Very simple and very nice! Easy to understand and revolutionary maybe?

[7]A Neural Probabilistic Language Model

Yoshua Bengio, R. Ducharme, Pascal Vincent - 2001

62 papers in library cite

What started it all. Very simple and elegant.

[8]SQuAD: 100,000+ Questions for Machine Comprehension of Text

P. Rajpurkar, J. Zhang, K. Lopyrev, Percy Liang - 2016

37 papers in library cite

Nice paper that introduced an important dataset. Not much else though.

[9]GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

A. Wang, A. Singh, J. Michael, F. Hill, Omer Levy, Samuel R. Bowman - 2018

26 papers in library cite

I like it, but it's just a mix of different existing datasets and an F1 score. Nothing new really, but I get why it's important.

[10]Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Z. Dai, Zhilin Yang, Yiming Yang, W. Cohen, J. Carbonell, Quoc Le, Ruslan Salakhutdinov - 2019

9 papers in library cite

It's so cool to see context expansion without the need to actually expand context! Such a simple concept and so effective!

[11]Natural Questions: A Benchmark for Question Answering Research

T. Kwiatkowski, J. Palomaki, O. Rhinehart, Michael Collins, A. P. Parikh, C. Alberti, D. Epstein, Illia Polosukhin, M. Kelcey, Jacob Devlin, K. Lee, K. N. Toutanova, Llion Jones, M. W. Chang, Andrew Dai, Jakob Uszkoreit, Quoc Le, Slav Petrov - 2019

9 papers in library cite

The dataset and methodology are very nice - it's amazing to see how Google does the summaries in search. However, the paper is too complex with the math stuff - unnecessary.

[12]Recurrent Neural Network Regularization

Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals - 2014

22 papers in library cite

It's a very simple idea and TBH it's nothing different from dropout. It's good that it's a very short and straightforward paper, but it could have been a paragraph long.

[13]Aligning Books and Movies: Towards Story-Like Visual Explanations by Watching Movies and Reading Books

Yukun Zhu, R. Kiros, R. Zemel, Ruslan Salakhutdinov, R. Urtasun, Antonio Torralba, Sanja Fidler - 2015

18 papers in library cite

I think their approach was a bit convoluted and didn't really add a lot. Main contribution here is probably BookCorpus

[14]Pointer Sentinel Mixture Models

S. Merity, Caiming Xiong, J. Bradbury, Richard Socher - 2017

12 papers in library cite

I really liked the methodology, but I had to read it a few times to understand it intuitively - I think they should have done a better job at explaining it.

[15]Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

R. T. McCoy, Ellie Pavlick, Tal Linzen - 2019

5 papers in library cite

TBH I expected more. It's a very shallow analysis of the heuristics (but ok, props to them for making it simple). I think that ultimately they don't solve the problem.

[16]CoQA: A Conversational Question Answering Challenge

Siva Reddy, Danqi Chen, Christopher D. Manning - 2018

6 papers in library cite

It's a fine paper and a solid addition to QA data + NLU.

[17]Context Dependent Recurrent Neural Network Language Model

Tomas Mikolov, Geoffrey Zweig - 2012

12 papers in library cite

Nothing too interesting, just using the context of the RNN.

[18]A Survey of Reinforcement Learning Informed by Natural Language

Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel - 2019

3 papers in library cite

It's a good overview and I like that it gave me context on what's happening, but a bit boring to read.

[19]Language Modeling With Gated Convolutional Networks

Yann N. Dauphin, A. Fan, Michael Auli, D. Grangier - 2016

8 papers in library cite

[20]Reading Wikipedia to Answer Open-Domain Questions

Danqi Chen, Adam Fisch, Jason Weston, Antoine Bordes - 2017

10 papers in library cite

Open-domain QA with Wikipedia

[21]On the State of the Art of Evaluation in Neural Language Models

G. Melis, C. Dyer, Phil Blunsom - 2018

6 papers in library cite

SotA LM in 2017

[22]Learning to Understand Goal Specifications by Modelling Reward

D. Bahdanau, F. Hill, Jan Leike, E. Hughes, P. Kohli, Edward Grefenstette - 2019

4 papers in library cite

[23]What Do You Learn From Context? Probing for Sentence Structure in Contextualized Word Representations

I. Tenney, P. Xia, Berlin Chen, A. Wang, A. Poliak, R. T. McCoy, N. Kim, B. Van Durme, S. Bowman, Dipanjan Das, Ellie Pavlick - 2019

4 papers in library cite

Analysis of how transformers learn phrase structure

[24]Assessing BERT's Syntactic Abilities

Y. Goldberg - 2019

4 papers in library cite

[25]Dissecting Contextual Word Embeddings: Architecture and Representation

M. E. Peters, M. Neumann, Luke Zettlemoyer, W. T. Yih - 2018

4 papers in library cite

[26]CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

A. Talmor, J. Herzig, N. Lourie, Jonathan Berant - 2019

3 papers in library cite

[27]Don't Count, Predict! A Systematic Comparison of Context-Counting vs Context-Predicting Semantic Vectors

M. Baroni, G. Dinu, Germán Kruszewski - 2014

3 papers in library cite

[28]SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation

F. Hill, R. Reichart, Anna Korhonen - 2015

3 papers in library cite

[29]Targeted Syntactic Evaluation of Language Models

R. Marvin, Tal Linzen - 2018

3 papers in library cite

[30]From Recognition to Cognition: Visual Commonsense Reasoning

Rowan Zellers, Yonatan Bisk, Ali Farhadi, Yejin Choi - 2018

2 papers in library cite

[31]Learning to Win by Reading Manuals in a Monte-Carlo Framework

S. R. K. Branavan, D. Silver, R. Barzilay - 2011

2 papers in library cite

[32]A Review of Relational Machine Learning for Knowledge Graphs

M. Nickel, K. Murphy, V. Tresp, E. Gabrilovich - 2016

1 paper in library cites

[33]BabyAI: First Steps Towards Grounded Language Learning With a Human in the Loop

M. Chevalier-Boisvert, D. Bahdanau, S. Lahlou, L. Willems, C. Saharia, T. H. Nguyen, Yoshua Bengio - 2018

1 paper in library cites

[34]Context-Aware Representations for Knowledge Base Relation Extraction

D. Sorokin, I. Gurevych - 2017

1 paper in library cites

[35]Non-Monotonic Sequential Text Generation

S. Welleck, K. Brantley, H. Daumé III, Kyunghyun Cho - 2019

1 paper in library cites

[36]Overview of the English Slot Filling Track at the TAC2014 Knowledge Base Population Evaluation

M. Surdeanu, H. Ji - 2014

1 paper in library cites

[37]Representing General Relational Knowledge in ConceptNet 5

R. Speer, C. Havasi - 2012

1 paper in library cites

[38]T-REx: A Large Scale Alignment of Natural Language With Knowledge Base Triples

H. Elsahar, P. Vougiouklis, A. Remaci, C. Gravier, J. Hare, F. Laforest, E. Simperl - 2018

1 paper in library cites

[39]Translating Embeddings for Modeling Multi-Relational Data

Antoine Bordes, Nicolas Usunier, A. García-Durán, Jason Weston, O. Yakhnenko - 2013

1 paper in library cites

Cited by 4 papers in your library

Cites 23 papers in your library

Read on November 15, 2025

Very nice, but I expected a bit more. I thought it would be more of a philosophical discussion than a benchmark analysis. Still, they were probably the first to notice that LMs contain knowledge.
