2020

How Much Knowledge Can You Pack Into the Parameters of a Language Model?

Noam Shazeer

citations

Cite Score

39

AI summary

The paper fine-tunes pre-trained T5 language models to answer questions without access to any external knowledge, a setting it calls closed-book question answering. The fine-tuned models perform competitively with open-domain systems that explicitly retrieve information from an external source, and performance improves with model size.

Main Contributions

  • Demonstrates that fine-tuning pre-trained language models can effectively answer questions without external knowledge.
  • Shows that this approach scales with model size, with larger models achieving better performance.
  • Achieves competitive results compared to open-domain question answering systems that explicitly retrieve information from external sources.
  • Releases code and trained models to facilitate reproducibility and future research.
  • Introduces a closed-book question answering paradigm where models must rely solely on internalized knowledge.

Abstract

It has recently been observed that neural language models trained on unstructured text can implicitly store and retrieve knowledge using natural language queries. In this short paper, we measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge. We show that this approach scales with model size and performs competitively with open-domain systems that explicitly retrieve answers from an external knowledge source when answering questions. To facilitate reproducibility and future work, we release our code and trained models.
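The closed-book setup described in the abstract can be illustrated with a minimal sketch. It assumes a text-to-text format in which the model input contains only the question and no supporting passage, and it scores predictions by normalized exact match. The `question: ` prefix, the function names, and the normalization rules are assumptions chosen for illustration; they are not taken from the paper's released code.

```python
import re
import string
from typing import Dict, List


def closed_book_example(question: str, answer: str) -> Dict[str, str]:
    """Build a text-to-text pair whose input holds only the question.

    No passage or retrieved document is attached, so a model trained on
    such pairs has to answer from what is stored in its parameters.
    The "question: " prefix is a hypothetical choice for this sketch.
    """
    return {"input": f"question: {question}", "target": answer}


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: List[str]) -> bool:
    """Return True if the normalized prediction equals any normalized reference."""
    return normalize(prediction) in {normalize(ref) for ref in references}


if __name__ == "__main__":
    pair = closed_book_example("Who wrote Hamlet?", "William Shakespeare")
    print(pair["input"])
    print(exact_match("The William Shakespeare.", ["William Shakespeare"]))
```

An open-domain system would instead add retrieved text to the input; here the input is deliberately limited to the question so that any correct answer must come from knowledge acquired during pre-training or fine-tuning.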
