2020

Language Models Are Few-Shot Learners

Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei

citations

Cite Score

98

AI summary

This paper introduces GPT-3, a 175 billion parameter autoregressive language model. Without any fine-tuning, it shows strong few-shot performance on diverse NLP tasks, sometimes competitive with prior state-of-the-art fine-tuned approaches, and it can generate news articles that human evaluators have difficulty distinguishing from human-written ones.

Main Contributions

  • Introduced GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model.
  • Demonstrated significant improvements in task-agnostic, few-shot performance, sometimes competitive with or surpassing state-of-the-art fine-tuning approaches.
  • Showed strong performance on various NLP tasks including translation, question-answering, cloze tasks, and on-the-fly reasoning.
  • Identified datasets where GPT-3's few-shot learning still struggles, as well as datasets where it faces methodological issues related to training on large web corpora.
  • Found that GPT-3 can generate news articles that human evaluators have difficulty distinguishing from human-written articles.

Abstract

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions – something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
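
The few-shot setting described above, where the task and its demonstrations are given purely as text and no gradient updates occur, can be illustrated with a minimal sketch. The function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the example sums are assumptions made for illustration; they are not the paper's prompt format, and no model is called here.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    demonstrations: List[Tuple[str, str]],
    query: str,
) -> str:
    """Concatenate an instruction, K solved demonstrations, and one unsolved query.

    The layout is a hypothetical one chosen for this sketch. The point is that
    the whole task specification is a single string the model would continue.
    """
    lines = [instruction, ""]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The query repeats the demonstration pattern but leaves the answer blank.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Three demonstrations (K = 3) of 3-digit addition, then a new problem.
demos = [("248 + 391", "639"), ("512 + 177", "689"), ("730 + 164", "894")]
prompt = build_few_shot_prompt("Add the two numbers.", demos, "305 + 482")
print(prompt)
```

An autoregressive model would be asked to continue this string after the final `Output:`. Changing the number of demonstrations changes only the text, not the model weights.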

Cited by 21 papers in your library

Cites 69 papers in your library

Read on May 23, 2026

Tags

Vetto Study