2020
Language Models Are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
AI summary
This paper introduces GPT-3, a 175-billion-parameter autoregressive language model that shows strong few-shot performance on diverse NLP tasks without any fine-tuning, in some cases becoming competitive with prior state-of-the-art fine-tuned approaches. It can also generate news articles that human evaluators have difficulty distinguishing from human-written ones.
Abstract
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions – something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3’s few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
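As an illustration of what "tasks and few-shot demonstrations specified purely via text interaction" can look like, here is a minimal sketch that assembles a few-shot prompt as plain text. The function name `build_few_shot_prompt`, the `Input:`/`Output:` labels, and the example word pairs are assumptions made for this sketch, not the paper's exact prompt format, and no model is called.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    demonstrations: List[Tuple[str, str]],
    query: str,
) -> str:
    """Concatenate an instruction, K worked examples, and a final query.

    The model would be asked to continue the text after the last
    "Output:" line; no gradient updates are involved.
    The "Input:"/"Output:" labels are a hypothetical layout.
    """
    lines = [instruction, ""]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")  # blank line separates demonstrations
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Two demonstrations (K = 2); an empty list gives the zero-shot case.
demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]
prompt = build_few_shot_prompt("Translate English to French.", demos, "peppermint")
print(prompt)
```

Passing an empty demonstration list yields a zero-shot prompt (instruction plus query only), and a single pair yields a one-shot prompt, so the same helper covers the settings the abstract contrasts with fine-tuning.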