2025

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

DeepSeek-AI

Cite Score: 83

AI summary

This paper introduces DeepSeek-R1, an LLM trained with a reinforcement learning framework based on Group Relative Policy Optimization (GRPO). The training incentivizes emergent reasoning patterns such as self-reflection and dynamic strategy adaptation, and the resulting model outperforms counterparts trained with conventional supervised learning on math, coding, and STEM tasks, without relying on human-labeled reasoning trajectories.
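To make the "group relative" idea concrete, here is a minimal sketch of how GRPO-style advantages are commonly computed: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group, so no separate value model is needed. The function name, the `eps` stabilizer, and the use of the population standard deviation are assumptions made for illustration, not details taken from this page.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    rewards: scalar rewards for G responses sampled from the SAME prompt.
    eps: small stabilizer (an assumption) to avoid division by zero when
         every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group: four sampled answers, two judged correct (reward 1.0).
toy_rewards = [1.0, 0.0, 0.0, 1.0]
toy_advantages = group_relative_advantages(toy_rewards)
print(toy_advantages)
```

Responses that beat their group's average get a positive advantage and the rest get a negative one; when all responses score the same, every advantage is zero and the group contributes no learning signal.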

Main Contributions

  • Introduces DeepSeek-R1, an LLM whose reasoning abilities are incentivized through pure reinforcement learning (RL), obviating the need for human-labeled reasoning trajectories.
  • Proposes an RL framework that facilitates the emergent development of advanced reasoning patterns, such as self-reflection, verification, and dynamic strategy adaptation.
  • Achieves superior performance on verifiable tasks (mathematics, coding competitions, STEM fields) compared with counterparts trained via conventional supervised learning on human demonstrations.
  • Demonstrates that emergent reasoning patterns in large-scale models can guide and enhance reasoning capabilities in smaller models.
  • Reports an average pass@1 score of 77.9% on AIME 2024, which the paper describes as surpassing the average performance of human competitors.
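An "average pass@1" figure like the one above is usually obtained by sampling several responses per problem, scoring each as correct or incorrect, and averaging. The sketch below shows that arithmetic under the assumption that every problem has the same kind of boolean correctness record; the function name and the toy data are hypothetical and are not results from the paper.

```python
def average_pass_at_1(correct_by_problem):
    """Average pass@1 over a benchmark.

    correct_by_problem: one list per problem, holding a boolean for each
    sampled response (True if that response was judged correct).
    Per-problem pass@1 is the fraction of correct samples; the benchmark
    score is the mean of those fractions.
    """
    per_problem = [sum(samples) / len(samples) for samples in correct_by_problem]
    return sum(per_problem) / len(per_problem)


# Toy data (made up): two problems, four sampled responses each.
toy_results = [
    [True, True, False, True],     # 3 of 4 correct
    [False, False, True, False],   # 1 of 4 correct
]
print(average_pass_at_1(toy_results))
```

Averaging over several samples per problem reduces the variance of the estimate compared with scoring a single response per problem.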

Abstract

General reasoning represents a long-standing and formidable challenge in artificial intelligence. Recent breakthroughs, exemplified by large language models (LLMs) (Brown et al., 2020; OpenAI, 2023) and chain-of-thought prompting (Wei et al., 2022b), have achieved considerable success on foundational reasoning tasks. However, this success is heavily contingent upon extensive human-annotated demonstrations, and models’ capabilities are still insufficient for more complex problems. Here we show that the reasoning abilities of LLMs can be incentivized through pure reinforcement learning (RL), obviating the need for human-labeled reasoning trajectories. The proposed RL framework facilitates the emergent development of advanced reasoning patterns, such as self-reflection, verification, and dynamic strategy adaptation. Consequently, the trained model achieves superior performance on verifiable tasks such as mathematics, coding competitions, and STEM fields, surpassing its counterparts trained via conventional supervised learning on human demonstrations. Moreover, the emergent reasoning patterns exhibited by these large-scale models can be systematically harnessed to guide and enhance the reasoning capabilities of smaller models.
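A task is "verifiable" in the sense used above when a program can check the final answer without human grading. The following is a minimal sketch of such a rule-based check, assuming the model wraps its final answer in `<answer>` tags and that exact string match is an adequate comparison. Both the tag convention and the function name are assumptions for illustration and are not stated on this page.

```python
import re


def verifiable_reward(response: str, reference: str) -> float:
    """Return 1.0 if the tagged final answer matches the reference, else 0.0.

    Assumes (hypothetically) that the final answer appears inside
    <answer>...</answer>; a missing tag counts as an incorrect response.
    """
    match = re.search(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0


print(verifiable_reward("Let me check: 6 * 7 = 42. <answer> 42 </answer>", "42"))
print(verifiable_reward("I think it is 41.", "42"))
```

Real graders for mathematics or code typically need more than string equality, for example numeric equivalence checks or running test cases, so this should be read as the simplest possible instance.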

References [90]


Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei - 2020

21 papers in library cite

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov - 2017

10 papers in library cite

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, C. Wainwright, Pamela Mishkin, Chiyuan Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul F. Christiano, Jan Leike, Ryan Lowe - 2022

11 papers in library cite

Jeffrey Dean - 2015

6 papers in library cite

Alec Radford, Jeffrey Wu, Rewon Child, D. Luan, Dario Amodei, Ilya Sutskever - 2019

27 papers in library cite

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Josh Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, Wojciech Zaremba - 2021

9 papers in library cite

Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn - 2023

3 papers in library cite

Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, Jacob Steinhardt - 2021

6 papers in library cite

Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman - 2021

7 papers in library cite

Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei - 2020

12 papers in library cite

Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei - 2017

11 papers in library cite

Zhihong Shao, Peng Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, Yiwei Li, Yonghui Wu - 2024

3 papers in library cite

Missing author list

2022

4 papers in library cite

Hunter Lightman, Vineet Kosaraju, Yuri Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe - 2023

4 papers in library cite

Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeffrey Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman - 2021

7 papers in library cite

Mirac Suzgun, Nathan Scales, Nathanael Scharli, Sebastian Gehrmann, Yi Tay, Hyung Won Chung, Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, Jason Wei - 2022

4 papers in library cite

Leo Gao, John Schulman, Jacob Hilton - 2022

3 papers in library cite

Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins - 2022

4 papers in library cite

OpenAI - 2023

6 papers in library cite

S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. R. Narasimhan, Yue Cao - 2022

3 papers in library cite

W. Kwon, Zhiyuan Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, Joseph Gonzalez, Haowei Zhang, Ion Stoica - 2023

5 papers in library cite

Hyung Won Chung, L. Hou, S. Longpre, Barret Zoph, Yi Tay, William Fedus, Yiwei Li, Xinpeng Wang, Mostafa Dehghani, S. Brahma, A. Webson, Shixiang Shane Gu, Z. Dai, Mirac Suzgun, X. Chen, Aakanksha Chowdhery, A. C. Ros, M. Pellat, K. Robinson, D. Valter, S. Narang, Gaurav Mishra, A. Yu, V. Zhao, Y. Huang, Andrew Dai, H. Yu, Slav Petrov, Ed H. Chi, Jeffrey Dean, Jacob Devlin, A. Roberts, Denny Zhou, Quoc V. Le, Jason Wei - 2022

2 papers in library cite

T. Kojima, Shixiang Shane Gu, M. Reid, Y. Matsuo, Y. Iwasawa - 2022

6 papers in library cite

Jason Wei, Yi Tay, R. Bommasani, Colin Raffel, Barret Zoph, S. Borgeaud, D. Yogatama, Maarten Bosma, Denny Zhou, D. Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeffrey Dean, William Fedus - 2022

2 papers in library cite

DeepSeek-AI - 2024

1 paper in library cites

Xinpeng Wang, Jason Wei, Dale Schuurmans, Quoc Le, E. Chi, Denny Zhou - 2022

5 papers in library cite

D. Rein, B. L. Hou, A. C. Stickland, J. Petty, R. Y. Pang, J. Dirani, J. Michael, Samuel R. Bowman - 2024

3 papers in library cite

Yuzhi Wang, X. Ma, G. Zhang, Yuan Ni, A. Chandra, S. Guo, W. Ren, A. Arulraj, X. He, Zhejun Jiang, Tao Li, M. Ku, K. Wang, A. Zhuang, R. Fan, Xiang Yue, Weizhu Chen - 2024

3 papers in library cite

N. Jain, K. Han, Albert Gu, Wentao Li, F. Yan, Tong Zhang, Shijie Wang, A. S. Lezama, Koushik Sen, Ion Stoica - 2024

1 paper in library cites

C. Snell, Jaehoon Lee, K. Xu, A. Kumar - 2024

4 papers in library cite

Jingren Zhou, T. Lu, Swaroop Mishra, S. Brahma, S. Basu, Y. Luan, Denny Zhou, L. Hou - 2023

2 papers in library cite

B. Brown, J. Juravsky, R. Ehrlich, R. Clark, Quoc V. Le, C. Re, Azalia Mirhoseini - 2024

2 papers in library cite

A. P. Gema, J. O. J. Leang, G. Hong, A. Devoto, A. C. M. Mancino, R. Saxena, X. He, Y. Zhao, X. Du, M. R. G. Madani, C. Barale, R. McHardy, J. Harris, J. Kaddour, E. V. Krieken, P. Minervini - 2025

1 paper in library cites

Hugo Touvron, L. Martin, K. Stone, P. Albert, Amjad Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. C. Ferrer, Mark Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, B. Fuller, C. Gao, V. Goswami, N. Goyal, A. Hartshorn, S. Hosseini, R. Hou, H. Inan, M. Kardas, V. Kerkez, M. Khabsa, I. Kloumann, A. Korenev, P. S. Koura, M. Lachaux, T. Lavril, Jaehoon Lee, D. Liskovich, Y. Lu, Y. Mao, X. Martinet, T. Mihaylov, P. Mishra, I. Molybog, Y. Nie, A. Poulton, J. Reizenstein, R. Rungta, K. Saladi, A. Schelten, R. Silva, E. M. Smith, R. Subramanian, X. E. Tan, B. Tang, R. Taylor, A. Williams, J. X. Kuan, P. Xu, Zhicheng Yan, I. Zarov, Y. Z. Zhang, A. Fan, M. Kambadur, S. Narang, A. Rodriguez, R. Stojnic, S. Edunov, T. Scialom - 2023

3 papers in library cite

Wei-Lin Chiang, L. Zheng, Y. Sheng, A. N. Angelopoulos, Tao Li, Dustin Li, B. Zhu, Haowei Zhang, M. Jordan, Joseph E. Gonzalez - 2024

2 papers in library cite

S. Yao, D. Yu, J. Zhao, I. Shafran, T. L. Griffiths, Yue Cao, K. Narasimhan - 2023

2 papers in library cite

Qwen - 2024

1 paper in library cites

D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Yanru Chen, T. Lillicrap, F. Hui, L. Sifre, G. V. D. Driessche, T. Graepel, Demis Hassabis - 2017

7 papers in library cite

John Schulman, P. Moritz, Sergey Levine, M. Jordan, P. Abbeel - 2015

5 papers in library cite

D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai, A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel, T. Lillicrap, K. Simonyan, Demis Hassabis - 2017

5 papers in library cite

Alicia Parrish, Anna Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Thompson, Phu Mon Htut, S. Bowman - 2022

4 papers in library cite

Zhihao Yuan, H. Yuan, Chun-Liang Li, G. Dong, C. Tan, Chang Zhou - 2023

3 papers in library cite

E. Zelikman, Yonghui Wu, J. Mu, N. Goodman - 2022

3 papers in library cite

Z. Gou, Zhihong Shao, Y. Gong, Y. Shen, Yining Yang, M. Huang, N. Duan, Weizhu Chen - 2023

3 papers in library cite

John Schulman - 2020

2 papers in library cite

D. Busbridge, A. Shidani, F. Weers, J. Ramapuram, E. Littwin, R. Webb - 2025

2 papers in library cite

Denny Zhou, Nathanael Scharli, L. Hou, Jason Wei, Nathan Scales, Xinpeng Wang, Dale Schuurmans, C. Cui, O. Bousquet, Quoc V. Le, Ed H. Chi - 2023

2 papers in library cite

Deep Ganguli, Liane Lovitt, Jackson Kernion, Amanda Askell, Yuntao Bai, Saurav Kadavath, Benjamin Mann, Ethan Perez, Nicholas Schiefer, Kamal Ndousse, Andy Jones, S. Bowman, Anna Chen, Tom Conerly, Nova Dassarma, Dawn Drain, Nelson Elhage, Sheer El Showk, Stanislav Fort, Zac Hatfield Dodds, Tom Henighan, Danny Hernandez, Tristan Hume, J. Jacobson, Scott Johnston, Shauna Kravec, Catherine Olsson, Sam Ringer, Eli Tran Johnson, Dario Amodei, Tom B. Brown, Nicholas Joseph, Sam McCandlish, Christopher Olah, Jared Kaplan, Jack Clark - 2022

2 papers in library cite

Niklas Muennighoff, Alexander M. Rush, B. Barak, T. L. Scao, A. Piktus, N. Tazi, S. Pyysalo, Thomas Wolf, Colin Raffel - 2023

2 papers in library cite

J. Geiping, S. McLeish, N. Jain, J. Kirchenbauer, Shivalika Singh, B. R. Bartoldson, B. Kailkhura, A. Bhatele, T. Goldstein - 2025

2 papers in library cite

Aman Madaan, N. Tandon, P. Gupta, S. Hallinan, Leo Gao, S. Wiegreffe, U. Alon, N. Dziri, S. Prabhumoye, Yining Yang, S. Gupta, Bodhisattwa Prasad Majumder, K. Hermann, S. Welleck, A. Yazdanbakhsh, Peter Clark - 2023

2 papers in library cite

T. H. Trinh, Yonghui Wu, Quoc V. Le, He He, T. Luong - 2024

2 papers in library cite

Timo Schick, J. D. Yu, R. Dessi, Roberta Raileanu, M. Lomeli, Eric Hambro, Luke Zettlemoyer, N. Cancedda, T. Scialom - 2023

2 papers in library cite

C. S. Xia, Y. Deng, S. Dunn, Li Zhang - 2024

1 paper in library cites

P. Gauthier - 2025

1 paper in library cites

X. Feng, Z. Wan, M. Wen, S. M. McAleer, Y. Wen, Wenxuan Zhang, J. Wang - 2024

1 paper in library cites

MAA - 2024

1 paper in library cites

Ziru Chen, Y. Min, B. Zhang, Jixuan Chen, J. J. Jiang, D. Cheng, W. X. Zhao, Ze Liu, X. Miao, Y. Lu - 2025

1 paper in library cites

F. Gloeckle, B. Y. Idrissi, B. Roziere, D. L. Paz, Gabriel Synnaeve - 2024

1 paper in library cites

A. Singh, J. D. C. Reyes, R. Agarwal, Akshay Anand, Piyush Patil, X. Garcia, P. J. Liu, J. Harrison, Jaehoon Lee, K. Xu, A. T. Parisi, A. Kumar, A. A. Alemi, A. Rizkowsky, A. Nova, B. Adlam, B. Bohnet, G. F. Elsayed, H. Sedghi, I. Mordatch, I. Simpson, I. Gur, J. Snoek, Jeffrey Pennington, J. Hron, K. Kenealy, K. Swersky, Khyati Mahajan, L. A. Culp, L. Xiao, M. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, T. Warkentin, Y. Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl Dickstein, Noah Fiedel - 2024

1 paper in library cites

Y. Huang, Yuntao Bai, Z. Zhu, J. Zhang, J. Zhang, T. Su, Joseph Liu, C. Lv, Y. Z. Zhang, J. Lei, Y. Fu, Maosong Sun, J. He - 2023

1 paper in library cites

CMS - 2024

1 paper in library cites

Yun He, Shanda Li, Joseph Liu, Y. Tan, Wenyi Wang, H. Huang, Xingyuan Bu, H. Guo, Changran Hu, Bo Zheng - 2024

1 paper in library cites

H. Li, Y. Z. Zhang, F. Koto, Yining Yang, H. Zhao, Y. Gong, N. Duan, T. Baldwin - 2024

1 paper in library cites

M. Mirzayanov - 2025

1 paper in library cites

Jeffrey Li, Daniel Guo, Diyi Yang, Runxin Xu, Yonghui Wu, J. He - 2025

1 paper in library cites

Z. Gou, Zhihong Shao, Y. Gong, Y. Shen, Yining Yang, N. Duan, Weizhu Chen - 2024

1 paper in library cites

H. Xin, Z. Z. Ren, Junxiao Song, Zhihong Shao, W. Zhao, Haiming Wang, Bing Liu, Li Zhang, X. Lu, Q. Du, W. Gao, Qihao Zhu, Diyi Yang, Z. Gou, Z. F. Wu, F. Luo, C. Ruan - 2024

1 paper in library cites

DeepSeek-AI - 2024

1 paper in library cites

Yuzhi Wang, H. Li, X. Han, P. Nakov, T. Baldwin - 2023

1 paper in library cites

S. Krishna, K. Krishna, A. Mohananey, S. Schwarcz, A. Stambler, Shyam Upadhyay, Manaal Faruqui - 2024

1 paper in library cites

S. Welleck, X. Lu, P. West, F. Brahman, T. Shen, Daniel Khashabi, Yejin Choi - 2023

1 paper in library cites

Mantas Mazeika, L. Phan, X. Yin, Andy Zou, Zhengtao Wang, N. Mu, E. Sakhaee, N. Li, Steven Basart, Boxuan Li, D. A. Forsyth, Dan Hendrycks - 2024

1 paper in library cites

OpenAI - 2024

1 paper in library cites

AI@Meta - 2024

1 paper in library cites

Peng Wang, Lei Li, Zhihong Shao, Runxin Xu, Damai Dai, Yiwei Li, Deli Chen, Yonghui Wu, Zhifang Sui - 2023

1 paper in library cites

Hugging Face - 2025

1 paper in library cites

E. Zelikman, G. R. Harik, Y. Shao, V. Jayasiri, N. Haber, N. Goodman - 2024

1 paper in library cites

Qwen - 2024

1 paper in library cites

S. Hao, Y. Gu, H. Ma, J. J. Hong, Zhengtao Wang, D. Z. Wang, Z. Hu - 2023

1 paper in library cites

C. V. Snell, Jaehoon Lee, K. Xu, A. Kumar - 2025

1 paper in library cites

B. Vidgen, H. R. Kirk, R. Qian, N. Scherrer, A. Kannappan, S. A. Hale, P. Rottger - 2023

1 paper in library cites

Y. S. Sun, Xinpeng Wang, Ze Liu, John Miller, A. Efros, Moritz Hardt - 2020

1 paper in library cites

E. Akyurek, M. Damani, L. Qiu, H. Guo, Yoon Kim, Jacob Andreas - 2024

1 paper in library cites

Ze Liu, C. C. Chen, Wentao Li, T. Pang, C. Du, M. Lin - 2025

1 paper in library cites

J. Pan, J. Zhang, Xinpeng Wang, Lifan Yuan, H. Peng, A. Suhr - 2025

1 paper in library cites

A. Kumar, V. Zhuang, R. Agarwal, Yu Su, J. D. C. Reyes, A. Singh, K. Baumli, S. Iqbal, C. Bishop, R. Roelofs - 2024

1 paper in library cites

P. Rottger, H. Kirk, B. Vidgen, G. Attanasio, F. Bianchi, D. Hovy - 2024

1 paper in library cites

Bill Yuchen Lin - 2024

1 paper in library cites
