2024
AI summary
This paper introduces DeepSeekMath 7B, a language model that continues pre-training on 120B math-related tokens and scores 51.7% on the MATH benchmark without external tools or voting. It also proposes Group Relative Policy Optimization (GRPO), a variant of PPO that improves mathematical reasoning while reducing memory usage.
Abstract
Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced from Common Crawl, together with natural language and code data. DeepSeekMath 7B has achieved an impressive score of 51.7% on the competition-level MATH benchmark without relying on external toolkits and voting techniques, approaching the performance level of Gemini-Ultra and GPT-4. Self-consistency over 64 samples from DeepSeekMath 7B achieves 60.9% on MATH. The mathematical reasoning capability of DeepSeekMath is attributed to two key factors: First, we harness the significant potential of publicly available web data through a meticulously engineered data selection pipeline. Second, we introduce Group Relative Policy Optimization (GRPO), a variant of Proximal Policy Optimization (PPO), that enhances mathematical reasoning abilities while concurrently optimizing the memory usage of PPO.
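The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. `majority_vote` shows self-consistency as a plain majority vote over sampled final answers. `group_relative_advantages` shows the group-relative idea as GRPO is commonly described: each sampled response's reward is normalized against the mean and standard deviation of its own group, instead of against a learned value model. The abstract does not spell out this formula, so the function names, the `eps` term, and the use of the population standard deviation are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev


def majority_vote(answers):
    """Self-consistency: return the most frequent final answer among samples."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and std.

    Assumption: population std plus a small eps to avoid division by zero
    when every response in the group receives the same reward.
    """
    if not rewards:
        raise ValueError("need at least one reward")
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical final answers extracted from several samples of one problem.
    print(majority_vote(["42", "41", "42", "7"]))
    # Hypothetical 0/1 correctness rewards for one group of sampled responses.
    print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```

Because the baseline comes from the group itself, no separate critic network has to be kept in memory, which matches the abstract's claim that GRPO reduces the memory usage of PPO.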