2024
Training Language Models to Self-Correct via Reinforcement Learning
A. Kumar, V. Zhuang, R. Agarwal, Y. Su, J. D. Co-Reyes, A. Singh, K. Baumli, S. Iqbal, C. Bishop, R. Roelofs