2024
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
A. Ahmadian, C. Cremer, M. Gallé, M. Fadaee, J. Kreutzer, O. Pietquin, A. Üstün, S. Hooker
Cite Score: 27