Papperoni

2024

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

A. Ahmadian, C. Cremer, M. Gallé, M. Fadaee, J. Kreutzer, O. Pietquin, A. Üstün, S. Hooker
