2023

Open Problems and Fundamental Limitations of Reinforcement Learning From Human Feedback

S. Casper, X. Davies, C. Shi, T. K. Gilbert, J. Scheurer, J. Rando, R. Freedman, T. Korbak, D. Lindner, P. Freire, T. T. Wang, S. Marks, C. R. Segerie, M. Carroll, A. Peng, P. J. K. Christoffersen, M. Damani, S. Slocum, U. Anwar, A. Siththaranjan, M. Nadeau, E. J. Michaud, J. Pfau, D. Krasheninnikov, X. Chen, L. Langosco, P. Hase, E. Biyik, A. D. Dragan, D. Krueger, D. Sadigh, D. Hadfield-Menell

citations

Cite Score

38

Tags

RLHF