2025
Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation
Bowen Baker, J. Huizinga, Leo Gao, Z. Dou, M. Y. Guan, A. Madry, Wojciech Zaremba, J. Pachocki, D. Farhi