2021
WebGPT: Browser-Assisted Question-Answering With Human Feedback
Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeffrey Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman
AI summary
This paper introduces WebGPT, a GPT-3-based model fine-tuned for long-form question answering using a text-based web-browsing environment and human feedback. The model collects references while browsing to support its answers. On the ELI5 dataset, the best model's answers are preferred over those of human demonstrators 56% of the time and over the highest-voted Reddit answer 69% of the time.
Abstract
We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment, which allows the model to search and navigate the web. By setting up the task so that it can be performed by humans, we are able to train models on the task using imitation learning, and then optimize answer quality with human feedback. To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers. We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. Our best model is obtained by fine-tuning GPT-3 using behavior cloning, and then performing rejection sampling against a reward model trained to predict human preferences. This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit.
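The rejection sampling step mentioned in the abstract can be pictured as best-of-n selection: draw several candidate answers from the behavior-cloned policy, score each with the reward model, and keep the top one. The sketch below is only an illustration under stated assumptions. The names `best_of_n`, `toy_policy`, and `toy_reward` are hypothetical, the toy reward simply counts bracketed citation markers, and the value of `n` is arbitrary; none of this is taken from the paper's implementation.

```python
from itertools import cycle
from typing import Callable, Tuple


def best_of_n(
    question: str,
    sample_answer: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int = 4,
) -> Tuple[str, float]:
    """Draw n candidate answers and return the highest-scoring one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sample n independent candidates from the (assumed) policy.
    candidates = [sample_answer(question) for _ in range(n)]
    # Score every candidate with the (assumed) reward model.
    scores = [reward(question, answer) for answer in candidates]
    # Keep the candidate with the highest score; ties go to the earliest sample.
    best_index = max(range(n), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


# Hypothetical stand-in for a behavior-cloned policy: cycles through canned answers.
_CANNED_ANSWERS = cycle([
    "Because of scattering.",
    "Shorter wavelengths scatter more in air [1].",
    "Air molecules scatter blue light more than red [1][2].",
])


def toy_policy(question: str) -> str:
    return next(_CANNED_ANSWERS)


# Hypothetical stand-in for a reward model: counts citation markers such as "[1]".
def toy_reward(question: str, answer: str) -> float:
    return float(answer.count("["))


if __name__ == "__main__":
    answer, score = best_of_n("Why is the sky blue?", toy_policy, toy_reward, n=3)
    print(score, answer)
```

Swapping `toy_policy` and `toy_reward` for real model calls keeps the selection logic unchanged. A larger `n` trades extra inference compute for a better chance of sampling a high-reward answer.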