2020
AI summary
This paper critiques the current NLP leaderboard paradigm, arguing that its focus on performance-based evaluation neglects other model qualities the community values, such as compactness, fairness, and energy efficiency. It proposes making leaderboards more transparent by reporting statistics of practical concern (e.g., model size, energy efficiency, and inference latency) so that practitioners can better estimate a model's utility to them.
Abstract
Benchmarks such as GLUE have helped drive advances in NLP by incentivizing the creation of more accurate models. While this leaderboard paradigm has been remarkably successful, a historical focus on performance-based evaluation has been at the expense of other qualities that the NLP community values in models, such as compactness, fairness, and energy efficiency. In this opinion paper, we study the divergence between what is incentivized by leaderboards and what is useful in practice through the lens of microeconomic theory. We frame both the leaderboard and NLP practitioners as consumers and the benefit they get from a model as its utility to them. With this framing, we formalize how leaderboards in their current form can be poor proxies for the NLP community at large. For example, a highly inefficient model would provide less utility to practitioners but not to a leaderboard, since it is a cost that only the former must bear. To allow practitioners to better estimate a model's utility to them, we advocate for more transparency on leaderboards, such as the reporting of statistics that are of practical concern (e.g., model size, energy efficiency, and inference latency).
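The divergence described in the abstract can be illustrated with a small sketch. This is not the paper's formalization: the linear utility functions, the cost weights, and the two example models with their figures are all hypothetical, chosen only to show how a leaderboard that values accuracy alone can rank models differently from a practitioner who also bears latency and size costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    accuracy: float    # task accuracy in [0, 1] (hypothetical figure)
    latency_ms: float  # inference latency per example (hypothetical figure)
    size_mb: float     # model size on disk (hypothetical figure)


def leaderboard_utility(m: Model) -> float:
    # Assumption: the leaderboard bears none of the model's costs,
    # so its utility depends on accuracy alone.
    return m.accuracy


def practitioner_utility(
    m: Model, latency_cost: float = 0.001, size_cost: float = 0.0001
) -> float:
    # Assumption: a simple linear trade-off in which the practitioner
    # pays a cost per millisecond of latency and per megabyte of size.
    return m.accuracy - latency_cost * m.latency_ms - size_cost * m.size_mb


models = [
    Model("large", accuracy=0.92, latency_ms=200.0, size_mb=1300.0),
    Model("compact", accuracy=0.90, latency_ms=20.0, size_mb=250.0),
]

# Each "consumer" picks the model that maximizes its own utility.
leaderboard_pick = max(models, key=leaderboard_utility)
practitioner_pick = max(models, key=practitioner_utility)

print("leaderboard prefers: ", leaderboard_pick.name)
print("practitioner prefers:", practitioner_pick.name)
```

Under these made-up weights the leaderboard prefers the more accurate model while the practitioner prefers the cheaper one. A practitioner can only make this calculation if size and latency are reported alongside accuracy, which is the kind of transparency the abstract advocates.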