2019
AI summary
This paper introduces the HANS dataset to test whether NLI models rely on fallible syntactic heuristics. Models such as BERT do rely on them and perform poorly on HANS. Augmenting the training data improves performance, which indicates room for progress in NLI systems by addressing these biases.
Abstract
A machine learning system can score well on a given test set by relying on heuristics that are effective for frequent example types but break down in more challenging cases. We study this issue within natural language inference (NLI), the task of determining whether one sentence entails another. We hypothesize that statistical NLI models may adopt three fallible syntactic heuristics: the lexical overlap heuristic, the subsequence heuristic, and the constituent heuristic. To determine whether models have adopted these heuristics, we introduce a controlled evaluation set called HANS (Heuristic Analysis for NLI Systems), which contains many examples where the heuristics fail. We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted these heuristics. We conclude that there is substantial room for improvement in NLI systems, and that the HANS dataset can motivate and measure progress in this area.
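The first two heuristics named in the abstract can be illustrated with a minimal Python sketch. It assumes simple lowercase word tokenization, and the function names and example sentences are hypothetical illustrations rather than the authors' code or data. The constituent heuristic is left out because it requires a syntactic parse of the premise.

```python
import re


def tokens(sentence):
    """Lowercase word tokens with punctuation dropped (assumed tokenization)."""
    return re.findall(r"[a-z']+", sentence.lower())


def lexical_overlap(premise, hypothesis):
    """True when every hypothesis word also occurs somewhere in the premise."""
    return set(tokens(hypothesis)) <= set(tokens(premise))


def subsequence(premise, hypothesis):
    """True when the hypothesis is a contiguous run of premise tokens."""
    p, h = tokens(premise), tokens(hypothesis)
    return any(p[i:i + len(h)] == h for i in range(len(p) - len(h) + 1))


def heuristic_label(premise, hypothesis):
    """Label a pair the way a model following lexical overlap would."""
    if lexical_overlap(premise, hypothesis):
        return "entailment"
    return "non-entailment"


# Illustrative pairs in which the heuristic applies but the premise
# does not actually entail the hypothesis.
passive = ("The doctor was paid by the actor.", "The doctor paid the actor.")
modifier = ("The doctor near the actor danced.", "The actor danced.")

print(lexical_overlap(*passive), subsequence(*passive))
print(lexical_overlap(*modifier), subsequence(*modifier))
print(heuristic_label(*passive))
```

In both pairs the hypothesis is built entirely from premise words, so a model following the lexical overlap heuristic would predict entailment even though neither premise entails its hypothesis. This is the kind of case the abstract describes HANS as containing.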