2018

Annotation Artifacts in Natural Language Inference Data

Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, Noah A. Smith

citations

Cite Score

43

AI summary

This paper studies annotation artifacts in the NLI datasets SNLI and MultiNLI. It shows that a fastText classifier given only the hypothesis, without the premise, predicts the correct label for about 67% of SNLI and 53% of MultiNLI. It identifies linguistic phenomena, such as negation and vagueness, that correlate with specific inference classes, and shows that NLI models rely heavily on these artifacts. This suggests that reported NLI performance is overestimated and that more balanced datasets are needed.

Main Contributions

  • Identified annotation artifacts in NLI datasets (SNLI and MultiNLI) that allow for hypothesis-only classification.
  • Showed that a simple text categorization model (fastText), given only the hypothesis and never the premise, correctly classifies about 67% of SNLI and 53% of MultiNLI.
  • Analyzed linguistic phenomena (negation, vagueness) correlated with specific inference classes.
  • Demonstrated that high-performing NLI models rely heavily on annotation artifacts for predictions.
  • Suggested that the success of NLI models may be overestimated due to the presence of annotation artifacts.
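
The correlation between individual words and inference classes mentioned above can be illustrated with a small sketch. It scores each (word, label) pair by pointwise mutual information (PMI), one common association measure. The function name `word_class_pmi`, the add-one smoothing value, and the toy hypotheses are assumptions made for illustration; none of them come from this page, and the toy data is not drawn from SNLI or MultiNLI.

```python
import math
from collections import Counter

def word_class_pmi(examples, smoothing=1.0):
    """Return {(word, label): PMI} for every word/label pair.

    `examples` is an iterable of (hypothesis, label) pairs.
    `smoothing` must be > 0 so that unseen pairs get a finite score.
    """
    pair_counts = Counter()
    word_counts = Counter()
    label_counts = Counter()
    for hypothesis, label in examples:
        # Count each word at most once per hypothesis.
        for word in set(hypothesis.lower().split()):
            pair_counts[(word, label)] += 1
            word_counts[word] += 1
            label_counts[label] += 1

    n_words, n_labels = len(word_counts), len(label_counts)
    # Add `smoothing` to every cell of the word-by-label table, so the
    # marginals below stay consistent with the joint probabilities.
    total = sum(pair_counts.values()) + smoothing * n_words * n_labels
    pmi = {}
    for word in word_counts:
        p_word = (word_counts[word] + smoothing * n_labels) / total
        for label in label_counts:
            p_label = (label_counts[label] + smoothing * n_words) / total
            p_joint = (pair_counts[(word, label)] + smoothing) / total
            pmi[(word, label)] = math.log(p_joint / (p_word * p_label))
    return pmi

# Invented toy hypotheses (illustrative only).
toy_hypotheses = [
    ("A man is sleeping", "contradiction"),
    ("Nobody is outside", "contradiction"),
    ("Nobody is eating", "contradiction"),
    ("A person is outdoors", "entailment"),
    ("Some people are outside", "entailment"),
    ("A tall man is waiting for a friend", "neutral"),
]
pmi = word_class_pmi(toy_hypotheses)
print(pmi[("nobody", "contradiction")], pmi[("nobody", "entailment")])
```

In this toy data the negation word "nobody" appears only in contradiction hypotheses, so its PMI with that class is positive and higher than with the other classes. On a real dataset the same table would be computed over the full training set.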

Abstract

Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. Specifically, we show that a simple text categorization model can correctly classify the hypothesis alone in about 67% of SNLI (Bowman et al., 2015) and 53% of MultiNLI (Williams et al., 2018). Our analysis reveals that specific linguistic phenomena such as negation and vagueness are highly correlated with certain inference classes. Our findings suggest that the success of natural language inference models to date has been overestimated, and that the task remains a hard open problem.
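
A hypothesis-only baseline of the kind the abstract describes can be sketched as follows. To stay dependency-free, this sketch swaps in a small multinomial naive Bayes classifier for the fastText model named on this page, so it is not the authors' setup. The class name `HypothesisOnlyNB`, the smoothing parameter `alpha`, and the toy premise/hypothesis pairs are hypothetical. The point is only that the premise is never read during training or prediction.

```python
import math
from collections import Counter, defaultdict

class HypothesisOnlyNB:
    """Multinomial naive Bayes over hypothesis words; the premise is ignored."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # add-alpha smoothing
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        # Each example is (premise, hypothesis, label); the premise is discarded.
        for _premise, hypothesis, label in examples:
            self.label_counts[label] += 1
            for word in hypothesis.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, hypothesis):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.label_counts[label] / total)
            for word in hypothesis.lower().split():
                if word in self.vocab:          # skip words unseen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy data (illustrative only, not taken from SNLI or MultiNLI).
train = [
    ("A man plays guitar on stage.", "nobody is playing music", "contradiction"),
    ("A woman walks her dog.", "the woman is not walking", "contradiction"),
    ("Two kids run in a park.", "some people are outside", "entailment"),
    ("A chef cooks in a kitchen.", "a person is cooking", "entailment"),
    ("A boy holds a ball.", "the boy is waiting for his tall friend", "neutral"),
    ("A girl reads a book.", "the girl is reading her favorite book", "neutral"),
]
model = HypothesisOnlyNB().fit(train)
prediction = model.predict("nobody is walking")
print(prediction)
```

If a model like this scores well above the majority-class rate on held-out data, the labels are partly predictable from the hypotheses alone, which is the kind of artifact the abstract reports.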

References [29]

P. Rajpurkar, J. Zhang, K. Lopyrev, Percy Liang - 2016

37 papers in library cite

Samuel R. Bowman, G. Angeli, Christopher Potts, Christopher D. Manning - 2015

25 papers in library cite

A. Williams, Nikita Nangia, S. Bowman - 2018

19 papers in library cite

K. M. Hermann, T. Kocisky, Edward Grefenstette, L. Espeholt, W. Kay, M. Suleyman, Phil Blunsom - 2015

31 papers in library cite

Ido Dagan, O. Glickman, Bernardo Magnini - 2005

19 papers in library cite

Alexis Conneau, Douwe Kiela, Holger Schwenk, L. Barrault, Antoine Bordes - 2017

11 papers in library cite

R. Jia, Percy Liang - 2017

11 papers in library cite

A. P. Parikh, O. Tackstrom, Dipanjan Das, Jakob Uszkoreit - 2016

11 papers in library cite

Marco Marelli, S. Menini, M. Baroni, L. Bentivogli, R. Bernardi, R. Zamparelli - 2014

7 papers in library cite

Danqi Chen, J. Bolton, Christopher D. Manning - 2016

9 papers in library cite

S. Antol, A. Agrawal, J. Lu, M. Mitchell, D. Batra, C. L. Zitnick, D. Parikh - 2015

6 papers in library cite

A. Poliak, J. Naradowsky, A. Haldar, R. Rudinger, B. V. Durme - 2018

5 papers in library cite

Roy Schwartz, Maarten Sap, I. Konstas, L. Zilles, Yejin Choi, Noah A. Smith - 2017

3 papers in library cite

R. Rudinger, C. May, B. V. Durme - 2017

3 papers in library cite

A. Agrawal, D. Batra, D. Parikh - 2016

2 papers in library cite

A. Jabri, Armand Joulin, Laurens Van Der Maaten - 2016

2 papers in library cite

Y. Goyal, Tejas Khot, D. Summers-Stay, D. Batra, D. Parikh - 2017

1 paper in library cites

N. Mostafazadeh, N. Chambers, X. He, D. Parikh, D. Batra, L. Vanderwende, P. Kohli, J. Allen - 2016

5 papers in library cite

A. Lai, J. Hockenmaier - 2014

5 papers in library cite

Armand Joulin, E. Grave, Piotr Bojanowski, Tomas Mikolov - 2017

4 papers in library cite

I. Dasgupta, Demi Guo, Andreas Stuhlmüller, S. J. Gershman, N. D. Goodman - 2018

2 papers in library cite

Y. Gong, H. Luo, J. Zhang - 2018

2 papers in library cite

Qian Chen, X. D. Zhu, Z. H. Ling, D. Inkpen, S. Wei - 2017

2 papers in library cite

Zheng Cai, L. Tu, Kevin Gimpel - 2017

2 papers in library cite

Omer Levy, Ido Dagan - 2016

1 paper in library cites

N. Zeichner, Jonathan Berant, Ido Dagan - 2012

1 paper in library cites

D. Lin, P. Pantel - 2001

1 paper in library cites

Omer Levy, S. Remus, C. Biemann, Ido Dagan - 2015

1 paper in library cites

V. Cirik, L. Morency, T. Berg-Kirkpatrick - 2018

1 paper in library cites

Cited by

6

papers in your library

Cites

17

papers in your library

Read

on December 30, 2025
