2017
Cite Score
54
AI summary
This paper introduces an adversarial evaluation method for the Stanford Question Answering Dataset (SQuAD). It shows that existing reading comprehension models are vulnerable to adversarial examples in which automatically generated distracting sentences are inserted into the input paragraph. Across sixteen published models, average F1 drops from 75% to 36% in this setting, and to 7% on four models when the adversary may add ungrammatical word sequences.
Abstract
Standard accuracy metrics indicate that reading comprehension systems are making rapid progress, but the extent to which these systems truly understand language remains unclear. To reward systems with real language understanding abilities, we propose an adversarial evaluation scheme for the Stanford Question Answering Dataset (SQuAD). Our method tests whether systems can answer questions about paragraphs that contain adversarially inserted sentences, which are automatically generated to distract computer systems without changing the correct answer or misleading humans. In this adversarial setting, the accuracy of sixteen published models drops from an average of 75% F1 score to 36%; when the adversary is allowed to add ungrammatical sequences of words, average accuracy on four models decreases further to 7%. We hope our insights will motivate the development of new models that understand language more precisely.
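The evaluation described in the abstract can be illustrated with a minimal sketch: score a model's answer with token-level F1 on the original paragraph, then again after a distracting sentence has been added. Everything below is an assumption made for illustration, not the authors' code: the normalization is a simplified version of SQuAD-style answer matching, the distractor is simply appended to the end of the paragraph, and `brittle_model`, the paragraph, and the distractor sentence are hypothetical toy stand-ins.

```python
import re
import string
from collections import Counter
from typing import Callable, List, Tuple


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.
    (Simplified answer normalization; an assumption, not the official script.)"""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between two answer strings."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def adversarial_f1(
    model: Callable[[str, str], str],
    paragraph: str,
    question: str,
    gold: str,
    distractor: str,
) -> Tuple[float, float]:
    """Return (F1 on the clean paragraph, F1 with the distractor appended)."""
    clean = token_f1(model(paragraph, question), gold)
    attacked = token_f1(model(paragraph + " " + distractor, question), gold)
    return clean, attacked


def brittle_model(paragraph: str, question: str) -> str:
    """Hypothetical toy model: answers with the last capitalized word in the
    paragraph that does not already appear in the question."""
    question_tokens = set(normalize(question))
    candidates = [
        word.strip(string.punctuation)
        for word in paragraph.split()
        if word[0].isupper()
    ]
    candidates = [c for c in candidates if c.lower() not in question_tokens]
    return candidates[-1] if candidates else ""


# Made-up example data, not taken from SQuAD.
paragraph = "Tesla moved to Prague in 1880."
question = "Which city did Tesla move to in 1880?"
distractor = "Edison moved to the city of Chicago in 1881."

clean, attacked = adversarial_f1(brittle_model, paragraph, question, "Prague", distractor)
print(f"clean F1 = {clean:.2f}, adversarial F1 = {attacked:.2f}")
```

The distractor leaves the correct answer unchanged for a human reader, yet the toy model's score falls because it relies on a surface cue. Averaging the two scores over a dataset would give the kind of clean-versus-adversarial comparison the abstract reports.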