2016

SQuAD: 100,000+ Questions for Machine Comprehension of Text

P. Rajpurkar, J. Zhang, K. Lopyrev, Percy Liang

AI summary

This paper introduces SQuAD, a reading comprehension dataset of 100,000+ questions written by crowdworkers about Wikipedia articles, where each answer is a segment of text from the corresponding passage. It analyzes the types of reasoning the questions require and builds a logistic regression model that reaches 51.0% F1, well above a simple baseline (20%) but far below human performance (86.8%), which leaves substantial room for future research.

Main Contributions

  • Introduces SQuAD, a large-scale reading comprehension dataset with 100,000+ questions and answers.
  • Systems must select each answer as a segment of the passage, rather than picking from a list of answer choices as in earlier datasets.
  • Analysis of the dataset reveals diverse question types and reasoning challenges.
  • A logistic regression model achieves an F1 score of 51.0% on SQuAD.
  • The dataset is made freely available to encourage further research in reading comprehension.
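
The span-selection setup in the contributions above can be illustrated with a minimal sketch. It assumes an answer is stored as character offsets into the passage; the `SpanAnswer` class, the `find_span` helper, and the example passage are hypothetical names and data chosen for illustration, not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanAnswer:
    """Hypothetical container: an answer as a character span of the passage."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset

    def text(self, passage: str) -> str:
        # The answer is always recoverable by slicing the passage itself.
        return passage[self.start:self.end]


def find_span(passage: str, answer: str) -> Optional[SpanAnswer]:
    """Locate the first occurrence of `answer` in `passage`, or return None."""
    start = passage.find(answer)
    if start == -1:
        return None
    return SpanAnswer(start, start + len(answer))


# Illustrative passage, not taken from the dataset.
passage = "The questions were posed by crowdworkers on Wikipedia articles."
span = find_span(passage, "crowdworkers")
```

Because the answer is a slice of the passage, a system has to choose among all possible spans instead of a handful of listed options, which is what separates this setting from multiple-choice datasets.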

Abstract

We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. We analyze the dataset to understand the types of reasoning required to answer the questions, leaning heavily on dependency and constituency trees. We build a strong logistic regression model, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%). However, human performance (86.8%) is much higher, indicating that the dataset presents a good challenge problem for future research. The dataset is freely available at https://stanford-qa.com.
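
The abstract reports F1 scores without saying how they are computed. The sketch below shows one common way to score a predicted span against reference answers: treat both as bags of tokens, compute their F1, and take the best score over the references. The lowercase-and-whitespace tokenization is a simplifying assumption, and the function names `token_f1` and `best_f1` are hypothetical; this is not the official evaluation script.

```python
from collections import Counter
from typing import List


def token_f1(prediction: str, gold: str) -> float:
    """Bag-of-tokens F1 between a predicted answer and one reference answer."""
    # Assumption: lowercase + whitespace split stands in for real normalization.
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Two empty answers match exactly; one empty answer scores zero.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, golds: List[str]) -> float:
    """Score a prediction against several references, keeping the best match."""
    return max(token_f1(prediction, g) for g in golds)
```

For example, predicting "in 1066" against the reference "1066" gives precision 0.5 and recall 1.0, so F1 is 2/3; the prediction gets partial credit for overlapping with the reference even though it is not an exact match.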
