2013

MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text

M. Richardson, C. J. C. Burges, Erin Renshaw

Cite Score

34

AI summary

This paper introduces MCTest, a crowd-sourced dataset of 500 fictional stories with 2000 multiple-choice questions, intended to advance open-domain machine comprehension. The task tests abilities such as causal reasoning, yields a clear metric because it is multiple-choice, and limits the world knowledge required by restricting stories and questions to those a young child would understand.

Main Contributions

  • Introduces MCTest, a freely available dataset for machine comprehension of text.
  • Presents a scalable crowd-sourcing method to construct a dataset of 500 stories and 2000 questions.
  • Provides a clear metric for advancement on the machine comprehension of text.
  • Introduces two baseline systems using lexical features, achieving approximately 58% accuracy on test data.
  • Releases the dataset to encourage research and innovation in machine comprehension.
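
To make the "clear metric" and "lexical features" points concrete, the following is a minimal sketch of a word-overlap answer picker scored by multiple-choice accuracy. It is an illustration only and is not the paper's baseline systems: the function names (`tokenize`, `pick_answer`, `accuracy`), the scoring rule, and the toy story are all assumptions made up for this example.

```python
import re
from typing import List, Sequence, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def pick_answer(story: str, question: str, choices: Sequence[str]) -> int:
    """Return the index of the choice sharing the most words with the story.

    Hypothetical scoring rule (not from the paper): words that already
    appear in the question are ignored, and ties go to the earliest choice.
    """
    story_tokens = tokenize(story)
    question_tokens = tokenize(question)
    scores: List[int] = []
    for choice in choices:
        choice_tokens = tokenize(choice) - question_tokens
        scores.append(len(choice_tokens & story_tokens))
    return scores.index(max(scores))


def accuracy(predictions: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of questions where the predicted choice index matches gold."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and equal length")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Toy example invented for illustration; it is not taken from MCTest.
story = "Tom had a red kite. He flew the kite in the park with his dog Max."
question = "Where did Tom fly his kite?"
choices = ["at the beach", "in the park", "at school", "in his room"]

predicted = pick_answer(story, question, choices)
print(choices[predicted])
```

Because every question has a fixed set of candidate answers, a system's output reduces to one index per question, so `accuracy` is all that is needed to compare systems on the same split.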

Abstract

We present MCTest, a freely available set of stories and associated questions intended for research on the machine comprehension of text. Previous work on machine comprehension (e.g., semantic modeling) has made great strides, but primarily focuses either on limited-domain datasets, or on solving a more restricted goal (e.g., open-domain relation extraction). In contrast, MCTest requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension. Reading comprehension can test advanced abilities such as causal reasoning and understanding the world, yet, by being multiple-choice, still provide a clear metric. By being fictional, the answer typically can be found only in the story itself. The stories and questions are also carefully limited to those a young child would understand, reducing the world knowledge that is required for the task. We present the scalable crowd-sourcing methods that allow us to cheaply construct a dataset of 500 stories and 2000 questions. By screening workers (with grammar tests) and stories (with grading), we have ensured that the data is the same quality as another set that we manually edited, but at one tenth the editing cost. By being open-domain, yet carefully restricted, we hope MCTest will serve to encourage research and provide a clear metric for advancement on the machine comprehension of text.
