2016

A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task

Danqi Chen, Jason Bolton, Christopher D. Manning

citations

Cite Score

28

AI summary

This paper examines the CNN/Daily Mail reading comprehension task. It shows that simple, carefully designed systems achieve state-of-the-art accuracies of 73.6% on the CNN dataset and 76.6% on the Daily Mail dataset, and it provides an in-depth analysis of the dataset.

Main Contributions

  • The paper provides a thorough examination of the CNN/Daily Mail reading comprehension task.
  • It demonstrates that simple, carefully designed systems can achieve high, state-of-the-art accuracies on the task.
  • The paper provides insights into the limitations of the dataset and the challenges of achieving further improvements.
  • It shows that current systems behave much more like single-sentence relation extraction systems than like text understanding systems that draw on larger discourse context.
  • The systems presented are close to the ceiling of performance for single-sentence and unambiguous cases of this dataset.

Abstract

Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of NLP. A key factor impeding its solution by machine learned systems is the limited availability of human-annotated data. Hermann et al. (2015) seek to solve this problem by creating over a million training examples by pairing CNN and Daily Mail news articles with their summarized bullet points, and show that a neural network can then be trained to give good performance on this task. In this paper, we conduct a thorough examination of this new reading comprehension task. Our primary aim is to understand what depth of language understanding is required to do well on this task. We approach this from one side by doing a careful hand-analysis of a small subset of the problems and from the other by showing that simple, carefully designed systems can obtain accuracies of 73.6% and 76.6% on these two datasets, exceeding current state-of-the-art results by 7-10% and approaching what we believe is the ceiling for performance on this task.
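The abstract says only that training examples come from pairing articles with their summary bullet points. As a rough illustration, and assuming a cloze-style setup in which one entity in a bullet is blanked out and must be recovered from the article, the pairing and the accuracy metric might be sketched as follows. The names `ClozeExample`, `make_cloze_example`, `accuracy`, and the `@placeholder` marker are hypothetical choices for this sketch, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed marker for the blanked-out entity; illustrative only.
PLACEHOLDER = "@placeholder"


@dataclass
class ClozeExample:
    passage: str   # the news article
    question: str  # a summary bullet with one entity blanked out
    answer: str    # the entity that was removed


def make_cloze_example(article: str, bullet: str, entity: str) -> ClozeExample:
    """Pair an article with one of its bullets, blanking a single entity mention."""
    if entity not in bullet:
        raise ValueError("entity must appear in the bullet point")
    if entity not in article:
        raise ValueError("entity must appear in the article")
    # Replace only the first mention so the question has exactly one blank.
    question = bullet.replace(entity, PLACEHOLDER, 1)
    return ClozeExample(passage=article, question=question, answer=entity)


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of questions whose predicted entity matches the gold answer."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


example = make_cloze_example(
    article="Alice Smith won the regional chess final on Sunday.",
    bullet="Alice Smith takes chess title",
    entity="Alice Smith",
)
```

Under these assumptions, `example.question` is the bullet with the entity replaced by the marker, and figures such as 73.6% and 76.6% would correspond to `accuracy` computed over a full test set.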

References [20]

Jeffrey Pennington, Richard Socher, Christopher D. Manning - 2014

31 papers in library cite

Kyunghyun Cho, B. V. Merrienboer, C. G. Gulcehre, D. Bahdanau, F. Bougares, Holger Schwenk, Yoshua Bengio - 2014

38 papers in library cite

T. Luong, H. Pham, Christopher D. Manning - 2015

15 papers in library cite

K. M. Hermann, T. Kocisky, Edward Grefenstette, L. Espeholt, W. Kay, M. Suleyman, Phil Blunsom - 2015

31 papers in library cite

S. Sukhbaatar, A. Szlam, Jason Weston, Rob Fergus - 2015

18 papers in library cite

Jason Weston, S. Chopra, Antoine Bordes - 2015

18 papers in library cite

Jason Weston, Antoine Bordes, S. Chopra, Tomas Mikolov - 2015

11 papers in library cite

M. Richardson, C. J. C. Burges, Erin Renshaw - 2013

16 papers in library cite

F. Hill, Antoine Bordes, S. Chopra, Jason Weston - 2015

14 papers in library cite

A. Kumar, O. Irsoy, P. Ondruska, M. Iyyer, J. Bradbury, I. Gulrajani, Victor Zhong, R. Paulus, Richard Socher - 2015

9 papers in library cite

R. Kadlec, M. Schmid, O. Bajgar, Jan Kleindienst - 2016

7 papers in library cite

Jonathan Berant, Vivek Srikumar, P. C. Chen, B. Huang, B. Manning, Peter Clark - 2014

4 papers in library cite

Danqi Chen, Christopher D. Manning - 2014

3 papers in library cite

S. Kobayashi, R. Tian, N. Okazaki, K. Inui - 2016

3 papers in library cite

Haiming Wang, Mohit Bansal, Kevin Gimpel, D. Mcallester - 2015

3 papers in library cite

Q. Wu, C. J. Burges, K. M. Svore, Jianfeng Gao - 2010

2 papers in library cite

M. Sachan, A. Dubey, E. P. Xing, M. Richardson - 2015

2 papers in library cite

P. Norvig - 1978

1 paper in library cites

M. Lee, X. He, W. T. Yih, Jianfeng Gao, L. Deng, P. Smolensky - 2016

1 paper in library cites

C. J. C. Burges - 2013

1 paper in library cites

Cited by 9 papers in your library

Cites 11 papers in your library
