2017

A Deep Reinforced Model for Abstractive Summarization

Romain Paulus, Caiming Xiong, Richard Socher

AI summary

This paper introduces an abstractive summarization model that combines a novel intra-attention mechanism with reinforcement learning to reduce repeated phrases in generated summaries. It reports a ROUGE-1 score of 41.16 on the CNN/Daily Mail dataset, state of the art at the time of publication, and also reports results on the New York Times dataset.
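
For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified illustration only: it assumes whitespace tokenization and lowercasing, skips the stemming and other options of the official ROUGE toolkit, and the function name `rouge1_f1` is hypothetical. Published scores such as 41.16 are usually this kind of value multiplied by 100 and averaged over a test set.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two strings."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat", "the cat sat on the mat"))
```

In this example every candidate word appears in the reference (precision 1.0), but only half of the reference words are covered (recall 0.5), so the F1 score sits between the two.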

Main Contributions

  • Introduces a novel intra-attention mechanism that attends over the input and continuously generated output separately.
  • Proposes a new training method that combines supervised word prediction and reinforcement learning (RL).
  • Achieves a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset, surpassing previous state-of-the-art models.
  • Demonstrates through human evaluation that the model produces higher-quality summaries.
  • Presents the first end-to-end model for abstractive summarization on the NYT dataset.
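
This page does not spell out how the intra-attention works. As a rough illustration of the input side only, the sketch below shows one way to discourage a decoder from attending to the same input tokens again and again: each raw attention score is divided by the exponentiated scores that the same input position received at earlier decoding steps, and the results are then normalized. The function name `temporal_attention` and the list-of-lists input layout are assumptions made for this example, not the authors' code.

```python
import math
from typing import List


def temporal_attention(score_history: List[List[float]]) -> List[float]:
    """Attention weights for the latest decoding step.

    score_history[t][i] is the raw score of decoding step t for input token i.
    Input tokens that scored highly at earlier steps are penalized.
    """
    current = score_history[-1]
    previous = score_history[:-1]
    penalized = []
    for i, score in enumerate(current):
        numerator = math.exp(score)
        if previous:
            # Sum of exponentiated past scores for the same input position.
            denominator = sum(math.exp(step[i]) for step in previous)
            penalized.append(numerator / denominator)
        else:
            # First decoding step: nothing to penalize yet.
            penalized.append(numerator)
    total = sum(penalized)
    return [p / total for p in penalized]


if __name__ == "__main__":
    # Token 0 was strongly attended at step 1, so it is down-weighted at step 2
    # even though both tokens have the same raw score there.
    print(temporal_attention([[2.0, 0.0], [1.0, 1.0]]))
```

A comparable attention over previously generated decoder states would cover the output side; it is omitted here to keep the sketch short.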

Abstract

Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries, however, these models often include repetitive and incoherent phrases. We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). Models trained only with supervised learning often exhibit “exposure bias” – they assume ground truth is provided at each step during training. However, when standard word prediction is combined with the global sequence prediction training of RL, the resulting summaries become more readable. We evaluate this model on the CNN/Daily Mail and New York Times datasets. Our model obtains a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset, an improvement over previous state-of-the-art models. Human evaluation also shows that our model produces higher-quality summaries.
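
The combination of supervised word prediction and RL described in the abstract can be pictured as a weighted sum of two losses. The sketch below is a minimal illustration under stated assumptions: the supervised term is the negative log-likelihood of the reference tokens, the RL term is a policy-gradient loss in which a sampled summary's reward (for example a ROUGE score) is compared with a baseline reward, and `gamma` is a mixing weight. All function names, the choice of baseline, and the numbers in the usage example are hypothetical and are not taken from this page.

```python
import math
from typing import Sequence


def ml_loss(gold_token_probs: Sequence[float]) -> float:
    """Supervised loss: negative log-likelihood of the reference tokens."""
    return -sum(math.log(p) for p in gold_token_probs)


def rl_loss(sampled_token_probs: Sequence[float],
            sampled_reward: float,
            baseline_reward: float) -> float:
    """Policy-gradient loss with a baseline.

    Minimizing it raises the log-probability of the sampled sequence
    when its reward exceeds the baseline, and lowers it otherwise.
    """
    log_prob = sum(math.log(p) for p in sampled_token_probs)
    return (baseline_reward - sampled_reward) * log_prob


def mixed_loss(ml: float, rl: float, gamma: float) -> float:
    """Weighted sum of the RL and supervised losses (gamma in [0, 1])."""
    return gamma * rl + (1.0 - gamma) * ml


if __name__ == "__main__":
    ml = ml_loss([0.5, 0.5])
    rl = rl_loss([0.5, 0.5], sampled_reward=0.4, baseline_reward=0.3)
    print(mixed_loss(ml, rl, gamma=0.5))
```

Setting `gamma` to 0 recovers purely supervised training and setting it to 1 gives purely reward-driven training; intermediate values trade readability from word-level supervision against direct optimization of the sequence-level reward.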
