2016
AI summary
This paper introduces LAMBADA, a dataset that evaluates language understanding through a word prediction task in which models must track information across the broader discourse rather than the local context alone. The dataset includes the raw text of 2662 novels for training language models. None of several state-of-the-art language models reaches accuracy above 1% on the benchmark.
Abstract
We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the whole passage, but not if they only see the last sentence preceding the target word. To succeed on LAMBADA, computational models cannot simply rely on local context, but must be able to keep track of information in the broader discourse. We show that LAMBADA exemplifies a wide range of linguistic phenomena, and that none of several state-of-the-art language models reaches accuracy above 1% on this novel benchmark. We thus propose LAMBADA as a challenging test set, meant to encourage the development of new models capable of genuine understanding of broad context in natural language text.
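The word prediction task described in the abstract can be illustrated with a minimal sketch of how last-word accuracy might be scored. This is not the authors' evaluation code: the function names `split_passage` and `last_word_accuracy`, the whitespace tokenization, and the exact-match scoring rule are all assumptions made for illustration, and `predict` stands in for any hypothetical model that maps a context string to a single guessed word.

```python
from typing import Callable, Iterable, Tuple


def split_passage(passage: str) -> Tuple[str, str]:
    """Split a passage into (context, target), where target is the last word.

    Assumption: words are separated by single spaces (whitespace tokenization).
    """
    context, _, target = passage.strip().rpartition(" ")
    return context, target


def last_word_accuracy(
    passages: Iterable[str],
    predict: Callable[[str], str],
) -> float:
    """Return the fraction of passages whose last word is guessed exactly.

    `predict` is a hypothetical model: it receives the passage without its
    final word and returns one guessed word.
    """
    total = 0
    correct = 0
    for passage in passages:
        context, target = split_passage(passage)
        total += 1
        correct += int(predict(context) == target)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Toy passages and a toy predictor that always guesses the first word
    # of the context; neither comes from the LAMBADA data.
    toy_passages = ["a b c", "x y x"]
    first_word = lambda context: context.split()[0]
    print(last_word_accuracy(toy_passages, first_word))
```

In this sketch a model is scored only on exact recovery of the final word, so a predictor that looks only at the last sentence and one that reads the whole passage can be compared by passing each as `predict`.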