AI summary
This paper introduces an extension of the Neural GPU, an active-memory model, that matches existing attention models on neural machine translation. It clarifies the relationship between attention and active memory and shows that the extended model generalizes better than attention to longer sentences.
Abstract
Several mechanisms that focus a neural network's attention on selected parts of its input or memory have been used successfully in deep learning models in recent years. Attention has improved image classification, image captioning, speech recognition, generative models, and learning algorithmic tasks, but it has probably had the largest impact on neural machine translation. Recently, similar improvements have been obtained using alternative mechanisms that do not focus on a single part of a memory but operate on all of it in parallel, in a uniform way. Such a mechanism, which we call active memory, has improved over attention in algorithmic tasks, image processing, and generative modelling. So far, however, active memory has not improved over attention for most natural language processing tasks, in particular machine translation. We analyze this shortcoming in this paper and propose an extended model of active memory that matches existing attention models on neural machine translation and generalizes better to longer sentences. We investigate this model and explain why previous active memory models did not succeed. Finally, we discuss when active memory brings the most benefit and where attention can be a better choice.
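The distinction the abstract draws can be illustrated with a toy sketch. This is not the paper's architecture. The function names, the dot-product scoring, the width-3 filter, and the `tanh` nonlinearity are all assumptions chosen for illustration. The first function reads from memory by focusing on the slots that best match a query. The second rewrites every slot at once with the same local operation.

```python
import math

def attention_read(query, memory):
    """Soft attention (illustrative): score each memory slot against the
    query, normalize with softmax, and return one weighted-average vector."""
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory[0])
    read = [sum(w * slot[d] for w, slot in zip(weights, memory))
            for d in range(dim)]
    return read, weights

def active_memory_step(memory, kernel):
    """Active memory (illustrative): apply the same width-3 filter at every
    position, so the whole memory is rewritten in one uniform step.
    Assumption: one scalar weight per offset, shared across dimensions."""
    n, dim = len(memory), len(memory[0])
    zero = [0.0] * dim
    padded = [zero] + memory + [zero]  # zero-pad both ends
    new_memory = []
    for i in range(n):
        window = padded[i:i + 3]  # left neighbour, slot itself, right neighbour
        new_memory.append([
            math.tanh(sum(k * slot[d] for k, slot in zip(kernel, window)))
            for d in range(dim)
        ])
    return new_memory

memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
read, weights = attention_read([1.0, 0.0], memory)
updated = active_memory_step(memory, kernel=[0.25, 0.5, 0.25])
```

Here `attention_read` returns a single vector dominated by the best-matching slots, while `active_memory_step` returns a new memory of the same shape in which every slot has been updated.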
Read on August 4, 2025