Cite Score
74
AI summary
This paper introduces a recurrent neural network (RNN) model, the Recurrent Attention Model (RAM), that uses reinforcement learning to extract information from images by adaptively selecting a sequence of regions, outperforming a convolutional neural network baseline on cluttered image classification tasks and learning to track a simple object on a dynamic visual control problem.
Abstract
Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels. We present a novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution. Like convolutional neural networks, the proposed model has a degree of translation invariance built-in, but the amount of computation it performs can be controlled independently of the input image size. While the model is non-differentiable, it can be trained using reinforcement learning methods to learn task-specific policies. We evaluate our model on several image classification tasks, where it significantly outperforms a convolutional neural network baseline on cluttered images, and on a dynamic visual control problem, where it learns to track a simple object without an explicit training signal for doing so.
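The claim that computation can be controlled independently of image size can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `extract_glimpse` and `run_glimpses` are hypothetical, the glimpse is a single fixed-size crop, and a uniformly random location stands in for the learned policy that the abstract says is trained with reinforcement learning.

```python
import numpy as np


def extract_glimpse(image, center, size):
    """Crop a size x size patch around `center` (row, col), zero-filled outside the image.

    Only the overlapping region is read, so the cost depends on `size`,
    not on the dimensions of `image`. Assumes `center` lies inside the image.
    """
    half = size // 2
    top, left = int(center[0]) - half, int(center[1]) - half
    patch = np.zeros((size, size), dtype=image.dtype)
    # Clip the requested window to the image bounds.
    r0, r1 = max(top, 0), min(top + size, image.shape[0])
    c0, c1 = max(left, 0), min(left + size, image.shape[1])
    # Copy the valid region into the matching position of the patch.
    patch[r0 - top:r1 - top, c0 - left:c1 - left] = image[r0:r1, c0:c1]
    return patch


def run_glimpses(image, num_glimpses, size, rng):
    """Take a fixed number of glimpses and count the pixels actually processed.

    ASSUMPTION: locations are sampled uniformly at random here; in the model
    described above they would come from a learned, task-specific policy.
    """
    patches = []
    pixels_processed = 0
    for _ in range(num_glimpses):
        center = (rng.integers(0, image.shape[0]), rng.integers(0, image.shape[1]))
        patch = extract_glimpse(image, center, size)
        patches.append(patch)
        pixels_processed += patch.size
    return patches, pixels_processed


rng = np.random.default_rng(0)
small = rng.random((60, 60))
large = rng.random((200, 200))

# Same glimpse budget on both images: 6 glimpses of 8 x 8 pixels each.
_, cost_small = run_glimpses(small, num_glimpses=6, size=8, rng=rng)
_, cost_large = run_glimpses(large, num_glimpses=6, size=8, rng=rng)
print(cost_small, cost_large)
```

Both calls process the same number of pixels because the budget is set by the number of glimpses and the glimpse size, whereas a convolutional pass over the full image would scale with the image area. Choosing the glimpse location is a discrete, non-differentiable step, which is why the abstract turns to reinforcement learning for training.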