2016

Faster R-CNN: Towards Real-Time Object Detection With Region Proposal Networks

Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun

citations

Cite Score

97

AI summary

This paper introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals. The RPN is trained end-to-end and achieves state-of-the-art object detection accuracy on PASCAL VOC and MS COCO datasets with only 300 proposals per image.

Main Contributions

  • Introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals.
  • RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position.
  • RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection.
  • Achieves state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image.
  • Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks of ILSVRC and COCO 2015 competitions.
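
The "only 300 proposals per image" point can be pictured as a ranking step: score every candidate region, then pass only the best-scoring ones to the detector. The following is a minimal sketch under stated assumptions, not the authors' released code. The function name `keep_top_proposals`, the `(x1, y1, x2, y2)` box format, and the plain sort are all hypothetical choices made for illustration, and the sketch omits any overlap suppression or other filtering a real pipeline may apply.

```python
from typing import List, Tuple

# Hypothetical box format for illustration: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def keep_top_proposals(
    boxes: List[Box], scores: List[float], top_n: int = 300
) -> List[Tuple[Box, float]]:
    """Return the top_n (box, score) pairs, highest objectness score first."""
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    # Rank candidates by objectness score, best first, then truncate.
    ranked = sorted(zip(boxes, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (5, 5, 20, 20), (1, 1, 2, 2)]
    objectness = [0.2, 0.9, 0.5]
    for box, score in keep_top_proposals(candidates, objectness, top_n=2):
        print(box, score)
```

The default `top_n=300` mirrors the proposal budget quoted in the summary; every other detail here is an assumption.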

Abstract

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features. Using the recently popular terminology of neural networks with "attention" mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
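
To make "predicts object bounds and objectness scores at each position" concrete, here is a minimal sketch, assuming a tiny feature map stored as nested lists and a single set of weights reused at every spatial position (the defining property of a fully convolutional head). The name `rpn_like_head`, the weight layout, and the sigmoid on the score are illustrative assumptions. The paper's actual head has more structure than this, which the sketch does not attempt to reproduce.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Prediction = Tuple[float, Tuple[float, ...]]  # (objectness, four bound values)


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rpn_like_head(
    feature_map: Sequence[Sequence[Vector]],
    score_w: Vector,
    box_w: Sequence[Vector],
) -> List[List[Prediction]]:
    """Apply the same weights at every position of an H x W x C feature map.

    score_w: hypothetical weights producing one objectness logit per position.
    box_w:   hypothetical weights (four rows) producing four bound values.
    """
    outputs: List[List[Prediction]] = []
    for row in feature_map:
        out_row: List[Prediction] = []
        for feat in row:
            # Sigmoid squashes the logit into a (0, 1) objectness score.
            objectness = 1.0 / (1.0 + math.exp(-dot(score_w, feat)))
            bounds = tuple(dot(w, feat) for w in box_w)
            out_row.append((objectness, bounds))
        outputs.append(out_row)
    return outputs


if __name__ == "__main__":
    fmap = [[[1.0, 0.0], [0.0, 1.0]]]  # 1 x 2 positions, 2 channels
    preds = rpn_like_head(fmap, [0.0, 0.0], [[1, 0], [0, 1], [1, 1], [0, 0]])
    print(preds)
```

Because the weights do not depend on the position, the same function works on a feature map of any height and width, which is what lets such a head run on features already computed for the detector.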

References [39]

K. He, X. Zhang, S. Ren, Jian Sun - 2016

20 papers in library cite

K. Simonyan, Andrew Zisserman - 2014

20 papers in library cite

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton - 2012

71 papers in library cite

Christian Szegedy, Wei Liu, Y. Jia, P. Sermanet, S. Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich - 2015

20 papers in library cite

T. Y. Lin, M. Maire, S. Belongie, James Hays, Pietro Perona, D. Ramanan, Piotr Dollar, C. L. Zitnick - 2014

14 papers in library cite

J. Long, E. Shelhamer, Trevor Darrell - 2015

7 papers in library cite

Jian Sun - 2016

2 papers in library cite

Ross Girshick, J. Donahue, Trevor Darrell, Jitendra Malik - 2014

18 papers in library cite

Ross Girshick - 2015

2 papers in library cite

V. Nair, Geoffrey E. Hinton - 2010

18 papers in library cite

Matthew D. Zeiler, Rob Fergus - 2014

15 papers in library cite

Yann LeCun, B. Boser, John S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel - 1989

24 papers in library cite

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, Ross Girshick, S. Guadarrama, Trevor Darrell - 2014

12 papers in library cite

K. He, X. Zhang, S. Ren, Jian Sun - 2014

6 papers in library cite

P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, Rob Fergus, Yann LeCun - 2014

16 papers in library cite

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Zhiheng Huang, A. Karpathy, A. Khosla, M. Bernstein - 2014

18 papers in library cite

C. L. Zitnick, Piotr Dollar - 2014

2 papers in library cite

P. F. Felzenszwalb, Ross Girshick, D. McAllester, D. Ramanan - 2010

8 papers in library cite

Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John Winn, Andrew Zisserman - 2007

7 papers in library cite

J. Uijlings, K. van de Sande, T. Gevers, A. Smeulders - 2013

6 papers in library cite

Christian Szegedy, A. Toshev, Dumitru Erhan - 2013

4 papers in library cite

Dumitru Erhan, Christian Szegedy, A. Toshev, Dragomir Anguelov - 2014

4 papers in library cite

J. Chorowski, D. Bahdanau, D. Serdyuk, Kyunghyun Cho, Yoshua Bengio - 2015

3 papers in library cite

Jifeng Dai, K. He, Jian Sun - 2015

2 papers in library cite

J. Carreira, C. Sminchisescu - 2012

2 papers in library cite

B. Alexe, T. Deselaers, V. Ferrari - 2012

2 papers in library cite

P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, Jitendra Malik - 2014

2 papers in library cite

S. Ren, K. He, Ross Girshick, X. Zhang, Jian Sun - 2015

2 papers in library cite

J. H. Hosang, R. Benenson, Piotr Dollar, B. Schiele - 2015

2 papers in library cite

S. Song, Jianxiong Xiao - 2015

1 paper in library cites

Jiacheng Zhu, X. Chen, A. Yuille - 2015

1 paper in library cites

J. Johnson, A. Karpathy, Li Fei-Fei - 2015

1 paper in library cites

J. H. Hosang, R. Benenson, B. Schiele - 2014

1 paper in library cites

D. Kislyuk, Yibo Liu, D. Liu, E. Tzeng, Y. Jing - 2015

1 paper in library cites

Jifeng Dai, K. He, Jian Sun - 2015

1 paper in library cites

P. Pinheiro, Ronan Collobert, Piotr Dollar - 2015

1 paper in library cites

N. Chavali, H. Agrawal, A. Mahendru, D. Batra - 2015

1 paper in library cites

K. Lenc, A. Vedaldi - 2015

1 paper in library cites

Christian Szegedy, S. Reed, Dumitru Erhan, Dragomir Anguelov - 2015

1 paper in library cites
