2014

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, J. Donahue, Trevor Darrell, Jitendra Malik

Cite Score: 96

AI summary

This paper introduces R-CNN, a detection algorithm that applies convolutional neural networks (CNNs) to bottom-up region proposals. It reaches a mean average precision (mAP) of 53.3% on PASCAL VOC 2012, a relative improvement of more than 30% over the previous best result, and outperforms OverFeat on the 200-class ILSVRC2013 detection dataset.

Main Contributions

  • Proposed R-CNN, a simple and scalable detection algorithm.
  • Combined high-capacity convolutional neural networks (CNNs) with bottom-up region proposals for object localization and segmentation.
  • Demonstrated that supervised pre-training on an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost when labeled training data is scarce.
  • Achieved a mean average precision (mAP) of 53.3% on PASCAL VOC 2012, a relative improvement of more than 30% over the previous best result.
  • Outperformed OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset.
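
The "more than 30%" figure is a relative gain, not 30 percentage points. As a small worked sketch (the helper names below are illustrative, not from the paper), a 53.3% mAP combined with a relative gain of at least 30% implies that the previous best result was at most about 41% mAP:

```python
def relative_improvement(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, as a fraction (0.30 means 30%)."""
    return (new - old) / old


def max_baseline_for_gain(new: float, min_relative_gain: float) -> float:
    """Largest previous score consistent with at least `min_relative_gain`."""
    return new / (1.0 + min_relative_gain)


# 53.3% mAP and a relative gain of at least 30% (both figures from the abstract).
upper_bound = max_baseline_for_gain(53.3, 0.30)
print(f"previous best mAP is at most about {upper_bound:.1f}%")
```

The bound is only an implication of the two numbers quoted in the abstract; the exact previous best score is not stated on this page.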

Abstract

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects, and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also compare R-CNN to OverFeat, a recently proposed sliding-window detector based on a similar CNN architecture. We find that R-CNN outperforms OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
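
The first insight, scoring each region proposal with features computed on that region, can be illustrated with a toy sketch. This is not the authors' implementation: the boxes are hard-coded stand-ins for a proposal generator, `extract_features` is a placeholder for a CNN forward pass, and the random linear classifier, the 8x8 patch size, and all function names are assumptions made purely for illustration.

```python
import numpy as np


def crop_and_resize(image, box, size=8):
    """Crop box=(x0, y0, x1, y1) and resize it to size x size
    using nearest-neighbour sampling."""
    x0, y0, x1, y1 = box
    patch = image[y0:y1, x0:x1]
    rows = np.linspace(0, patch.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, patch.shape[1] - 1, size).round().astype(int)
    return patch[np.ix_(rows, cols)]


def extract_features(patch):
    """Placeholder for a CNN: simply flattens the fixed-size patch."""
    return patch.astype(np.float64).ravel()


def score_regions(image, boxes, weights, bias, size=8):
    """Return linear class scores with shape (num_boxes, num_classes)."""
    feats = np.stack(
        [extract_features(crop_and_resize(image, b, size)) for b in boxes]
    )
    return feats @ weights + bias


rng = np.random.default_rng(0)
image = rng.random((32, 48))  # toy grayscale image (height, width)
# Hypothetical proposals; a real system would generate these bottom-up.
boxes = [(0, 0, 16, 16), (10, 5, 40, 30), (20, 8, 48, 32)]
weights = rng.normal(size=(8 * 8, 3))  # toy classifier for 3 classes
bias = np.zeros(3)

scores = score_regions(image, boxes, weights, bias)
print(scores.shape)  # one row of class scores per proposal
```

The point of the sketch is the data flow: every proposal, whatever its shape, is mapped to a fixed-size input so that a single feature extractor and classifier can score all regions of an image.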
