2009

What Is the Best Multi-Stage Architecture for Object Recognition?

K. Jarrett, Koray Kavukcuoglu, Marc'Aurelio Ranzato, Yann LeCun

Cite Score: 65

AI summary

This paper explores multi-stage architectures for object recognition using filter banks, non-linear transformations, and feature pooling, evaluating different non-linearities, filter learning methods (random, unsupervised, supervised), and the number of stages, achieving state-of-the-art results on NORB and MNIST datasets.

Main Contributions

  • Showed that using non-linearities that include rectification and local contrast normalization is the single most important ingredient for good accuracy on object recognition benchmarks.
  • Showed that two stages of feature extraction yield better accuracy than one.
  • Showed that a two-stage system with random filters can yield almost 63% recognition rate on Caltech-101, provided that the proper non-linearities and pooling layers are used.
  • Achieved state-of-the-art performance on the NORB dataset (5.6% error) with supervised refinement.
  • Achieved the lowest known error rate on the undistorted, unprocessed MNIST dataset (0.53%) with unsupervised pre-training followed by supervised refinement.
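The first contribution singles out rectification and local contrast normalization. The following is a minimal NumPy sketch of those two operations applied to a stack of feature maps, not the authors' implementation. The function names, the uniform averaging window (in place of whatever weighting the paper uses), the window size, and the rule for flooring the divisor are all assumptions made for illustration.

```python
import numpy as np

def box_mean(x, k):
    """Mean over a k x k spatial window (zero-padded, k odd) for x of shape (C, H, W)."""
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[:, dy:dy + x.shape[1], dx:dx + x.shape[2]]
    return out / (k * k)

def rectify(x):
    """Absolute-value rectification: keep the magnitude, discard the sign."""
    return np.abs(x)

def local_contrast_normalize(x, k=5, eps=1e-6):
    """Subtractive then divisive normalization over space and across feature maps."""
    # Subtractive step: remove the local mean, pooled over all feature maps.
    local_mean = box_mean(x, k).mean(axis=0, keepdims=True)
    v = x - local_mean
    # Divisive step: divide by the local standard deviation.
    local_std = np.sqrt(box_mean(v ** 2, k).mean(axis=0, keepdims=True))
    # Assumption for this sketch: floor the divisor at its mean value so that
    # nearly flat regions are not amplified.
    floor = max(local_std.mean(), eps)
    return v / np.maximum(local_std, floor)

rng = np.random.default_rng(0)
maps = rng.standard_normal((4, 16, 16))   # 4 hypothetical feature maps
out = local_contrast_normalize(rectify(maps))
```

Because both the subtractive and divisive terms scale with the input, multiplying the feature maps by a positive constant leaves the output of this sketch essentially unchanged, which is the contrast invariance that normalization is meant to provide.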

Abstract

In many recent object recognition systems, feature extraction stages are generally composed of a filter bank, a non-linear transformation, and some sort of feature pooling layer. Most systems use only one stage of feature extraction in which the filters are hard-wired, or two stages where the filters in one or both stages are learned in supervised or unsupervised mode. This paper addresses three questions: 1. How do the non-linearities that follow the filter banks influence the recognition accuracy? 2. Does learning the filter banks in an unsupervised or supervised manner improve the performance over random filters or hard-wired filters? 3. Is there any advantage to using an architecture with two stages of feature extraction, rather than one? We show that using non-linearities that include rectification and local contrast normalization is the single most important ingredient for good accuracy on object recognition benchmarks. We show that two stages of feature extraction yield better accuracy than one. Most surprisingly, we show that a two-stage system with random filters can yield almost 63% recognition rate on Caltech-101, provided that the proper non-linearities and pooling layers are used. Finally, we show that with supervised refinement, the system achieves state-of-the-art performance on the NORB dataset (5.6%), and that unsupervised pre-training followed by supervised refinement produces good accuracy on Caltech-101 (> 65%) and the lowest known error rate on the undistorted, unprocessed MNIST dataset (0.53%).
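The stage structure described in the abstract (filter bank, non-linearity, pooling) and the idea of stacking two such stages with random filters can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the paper's configuration: the filter counts, the 5 x 5 filter size, the 2 x 2 average pooling, the tanh squashing, and the 32 x 32 input are all hypothetical choices, and local contrast normalization is left out to keep the example short.

```python
import numpy as np

def filter_bank(x, filters):
    """Valid cross-correlation. x: (C_in, H, W); filters: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = filters.shape
    h, w = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((c_out, h, w))
    for dy in range(k):
        for dx in range(k):
            patch = x[:, dy:dy + h, dx:dx + w]            # (C_in, h, w)
            # Mix input channels with the filter weights at this offset.
            out += np.tensordot(filters[:, :, dy, dx], patch, axes=([1], [0]))
    return out

def average_pool(x, size):
    """Non-overlapping size x size average pooling; trailing rows/columns are cropped."""
    c, h, w = x.shape
    h, w = h - h % size, w - w % size
    x = x[:, :h, :w]
    return x.reshape(c, h // size, size, w // size, size).mean(axis=(2, 4))

def stage(x, filters, pool):
    """One feature-extraction stage: filter bank -> tanh -> abs rectification -> pooling."""
    return average_pool(np.abs(np.tanh(filter_bank(x, filters))), pool)

rng = np.random.default_rng(0)
image = rng.standard_normal((1, 32, 32))           # hypothetical grayscale input
f1 = rng.standard_normal((8, 1, 5, 5)) * 0.1       # random, untrained stage-1 filters
f2 = rng.standard_normal((16, 8, 5, 5)) * 0.1      # random, untrained stage-2 filters

s1 = stage(image, f1, pool=2)    # shape (8, 14, 14)
s2 = stage(s1, f2, pool=2)       # shape (16, 5, 5)
features = s2.reshape(-1)        # flat vector that a classifier would consume
```

In a setup like this only the classifier on top of `features` would be trained when the filters stay random; learning `f1` and `f2` with unsupervised or supervised methods corresponds to the other conditions the abstract compares.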

