2014
AI summary
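This paper evaluates CNNs for large-scale video classification on Sports-1M, a dataset of 1 million YouTube videos across 487 classes. It compares several ways of extending CNN connectivity in time and proposes a multiresolution, foveated architecture to speed up training. The best spatio-temporal networks clearly outperform feature-based baselines (55.3% to 63.9%) but improve only modestly on single-frame models (59.3% to 60.9%). Retraining the top layers of the best model on UCF-101 raises accuracy to 63.3%, up from the 43.9% baseline.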
Abstract
Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes. We study multiple approaches for extending the connectivity of a CNN in time domain to take advantage of local spatio-temporal information and suggest a multiresolution, foveated architecture as a promising way of speeding up the training. Our best spatio-temporal networks display significant performance improvements compared to strong feature-based baselines (55.3% to 63.9%), but only a surprisingly modest improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3% up from 43.9%).
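The multiresolution, foveated idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact preprocessing, so the details here are assumptions: a "fovea" input taken as the central crop at full resolution, a "context" input taken as the whole frame downsampled by 2x2 averaging, and a toy 8x8 single-channel frame. The function names are hypothetical and are not taken from the paper.

```python
def center_crop(frame, size):
    """Fovea input (assumed): central size x size region at full resolution."""
    h, w = len(frame), len(frame[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in frame[top:top + size]]


def downsample_2x(frame):
    """Context input (assumed): whole frame at half resolution, 2x2 averaging."""
    h, w = len(frame) // 2 * 2, len(frame[0]) // 2 * 2
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def foveated_inputs(frame):
    """Split one frame into two half-size inputs for the two streams."""
    half = min(len(frame), len(frame[0])) // 2
    return center_crop(frame, half), downsample_2x(frame)


# Toy 8x8 frame whose pixel value encodes its position (y * 8 + x).
frame = [[float(y * 8 + x) for x in range(8)] for y in range(8)]
fovea, context = foveated_inputs(frame)
```

In this sketch each stream receives a 4x4 input, so the two streams together process 32 values instead of the 64 in the full frame. That reduction in input size is one plausible reading of how such an architecture could speed up training.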