Papperoni

2018

Do CIFAR-10 Classifiers Generalize to CIFAR-10?

Vaishaal Shankar

Cite Score: 24

AI summary

This paper introduces a new CIFAR-10 test set to measure how well image classification models generalize. It finds a significant drop in accuracy (4% to 10%) across a broad range of deep learning models, which suggests that current accuracy numbers are brittle and susceptible to even minute natural variations in the data distribution.

Main Contributions

Introduces a new test set for CIFAR-10 that contains truly unseen images.
Evaluates the performance of 30 image classification models on the new test set, and shows a significant drop in accuracy (4% to 10%).
Shows that current accuracy numbers are brittle and susceptible to even minute natural variations in the data distribution.
Shows that the best performing models on the new test set see an increased advantage over more established baselines.
Indicates that current CIFAR-10 classifiers have difficulty generalizing to natural variations in image data.

Abstract

Machine learning is currently dominated by largely experimental work focused on improvements in a few key tasks. However, the impressive accuracy numbers of the best performing models are questionable because the same test sets have been used to select these models for multiple years now. To understand the danger of overfitting, we measure the accuracy of CIFAR-10 classifiers by creating a new test set of truly unseen images. Although we ensure that the new test set is as close to the original data distribution as possible, we find a large drop in accuracy (4% to 10%) for a broad range of deep learning models. Yet, more recent models with higher original accuracy show a smaller drop and better overall performance, indicating that this drop is likely not due to overfitting based on adaptivity. Instead, we view our results as evidence that current accuracy numbers are brittle and susceptible to even minute natural variations in the data distribution.
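The comparison described in the abstract can be sketched as follows: score the same classifier on the original and the new test set, then report the gap with a confidence interval for each accuracy. This is a minimal illustration, not the authors' evaluation code. The function names (`accuracy`, `wilson_interval`, `accuracy_gap`) and the toy predictions are assumptions made for this example, and the Wilson score interval is one common choice of interval rather than something the abstract specifies.

```python
import math


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def wilson_interval(acc, n, z=1.96):
    """Wilson score interval for an accuracy measured on n examples.

    z=1.96 corresponds to roughly 95% coverage.
    """
    denom = 1.0 + z * z / n
    center = (acc + z * z / (2 * n)) / denom
    half = z * math.sqrt(acc * (1 - acc) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def accuracy_gap(preds_orig, labels_orig, preds_new, labels_new):
    """Return (original accuracy, new accuracy, drop) for one model."""
    acc_orig = accuracy(preds_orig, labels_orig)
    acc_new = accuracy(preds_new, labels_new)
    return acc_orig, acc_new, acc_orig - acc_new


# Toy data (hypothetical): class indices predicted by one model on each test set.
labels_orig = [0, 1, 2, 3]
preds_orig = [0, 1, 2, 3]
labels_new = [0, 1, 2, 3]
preds_new = [0, 1, 2, 0]

acc_orig, acc_new, gap = accuracy_gap(preds_orig, labels_orig, preds_new, labels_new)
low, high = wilson_interval(acc_new, len(labels_new))
print(f"original={acc_orig:.3f} new={acc_new:.3f} drop={gap:.3f}")
print(f"95% interval for new-test accuracy: [{low:.3f}, {high:.3f}]")
```

With a test set of only a few thousand images, the interval around each accuracy is not negligible, so it is worth checking whether a measured drop exceeds what sampling noise alone would explain.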

References [22]

[1]Deep Residual Learning for Image Recognition

K. He, X. Zhang, S. Ren, Jian Sun - 2016

20 papers in library cite

This is simply amazing. Very very simple idea, totally revolutionary. No maths, just "it works!". Amazing.

[2]Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, Andrew Zisserman - 2014

20 papers in library cite

This is very good! The great thing here is small filters and depth analysis, but truly they do some other stuff as well: SotA, generalization for other tasks, and open source their models. Very nice.

[3]ImageNet Classification With Deep Convolutional Neural Networks

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton - 2012

71 papers in library cite

I'm giving this a 5 just because of the impact, but this is VEEERY derivative of earlier work. Kudos to them for putting it all together, but really there's nothing revolutionary here.

[4]Densely Connected Convolutional Networks

G. Huang, Zhuang Liu, Laurens van der Maaten, K. Weinberger - 2017

5 papers in library cite

I liked this paper so much! The way that it's written makes it very easy to follow. Results are nice, explanations are intuitive. Very nice!

[5]Learning Multiple Layers of Features From Tiny Images

Alex Krizhevsky - 2009

27 papers in library cite

It's alright. It mainly focuses on RBMs and their features and the actual part that describes the dataset is like 1 page. However, it's maybe the best intuitive description of an RBM I have seen. Other than that, it reads very much like an undergraduate thesis.

[6]WordNet: A Lexical Database for English

G. Miller - 1995

5 papers in library cite

Meh. It seems like it was published so that people could cite the dataset somehow. Nothing interesting, but a quick read and very widely used.

[7]Identity Mappings in Deep Residual Networks

K. He, X. Zhang, S. Ren, Jian Sun - 2016

4 papers in library cite

This paper is amazing - simple, easy to follow, intuitive, while also being impactful! ResNets are already a big leap forward, and they can improve on that.

[8]Improving Neural Networks by Preventing Co-Adaptation of Feature Detectors

Geoffrey E. Hinton, N. Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov - 2012

25 papers in library cite

Dropout, super impactful. The idea that you are training many estimators at once is also very nice.

[9]80 Million Tiny Images: A Large Dataset for Non-Parametric Object and Scene Recognition

Antonio Torralba, Rob Fergus, W. Freeman - 2008

8 papers in library cite

The initial part about data collection and dataset description was nice, but the classification part was a bit overkill.

[10]Aggregated Residual Transformations for Deep Neural Networks

Saining Xie, Ross Girshick, Piotr Dollar, Zhuowen Tu, K. He - 2017

3 papers in library cite

SOTA vision

[11]Wide Residual Networks

S. Zagoruyko, N. Komodakis - 2016

5 papers in library cite

Turn networks wide vs. deep

[12]Learning Transferable Architectures for Scalable Image Recognition

Barret Zoph, V. Vasudevan, J. Shlens, Quoc Le - 2017

2 papers in library cite

[13]Improved Regularization of Convolutional Neural Networks With Cutout

T. Devries, G. Taylor - 2017

1 paper in library cites

Regularization for CNNs (masking parts of the image)

[14]Shake-Shake Regularization

X. Gastaldi - 2017

2 papers in library cite

SotA Imagenet

[15]The Ladder: A Reliable Leaderboard for Machine Learning Competitions

A. Blum, Moritz Hardt - 2015

2 papers in library cite

Seems like a better leaderboard

[16]An Analysis of Single-Layer Networks in Unsupervised Feature Learning

A. Coates, A. Ng, Honglak Lee - 2011

7 papers in library cite

[17]Deep Pyramidal Residual Networks

D. Han, Jiwhan Kim, Junmo Kim - 2017

3 papers in library cite

[18]Regularized Evolution for Image Classifier Architecture Search

E. Real, A. Aggarwal, Yanping Huang, Quoc V. Le - 2018

3 papers in library cite

[19]Weighted Sums of Random Kitchen Sinks: Replacing Minimization With Randomization in Learning

A. Rahimi, Benjamin Recht - 2009

2 papers in library cite

[20]Generalization in Deep Learning

K. Kawaguchi, L. Kaelbling, Yoshua Bengio - 2017

1 paper in library cites

[21]Shakedrop Regularization

Y. Yamada, M. Iwamura, K. Kise - 2018

1 paper in library cites

[22]Very Deep Convolutional Neural Network Based Image Classification Using Small Training Sample Size

Shuying Liu, W. Deng - 2015

1 paper in library cites

Cited by: 2 papers in your library

Cites: 15 papers in your library

Read on November 10, 2025

It's a very nice analysis and it's interesting it took so long for people to see these problems.
