2014

One Weird Trick for Parallelizing Convolutional Neural Networks

Alex Krizhevsky

AI summary

This paper introduces a technique for parallelizing the training of convolutional neural networks across multiple GPUs. It applies data parallelism to the convolutional layers and model parallelism to the fully-connected layers, and the author reports that it scales better than existing alternatives.

Main Contributions

  • Introduces a hybrid parallelization strategy combining data parallelism for convolutional layers and model parallelism for fully-connected layers.
  • Presents three schemes for implementing model parallelism in fully-connected layers, analyzing their communication costs and suitability for different hardware configurations.
  • Shows that variable batch sizes, with smaller batches for fully-connected layers, can lead to faster convergence and better minima.
  • Reports experimental results on ImageNet 2012 demonstrating good scaling with the proposed parallelization scheme.
  • Discusses the accuracy cost of large batch sizes and how the variable batch size technique can reduce it.
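
The hybrid strategy in the first contribution can be illustrated with a toy forward pass. This is a minimal sketch, not the paper's implementation: the "convolutional" stage is stood in for by a plain matrix multiply, the workers are simulated sequentially in one process, and every name here (`hybrid_forward`, `split_rows`, `split_cols`, `conv_w`, `fc_w`) is hypothetical. It assumes the batch size and the fully-connected output width are both divisible by the number of workers `k`, and it uses a simple gather of all activations between the two stages, which is only one possible exchange pattern.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_rows(m, k):
    """Split a matrix into k contiguous row blocks (one batch shard per worker)."""
    step = len(m) // k
    return [m[i * step:(i + 1) * step] for i in range(k)]

def split_cols(m, k):
    """Split a matrix into k contiguous column blocks (one weight shard per worker)."""
    step = len(m[0]) // k
    return [[row[i * step:(i + 1) * step] for row in m] for i in range(k)]

def hybrid_forward(batch, conv_w, fc_w, k):
    """Simulate k workers: data-parallel first stage, model-parallel second stage."""
    # Data parallelism: every worker holds all of conv_w but sees only its batch shard.
    conv_out = [matmul(shard, conv_w) for shard in split_rows(batch, k)]
    # Exchange step (assumed here): gather the activations of all shards on every worker.
    gathered = [row for shard in conv_out for row in shard]
    # Model parallelism: every worker holds only a column slice of fc_w
    # and computes that slice of the outputs for the whole gathered batch.
    fc_parts = [matmul(gathered, w_slice) for w_slice in split_cols(fc_w, k)]
    # Concatenate the per-worker output columns back into full output rows.
    return [sum((part[i] for part in fc_parts), []) for i in range(len(gathered))]

# Small integer matrices so the comparison with the single-device result is exact.
random.seed(0)
batch = [[random.randint(-3, 3) for _ in range(4)] for _ in range(8)]
conv_w = [[random.randint(-3, 3) for _ in range(6)] for _ in range(4)]
fc_w = [[random.randint(-3, 3) for _ in range(4)] for _ in range(6)]

single_device = matmul(matmul(batch, conv_w), fc_w)
two_workers = hybrid_forward(batch, conv_w, fc_w, k=2)
print(two_workers == single_device)
```

The sketch only checks that sharding the batch in the first stage and sharding the weights in the second stage reproduces the single-device output. It says nothing about communication cost, which is what distinguishes the schemes the paper compares.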

Abstract

I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.
