2019

Energy and Policy Considerations for Deep Learning in NLP

E. Strubell, A. Ganesh, Andrew McCallum

Cite Score: 76

AI summary

This paper analyzes the financial and environmental costs of training various NLP models, finding that tuning a model for a new dataset can be extremely expensive. It recommends that researchers report training time and sensitivity to hyperparameters, and that the community prioritize computationally efficient hardware and algorithms.

Main Contributions

  • Quantifies the financial and environmental costs of training and developing various NLP models.
  • Analyzes the energy consumption of different NLP models and provides a comparison with familiar consumption metrics.
  • Proposes actionable recommendations to reduce costs and improve equity in NLP research and practice.
  • Highlights the need for reporting training time and sensitivity to hyperparameters for NLP models.
  • Emphasizes the importance of equitable access to computational resources for academic researchers.

Abstract

Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data. These models have obtained notable gains in accuracy across many NLP tasks. However, these accuracy improvements depend on the availability of exceptionally large computational resources that necessitate similarly substantial energy consumption. As a result these models are costly to train and develop, both financially, due to the cost of hardware and electricity or cloud compute time, and environmentally, due to the carbon footprint required to fuel modern tensor processing hardware. In this paper we bring this issue to the attention of NLP researchers by quantifying the approximate financial and environmental costs of training a variety of recently successful neural network models for NLP. Based on these findings, we propose actionable recommendations to reduce costs and improve equity in NLP research and practice.

Cited by 3 papers in your library

Cites 7 papers in your library

Read on November 23, 2025
