2018

Universal Language Model Fine-Tuning for Text Classification

J. Howard, Sebastian Ruder

AI summary

This paper introduces ULMFiT, a transfer learning method for NLP that fine-tunes a pretrained 3-layer LSTM language model using discriminative fine-tuning, slanted triangular learning rates, and gradual unfreezing. It achieves state-of-the-art results on six text classification tasks and remains sample-efficient when labeled data is limited.

Main Contributions

  • Introduces Universal Language Model Fine-tuning (ULMFiT) for NLP tasks, enabling the kind of transfer learning that is standard in computer vision.
  • Proposes discriminative fine-tuning, slanted triangular learning rates, and gradual unfreezing to retain knowledge and avoid catastrophic forgetting.
  • Achieves state-of-the-art results on six text classification datasets, reducing error by 18-24% on most datasets.
  • Demonstrates sample-efficient transfer learning with extensive ablation analysis.
  • Releases pretrained models and code for wider adoption.
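
The three fine-tuning techniques named above can be sketched in plain Python. This is a minimal illustration, not the authors' released code: the function names are hypothetical, and the constants (cut_frac = 0.1, ratio = 32, a per-layer divisor of 2.6, a peak rate of 0.01) are the defaults reported in the ULMFiT paper rather than values stated on this page, so treat them as assumptions.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at 0-based iteration t.

    The rate rises linearly for the first cut_frac of training, then
    decays linearly back to lr_max / ratio by the final iteration.
    """
    # Iteration at which the schedule switches from increasing to decreasing.
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(lr_top, num_layers, divisor=2.6):
    """Per-layer learning rates, index 0 being the lowest layer.

    The top layer uses lr_top; each layer below it uses the rate of the
    layer above divided by `divisor`.
    """
    return [lr_top / divisor ** (num_layers - 1 - layer)
            for layer in range(num_layers)]


def trainable_layers(epoch, num_layers):
    """Indices of unfrozen layers at a 0-based epoch.

    Gradual unfreezing starts with only the last layer trainable and
    unfreezes one additional lower layer per epoch.
    """
    n_unfrozen = min(epoch + 1, num_layers)
    return list(range(num_layers - n_unfrozen, num_layers))


if __name__ == "__main__":
    total = 1000
    for step in (0, 50, 100, 500, 1000):
        print(step, slanted_triangular_lr(step, total))
    print(discriminative_lrs(0.01, num_layers=4))
    for epoch in range(4):
        print(epoch, trainable_layers(epoch, num_layers=4))
```

In a real training loop these values would typically feed per-layer optimizer parameter groups and a per-iteration scheduler. The sketch leaves that wiring out so that it stays independent of any particular framework.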

Abstract

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch. We propose Universal Language Model Fine-tuning (ULMFiT), an effective transfer learning method that can be applied to any task in NLP, and introduce techniques that are key for fine-tuning a language model. Our method significantly outperforms the state-of-the-art on six text classification tasks, reducing the error by 18-24% on the majority of datasets. Furthermore, with only 100 labeled examples, it matches the performance of training from scratch on 100× more data. We open-source our pretrained models and code.
