2015

Semi-Supervised Sequence Learning

A. M. Dai, Quoc V. Le

Cite Score: 51

AI summary

This paper introduces two ways to use unlabeled data for sequence learning with recurrent networks: next-element prediction (a language model) and sequence autoencoders. LSTMs pretrained with either approach train more stably and generalize better, achieving strong performance on text classification tasks such as IMDB, DBpedia and 20 Newsgroups.

Main Contributions

  • Introduces the sequence autoencoder as a semi-supervised pretraining method for sequence learning with recurrent networks.
  • Demonstrates that pretraining LSTMs with sequence autoencoders improves stability and generalization.
  • Shows that LSTMs pretrained with sequence autoencoders achieve strong performance on text classification tasks.
  • Shows that using more unlabeled data from related tasks in the pretraining can improve the generalization of a subsequent supervised model.
  • Achieves state-of-the-art results on IMDB, DBpedia and 20 Newsgroups text classification datasets.

Abstract

We present two approaches that use unlabeled data to improve sequence learning with recurrent networks. The first approach is to predict what comes next in a sequence, which is a conventional language model in natural language processing. The second approach is to use a sequence autoencoder, which reads the input sequence into a vector and predicts the input sequence again. These two algorithms can be used as a “pretraining” step for a later supervised sequence learning algorithm. In other words, the parameters obtained from the unsupervised step can be used as a starting point for other supervised training models. In our experiments, we find that long short-term memory (LSTM) recurrent networks, after being pretrained with the two approaches, are more stable and generalize better. With pretraining, we are able to train LSTM recurrent networks on sequences of up to a few hundred timesteps, thereby achieving strong performance in many text classification tasks, such as IMDB, DBpedia and 20 Newsgroups.
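The two pretraining objectives in the abstract differ mainly in how inputs and targets are built from one unlabeled sequence. The sketch below illustrates this with plain token-id lists. It is a minimal illustration, not the authors' implementation: the `EOS` token id, the teacher-forced decoder inputs, the function names, and the dictionary-of-parameters layout are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

EOS = 0  # hypothetical token id marking the start of decoding (assumption)


def language_model_pairs(tokens: List[int]) -> Tuple[List[int], List[int]]:
    """Next-step prediction: the target at each position is the following token."""
    return tokens[:-1], tokens[1:]


def sequence_autoencoder_pairs(
    tokens: List[int],
) -> Tuple[List[int], List[int], List[int]]:
    """Sequence autoencoder: read the whole input, then predict the same sequence."""
    encoder_inputs = list(tokens)
    # Assumed teacher forcing: decoder inputs are the targets shifted right by one.
    decoder_inputs = [EOS] + tokens[:-1]
    decoder_targets = list(tokens)
    return encoder_inputs, decoder_inputs, decoder_targets


def init_from_pretrained(
    pretrained: Dict[str, list], classifier: Dict[str, list]
) -> Dict[str, list]:
    """Use pretrained parameters as the starting point for a supervised model.

    Only parameters whose names exist in both models are copied; task-specific
    parameters (for example a classification head) keep their own initial values.
    """
    initialized = dict(classifier)
    for name, value in pretrained.items():
        if name in initialized:
            initialized[name] = value
    return initialized


tokens = [5, 8, 2, 9]
lm_inputs, lm_targets = language_model_pairs(tokens)
enc_in, dec_in, dec_out = sequence_autoencoder_pairs(tokens)
```

In this sketch the language model never sees the final token as an input, while the autoencoder must reproduce the full sequence from the encoder's summary vector. `init_from_pretrained` stands in for the "starting point" step: a real model would copy embedding and LSTM weights in the same name-matched way.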

