AI summary
This paper introduces two semi-supervised learning approaches for sequence learning with recurrent networks: predicting the next sequence element and sequence autoencoders. LSTMs pretrained with these approaches show improved stability and generalization, achieving strong performance on text classification tasks such as IMDB, DBpedia and 20 Newsgroups.
Abstract
We present two approaches that use unlabeled data to improve sequence learning with recurrent networks. The first approach is to predict what comes next in a sequence, which is a conventional language model in natural language processing. The second approach is to use a sequence autoencoder, which reads the input sequence into a vector and then predicts the input sequence again. These two algorithms can be used as a “pretraining” step for a later supervised sequence learning algorithm. In other words, the parameters obtained from the unsupervised step can be used as a starting point for other supervised training models. In our experiments, we find that long short-term memory (LSTM) recurrent networks pretrained with the two approaches are more stable and generalize better. With pretraining, we are able to train LSTM networks on sequences of up to a few hundred timesteps, thereby achieving strong performance in many text classification tasks, such as IMDB, DBpedia and 20 Newsgroups.
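The two pretraining objectives and the parameter hand-off described in the abstract can be illustrated with a minimal, dependency-free sketch. It only shows how training targets might be laid out and how pretrained parameters could seed a supervised model; it is not the authors' implementation. The function names, the `<eos>` marker, the teacher-forcing layout of the decoder input, and the dictionary-of-parameters representation are all assumptions made for illustration.

```python
import copy

EOS = "<eos>"  # hypothetical end-of-sequence marker (assumption, not from the source)


def language_model_pairs(tokens):
    """Next-step prediction: pair each token with the token that follows it."""
    return list(zip(tokens[:-1], tokens[1:]))


def autoencoder_example(tokens):
    """Sequence autoencoder: the encoder reads the sequence into a vector and
    the decoder is trained to reproduce the same sequence.

    Assumed layout: the decoder input is the target shifted right by one
    position (teacher forcing), starting from EOS.
    """
    encoder_input = list(tokens)
    decoder_input = [EOS] + list(tokens)
    decoder_target = list(tokens) + [EOS]
    return encoder_input, decoder_input, decoder_target


def init_from_pretrained(pretrained, classifier):
    """Use pretrained parameters as the starting point for a supervised model.

    Parameters whose names appear in both dictionaries are copied from the
    pretrained model; parameters that exist only in the classifier (for
    example its output layer) keep their fresh initialization.
    """
    initialized = copy.deepcopy(classifier)
    for name, value in pretrained.items():
        if name in initialized:
            initialized[name] = copy.deepcopy(value)
    return initialized
```

For example, calling `init_from_pretrained` with a pretrained dictionary holding `embedding`, `lstm`, and `lm_softmax` entries and a classifier dictionary holding `embedding`, `lstm`, and `classifier_head` entries would overwrite only the shared `embedding` and `lstm` entries, leaving `classifier_head` untouched. Supervised training would then proceed from that starting point.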