Cite Score
32
AI summary
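This paper analyzes the output embedding (the topmost weight matrix) of neural network language models and shows that it is a valid word embedding. It proposes tying the input and output embeddings and introduces a method for regularizing the output embedding. Both reduce perplexity across a variety of neural network language models, including LSTM-based ones, and weight tying shrinks neural translation models to less than half their original size without harming performance.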
Abstract
We study the topmost weight matrix of neural network language models. We show that this matrix constitutes a valid word embedding. When training language models, we recommend tying the input embedding and this output embedding. We analyze the resulting update rules and show that the tied embedding evolves in a more similar way to the output embedding than to the input embedding in the untied model. We also offer a new method of regularizing the output embedding. Our methods lead to a significant reduction in perplexity, as we are able to show on a variety of neural network language models. Finally, we show that weight tying can reduce the size of neural translation models to less than half of their original size without harming their performance.
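To make the idea of tying concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the `TinyLM` class, its method names, and the toy sizes are all hypothetical, and the recurrent layers of a real language model are left out so that the input embedding feeds the output layer directly. It only illustrates that, when tied, a single matrix serves both as the input lookup table and as the output projection, which halves the number of embedding parameters.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a small random matrix as a list of row vectors."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLM:
    """Toy next-word model (hypothetical, for illustration only).

    Assumption: no hidden layers, so the input word's embedding is
    scored directly against every row of the output embedding.
    """

    def __init__(self, vocab_size, dim, tied, seed=0):
        rng = random.Random(seed)
        self.tied = tied
        self.input_emb = make_matrix(vocab_size, dim, rng)
        # Tied: the output embedding is the very same matrix object.
        # Untied: a second, independently initialised matrix.
        self.output_emb = (
            self.input_emb if tied else make_matrix(vocab_size, dim, rng)
        )

    def num_embedding_params(self):
        """Count embedding parameters, counting a shared matrix once."""
        rows, cols = len(self.input_emb), len(self.input_emb[0])
        return rows * cols if self.tied else 2 * rows * cols

    def next_word_probs(self, word_id):
        """Distribution over the vocabulary given one input word."""
        h = self.input_emb[word_id]
        scores = [sum(a * b for a, b in zip(row, h)) for row in self.output_emb]
        return softmax(scores)


untied = TinyLM(vocab_size=10, dim=4, tied=False)
tied = TinyLM(vocab_size=10, dim=4, tied=True)
probs = tied.next_word_probs(3)
```

Because the tied model holds one matrix under two names, any update to `input_emb` is also an update to `output_emb`. In a real model the same sharing is applied to the embedding layer and the final softmax layer, with the hidden layers in between.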