2013
Cite Score
97
AI summary
This paper introduces extensions to the Skip-gram model, including subsampling frequent words, negative sampling, and a method for finding phrases in text, achieving improved vector quality and training speed. It demonstrates the model's ability to capture syntactic and semantic relationships, and its application to phrase representation.
Abstract
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.
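The three extensions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the formulas follow the forms given in the paper body, namely a discard probability of 1 - sqrt(t / f(w)) for subsampling, a unigram distribution raised to the 3/4 power for drawing negative samples, and a discounted bigram score for phrase detection. The default `delta=5.0` is an illustrative assumption, not a value taken from the paper.

```python
import math


def discard_probability(freq, t=1e-5):
    """Subsampling: chance of dropping one occurrence of a word.

    `freq` is the word's relative frequency in the corpus and `t` is the
    threshold (around 1e-5 in the paper). Words rarer than `t` are never
    dropped, so the result is clipped at 0.
    """
    return max(0.0, 1.0 - math.sqrt(t / freq))


def noise_distribution(counts, power=0.75):
    """Negative sampling: unigram counts raised to `power`, renormalised."""
    weights = {word: count ** power for word, count in counts.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


def phrase_score(bigram_count, count_a, count_b, delta=5.0):
    """Phrase detection: (count(a b) - delta) / (count(a) * count(b)).

    `delta` discounts bigrams built from very infrequent words; its value
    here is an assumption chosen for illustration.
    """
    return (bigram_count - delta) / (count_a * count_b)


if __name__ == "__main__":
    # A very frequent word is dropped most of the time.
    print(discard_probability(0.01))
    # The 3/4 power flattens the distribution relative to raw counts.
    print(noise_distribution({"the": 16, "cat": 1}))
    # Bigrams scoring above a chosen threshold are merged into one token.
    print(phrase_score(15, 10, 20))
```

In a full pipeline, bigrams whose score exceeds a chosen threshold would be merged into single tokens such as "Air Canada" before training, and the noise distribution would be sampled k times per positive training pair.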