AI summary
This paper introduces an extension of the continuous skip-gram model that represents words as the sum of character n-grams, enabling the learning of word representations that take into account subword information. The model is evaluated on nine languages and achieves state-of-the-art results on word similarity and analogy tasks.
Abstract
Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. A vector representation is associated with each character n-gram, and words are represented as the sum of these representations. Our method is fast, allowing models to be trained on large corpora quickly, and it allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, on both word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.
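The core idea in the abstract, a word vector built as the sum of its character n-gram vectors, can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not stated in the abstract: the `<` and `>` boundary markers, the n-gram length range of 3 to 6, the hypothetical `SubwordEmbeddings` class, the randomly initialized (untrained) table, and the use of `zlib.crc32` as a stand-in hash to map n-grams to rows. It shows only how a vector can be composed for a word never seen in training, not how the vectors are learned.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word` after adding boundary markers.

    The markers and the default length range are illustrative assumptions.
    """
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


class SubwordEmbeddings:
    """Hypothetical container: one vector per hashed character n-gram."""

    def __init__(self, dim=8, buckets=1000, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        # Untrained, randomly initialized vectors; a real model would learn these.
        self.table = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(buckets)
        ]

    def bucket(self, gram):
        # crc32 is a deterministic stand-in hash chosen for this sketch.
        return zlib.crc32(gram.encode("utf-8")) % self.buckets

    def word_vector(self, word):
        """Sum the vectors of all character n-grams of `word`."""
        vec = [0.0] * self.dim
        for gram in char_ngrams(word):
            row = self.table[self.bucket(gram)]
            vec = [a + b for a, b in zip(vec, row)]
        return vec


emb = SubwordEmbeddings()
print(char_ngrams("where", 3, 3))
print(len(emb.word_vector("unseenword")))
```

Because the vector is composed from n-grams rather than looked up by whole word, `word_vector` returns a result for any string with at least one n-gram, including words absent from the training vocabulary. Words that share n-grams (for example a stem and its inflected forms) share terms in the sum.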