2013

Exploiting Similarities Among Languages for Machine Translation

Tomas Mikolov, Quoc V. Le, Ilya Sutskever

Cite Score

54

AI summary

This paper introduces a method that automates the generation and extension of dictionaries and phrase tables for machine translation. It learns a linear projection between the vector spaces of two languages from distributed word representations, and achieves almost 90% precision@5 for word translation between English and Spanish.

Main Contributions

  • Introduces a method that can automate the process of generating and extending dictionaries and phrase tables.
  • Translates missing word and phrase entries by learning language structures based on large monolingual data and mapping between languages from small bilingual data.
  • Uses distributed representation of words and learns a linear mapping between vector spaces of languages.
  • Achieves almost 90% precision@5 for translation of words between English and Spanish.
  • Makes few assumptions about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pair.

Abstract

Dictionaries and phrase tables are the basis of modern statistical machine translation systems. This paper develops a method that can automate the process of generating and extending dictionaries and phrase tables. Our method can translate missing word and phrase entries by learning language structures based on large monolingual data and mapping between languages from small bilingual data. It uses distributed representation of words and learns a linear mapping between vector spaces of languages. Despite its simplicity, our method is surprisingly effective: we can achieve almost 90% precision@5 for translation of words between English and Spanish. This method makes little assumption about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pairs.
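
The linear mapping and the precision@5 metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes that word vectors for both languages are already available as NumPy arrays and that a small seed dictionary pairs row i of the source matrix with row i of the target matrix. The mapping is fit here in closed form with least squares as a stand-in for whatever optimizer is actually used, candidates are ranked by cosine similarity, and the data is synthetic. All function names (`fit_linear_map`, `top_k_translations`, `precision_at_k`) are hypothetical and chosen only for this illustration.

```python
import numpy as np


def fit_linear_map(X, Z):
    """Return W minimizing ||X @ W - Z||^2 over the seed dictionary.

    X: (n, d_src) source-language vectors, Z: (n, d_tgt) target-language vectors.
    """
    W, *_ = np.linalg.lstsq(X, Z, rcond=None)
    return W  # shape (d_src, d_tgt)


def top_k_translations(x, W, target_vectors, k=5):
    """Indices of the k target words most cosine-similar to the projection of x."""
    projected = x @ W
    norms = np.linalg.norm(target_vectors, axis=1) * np.linalg.norm(projected)
    sims = (target_vectors @ projected) / np.maximum(norms, 1e-12)
    return np.argsort(-sims)[:k]


def precision_at_k(X_test, gold_indices, W, target_vectors, k=5):
    """Fraction of test words whose gold translation is among the top k candidates."""
    hits = sum(
        gold in top_k_translations(x, W, target_vectors, k)
        for x, gold in zip(X_test, gold_indices)
    )
    return hits / len(gold_indices)


# Synthetic stand-in for real embeddings (assumption: the two spaces are
# related by a noisy linear map, which is the premise being illustrated).
rng = np.random.default_rng(0)
n_words, d_src, d_tgt = 200, 20, 15
source_vectors = rng.normal(size=(n_words, d_src))
true_map = rng.normal(size=(d_src, d_tgt))
target_vectors = source_vectors @ true_map + 0.01 * rng.normal(size=(n_words, d_tgt))

# Use the first 150 word pairs as the small bilingual dictionary, the rest as test.
train = slice(0, 150)
test_ids = np.arange(150, n_words)
W = fit_linear_map(source_vectors[train], target_vectors[train])
p5 = precision_at_k(source_vectors[test_ids], test_ids, W, target_vectors, k=5)
print(f"precision@5 on synthetic data: {p5:.2f}")
```

With real embeddings the score depends on the quality of the monolingual vectors and the size of the seed dictionary, so the synthetic result above says nothing about the figures reported in the abstract.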
