2017

Convolutional Sequence to Sequence Learning

J. Gehring, Michael Auli, D. Grangier, D. Yarats, Yann Dauphin

Cite Score: 72

AI summary

This paper introduces a sequence-to-sequence model built entirely from convolutional neural networks (CNNs), using gated linear units and a separate attention module in each decoder layer. It outperforms the deep LSTM setup of Wu et al. (2016) on WMT'14 English-German and English-French translation while running an order of magnitude faster on both GPU and CPU.

Main Contributions

  • Introduces a fully convolutional architecture for sequence-to-sequence learning, enabling parallelization and easier optimization compared to recurrent models.
  • Employs gated linear units to improve gradient propagation and equips each decoder layer with a separate attention module.
  • Achieves state-of-the-art results on WMT'16 English-Romanian translation, outperforming the winning entry by 1.9 BLEU.
  • Outperforms the deep LSTM setup of Wu et al. (2016) in accuracy on the WMT'14 English-German and English-French translation tasks.
  • Translates an order of magnitude faster than that LSTM setup on both GPU and CPU hardware.
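
The convolution-plus-gating idea in the contributions above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the helper names (`glu`, `conv_glu_layer`), the flat weight layout, and the symmetric zero padding are assumptions made for the example, and residual connections, embeddings, and the decoder's causal masking are left out. The gated linear unit splits a vector into halves A and B and returns A multiplied elementwise by sigmoid(B).

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def glu(vec):
    """Gated linear unit: split vec into halves A and B, return A * sigmoid(B)."""
    if len(vec) % 2 != 0:
        raise ValueError("GLU input must have even length")
    half = len(vec) // 2
    a, b = vec[:half], vec[half:]
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]


def conv_glu_layer(seq, weights, bias):
    """One convolutional block: 1-D convolution over time, then a GLU.

    seq:     list of T vectors, each of size d
    weights: 2*d rows, each a flat list of length k*d (hypothetical layout)
    bias:    list of 2*d values
    Symmetric zero padding (k assumed odd) keeps the output length at T.
    """
    d = len(seq[0])
    k = len(weights[0]) // d
    pad = k // 2
    zero = [0.0] * d
    padded = [zero] * pad + seq + [zero] * pad
    out = []
    for t in range(len(seq)):
        # Flatten the k neighbouring vectors into one window of length k*d.
        window = [x for vec in padded[t:t + k] for x in vec]
        pre = [sum(w * x for w, x in zip(row, window)) + b
               for row, b in zip(weights, bias)]
        out.append(glu(pre))  # 2*d pre-activations -> d gated outputs
    return out
```

Each output position depends only on a fixed window of inputs, so all positions could be computed in parallel, and a token passes through one non-linearity per layer regardless of sequence length.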

Abstract

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to recurrent models, computations over all elements can be fully parallelized during training to better exploit the GPU hardware, and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.
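
The attention module mentioned in the abstract can be illustrated with a simplified dot-product attention for a single decoder position. This is a sketch under stated assumptions rather than the paper's exact formulation: the function names are hypothetical, and the learned projections and embedding terms that a full model would apply around this step are omitted. In a multi-layer decoder, a function like `attend` would be called once per layer with that layer's own state.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(decoder_state, encoder_outputs):
    """Return (weights, context) for one decoder position.

    decoder_state:   vector of size d for the current target position
    encoder_outputs: list of source-side vectors, each of size d
    """
    scores = [dot(decoder_state, z) for z in encoder_outputs]
    weights = softmax(scores)
    dim = len(encoder_outputs[0])
    # Context is the attention-weighted sum of the encoder outputs.
    context = [sum(w * z[i] for w, z in zip(weights, encoder_outputs))
               for i in range(dim)]
    return weights, context
```

With a zero decoder state every source position receives the same score, so the weights are uniform and the context is the plain average of the encoder outputs.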

References [48]

K. He, X. Zhang, S. Ren, Jian Sun - 2016

20 papers in library cite

Sepp Hochreiter, Jürgen Schmidhuber - 1997

94 papers in library cite

S. Ioffe, Christian Szegedy - 2015

18 papers in library cite

N. Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov - 2014

20 papers in library cite

D. Bahdanau, Kyunghyun Cho, Yoshua Bengio - 2014

59 papers in library cite

Kyunghyun Cho, B. V. Merrienboer, C. G. Gulcehre, D. Bahdanau, F. Bougares, Holger Schwenk, Yoshua Bengio - 2014

38 papers in library cite

Ilya Sutskever, Oriol Vinyals, Quoc V. Le - 2014

58 papers in library cite

Yoshua Bengio - 2010

20 papers in library cite

K. He, X. Zhang, S. Ren, Jian Sun - 2015

10 papers in library cite

Chin Yew Lin - 2004

9 papers in library cite

Jeffrey L. Elman - 1990

23 papers in library cite

Jimmy Lei Ba, R. Kiros, Geoffrey E. Hinton - 2016

14 papers in library cite

T. Luong, H. Pham, Christopher D. Manning - 2015

15 papers in library cite

R. Sennrich, B. Haddow, Alexandra Birch - 2016

22 papers in library cite

Yonghui Wu, M. Schuster, Ziru Chen, Quoc V. Le, M. Norouzi, W. Macherey, M. Krikun, Yue Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. J. Johnson, Xiaodong Liu, Lukasz Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, Wenyi Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, Oriol Vinyals, G. S. Corrado, M. Hughes, Jeffrey Dean - 2016

15 papers in library cite

Razvan Pascanu, Tomas Mikolov, Yoshua Bengio - 2013

21 papers in library cite

Ilya Sutskever, James Martens, G. Dahl, Geoffrey Hinton - 2013

13 papers in library cite

A. H. Waibel, T. Hanazawa, Geoffrey Hinton, K. Shikano, K. Lang - 1989

13 papers in library cite

Alexander M. Rush, S. Chopra, Jason Weston - 2015

13 papers in library cite

Noam Shazeer, Azalia Mirhoseini, K. Maziarz, A. Davis, Quoc Le, Geoffrey Hinton, Jeffrey Dean - 2017

9 papers in library cite

S. Sukhbaatar, A. Szlam, Jason Weston, Rob Fergus - 2015

18 papers in library cite

R. Nallapati, B. Zhou, C. N. D. Santos, C. G. Gulcehre, Bing Xiang - 2016

10 papers in library cite

Clement Farabet - 2011

5 papers in library cite

M. Schuster, Kaisuke Nakajima - 2012

3 papers in library cite

Yann N. Dauphin, A. Fan, Michael Auli, D. Grangier - 2016

8 papers in library cite

T. Salimans, D. P. Kingma - 2016

4 papers in library cite

D. Ha, Andrew Dai, Quoc V. Le - 2016

3 papers in library cite

A. Miller, Adam Fisch, J. Dodge, A. Karimi, Antoine Bordes, Jason Weston - 2016

1 paper in library cites

N. Kalchbrenner, L. Espeholt, K. Simonyan, A. V. D. Oord, Alex Graves, Koray Kavukcuoglu - 2016

5 papers in library cite

J. Bradbury, S. Merity, Caiming Xiong, Richard Socher - 2016

1 paper in library cites

A. V. D. Oord, N. Kalchbrenner, Koray Kavukcuoglu - 2016

3 papers in library cite

J. Suzuki, M. Nagata - 2017

2 papers in library cite

Jingren Zhou, Yue Cao, Xinpeng Wang, P. L. Li, Weixin Xu - 2016

5 papers in library cite

R. Sennrich, B. Haddow, Alexandra Birch - 2016

5 papers in library cite

R. Parker, D. Graff, J. Kong, K. Chen, K. Maeda - 2011

5 papers in library cite

C. Dyer, V. Chahuneau, Noah A. Smith - 2013

4 papers in library cite

J. Chorowski, D. Bahdanau, D. Serdyuk, Kyunghyun Cho, Yoshua Bengio - 2015

3 papers in library cite

Yann Lecun, Yoshua Bengio - 1995

3 papers in library cite

Fanqing Meng, Z. L. Lu, Mingliang Wang, H. Li, W. Jiang, Qian Liu - 2015

3 papers in library cite

S. Jean, O. Firat, Kyunghyun Cho, R. Memisevic, Yoshua Bengio - 2015

3 papers in library cite

J. Gehring, Michael Auli, D. Grangier, Yann N. Dauphin - 2016

2 papers in library cite

P. Over, H. Dang, D. Harman - 2007

2 papers in library cite

A. Oord, N. Kalchbrenner, Oriol Vinyals, L. Espeholt, Alex Graves, Koray Kavukcuoglu - 2016

1 paper in library cites

O. Bojar, R. Chatterjee, C. Federmann, C. Graham, Y. Yvette, B. Haddow, M. Huck, A. J. Yepes, P. Koehn, V. Logacheva, C. Monz, M. Negri, A. Neveol, M. Neves, M. Popel, M. Post, R. Rubino, C. Scarton, L. Specia, M. Turchi, K. Verspoor, M. Zampieri - 2016

1 paper in library cites

S. Shen, Y. Zhao, Ze Liu, Maosong Sun - 2016

1 paper in library cites

Zhilin Yang, Z. Hu, Y. Deng, C. Dyer, A. Smola - 2016

1 paper in library cites

H. Mi, Zhengtao Wang, A. Ittycheriah - 2016

1 paper in library cites

G. L'hostis, D. Grangier, Michael Auli - 2016

1 paper in library cites

Cited by 3 papers in your library

Cites 32 papers in your library

Read on August 23, 2025
