2017
Cite Score: 72
AI summary
This paper introduces a sequence-to-sequence model built entirely from convolutional neural networks (CNNs), using gated linear units and a separate attention module in each decoder layer. It outperforms the deep LSTM setup of Wu et al. (2016) on WMT'14 English-German and English-French translation while running an order of magnitude faster on both GPU and CPU.
Abstract
The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks. We introduce an architecture based entirely on convolutional neural networks. Compared to recurrent models, computations over all elements can be fully parallelized during training to better exploit the GPU hardware and optimization is easier since the number of non-linearities is fixed and independent of the input length. Our use of gated linear units eases gradient propagation and we equip each decoder layer with a separate attention module. We outperform the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation at an order of magnitude faster speed, both on GPU and CPU.
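To make the gated linear unit mentioned in the abstract concrete, here is a minimal pure-Python sketch of one convolution followed by a gate, following the usual formulation of Dauphin et al. (2016): output = A * sigmoid(B), where A and B are two convolutions of the same input. It is an illustration, not the authors' implementation. The function names, the scalar-per-position input, and the symmetric zero padding are assumptions made for brevity, and the residual connections and attention modules of the full model are left out.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function used as the gate."""
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_same(seq: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1-D convolution with zero padding so the output length equals the input length.

    Assumption for this sketch: odd kernel width and symmetric padding.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel width")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    # Each output position depends only on a fixed window of the input,
    # so all positions could be computed in parallel.
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(seq))
    ]


def conv_glu_block(
    seq: Sequence[float],
    value_kernel: Sequence[float],
    gate_kernel: Sequence[float],
) -> List[float]:
    """Hypothetical helper: convolution followed by a gated linear unit.

    One kernel produces the values A, the other the gates B;
    the result is A * sigmoid(B), element by element.
    """
    values = conv1d_same(seq, value_kernel)
    gates = conv1d_same(seq, gate_kernel)
    return [v * sigmoid(g) for v, g in zip(values, gates)]
```

For example, `conv_glu_block([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0])` returns `[0.5, 1.0, 1.5]`: the value kernel passes the input through unchanged and every gate is sigmoid(0) = 0.5. However long the sequence, each element passes through exactly one non-linearity per layer, which is the property the abstract credits with making optimization easier.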
References [48]
K. He, X. Zhang, S. Ren, Jian Sun - 2016
20 papers in library cite
Sepp Hochreiter, Jürgen Schmidhuber - 1997
94 papers in library cite
S. Ioffe, Christian Szegedy - 2015
18 papers in library cite
N. Srivastava, Geoffrey E. Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov - 2014
20 papers in library cite
D. Bahdanau, Kyunghyun Cho, Yoshua Bengio - 2014
59 papers in library cite
Kyunghyun Cho, B. V. Merrienboer, C. G. Gulcehre, D. Bahdanau, F. Bougares, Holger Schwenk, Yoshua Bengio - 2014
38 papers in library cite
Ilya Sutskever, Oriol Vinyals, Quoc V. Le - 2014
58 papers in library cite
Yoshua Bengio - 2010
20 papers in library cite
K. He, X. Zhang, S. Ren, Jian Sun - 2015
10 papers in library cite
Chin Yew Lin - 2004
9 papers in library cite
Jeffrey L. Elman - 1990
23 papers in library cite
Jimmy Lei Ba, R. Kiros, Geoffrey E. Hinton - 2016
14 papers in library cite
T. Luong, H. Pham, Christopher D. Manning - 2015
15 papers in library cite
R. Sennrich, B. Haddow, Alexandra Birch - 2016
22 papers in library cite
Yonghui Wu, M. Schuster, Zhifeng Chen, Quoc V. Le, M. Norouzi, W. Macherey, M. Krikun, Yuan Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. J. Johnson, Xiaobing Liu, Lukasz Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, Wei Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, Oriol Vinyals, G. S. Corrado, M. Hughes, Jeffrey Dean - 2016
15 papers in library cite
Razvan Pascanu, Tomas Mikolov, Yoshua Bengio - 2013
21 papers in library cite
Ilya Sutskever, James Martens, G. Dahl, Geoffrey Hinton - 2013
13 papers in library cite
A. H. Waibel, T. Hanazawa, Geoffrey Hinton, K. Shikano, K. Lang - 1989
13 papers in library cite
Alexander M. Rush, S. Chopra, Jason Weston - 2015
13 papers in library cite
Noam Shazeer, Azalia Mirhoseini, K. Maziarz, A. Davis, Quoc Le, Geoffrey Hinton, Jeffrey Dean - 2017
9 papers in library cite
S. Sukhbaatar, A. Szlam, Jason Weston, Rob Fergus - 2015
18 papers in library cite
R. Nallapati, B. Zhou, C. N. D. Santos, C. G. Gulcehre, Bing Xiang - 2016
10 papers in library cite
Clement Farabet - 2011
5 papers in library cite
M. Schuster, Kaisuke Nakajima - 2012
3 papers in library cite
Yann N. Dauphin, A. Fan, Michael Auli, D. Grangier - 2016
8 papers in library cite
T. Salimans, Diederik P. Kingma - 2016
4 papers in library cite
D. Ha, Andrew Dai, Quoc V. Le - 2016
3 papers in library cite
A. Miller, Adam Fisch, J. Dodge, A. Karimi, Antoine Bordes, Jason Weston - 2016
1 paper in library cites
N. Kalchbrenner, L. Espeholt, K. Simonyan, A. V. D. Oord, Alex Graves, Koray Kavukcuoglu - 2016
5 papers in library cite
J. Bradbury, S. Merity, Caiming Xiong, Richard Socher - 2016
1 paper in library cites
A. V. D. Oord, N. Kalchbrenner, Koray Kavukcuoglu - 2016
3 papers in library cite
J. Suzuki, M. Nagata - 2017
2 papers in library cite
Jie Zhou, Ying Cao, Xuguang Wang, P. Li, Wei Xu - 2016
5 papers in library cite
R. Sennrich, B. Haddow, Alexandra Birch - 2016
5 papers in library cite
R. Parker, D. Graff, J. Kong, K. Chen, K. Maeda - 2011
5 papers in library cite
C. Dyer, V. Chahuneau, Noah A. Smith - 2013
4 papers in library cite
J. Chorowski, D. Bahdanau, D. Serdyuk, Kyunghyun Cho, Yoshua Bengio - 2015
3 papers in library cite
Yann Lecun, Yoshua Bengio - 1995
3 papers in library cite
Fandong Meng, Z. Lu, Mingxuan Wang, H. Li, W. Jiang, Qun Liu - 2015
3 papers in library cite
S. Jean, O. Firat, Kyunghyun Cho, R. Memisevic, Yoshua Bengio - 2015
3 papers in library cite
J. Gehring, Michael Auli, D. Grangier, Yann N. Dauphin - 2016
2 papers in library cite
P. Over, H. Dang, D. Harman - 2007
2 papers in library cite
A. Oord, N. Kalchbrenner, Oriol Vinyals, L. Espeholt, Alex Graves, Koray Kavukcuoglu - 2016
1 paper in library cites
O. Bojar, R. Chatterjee, C. Federmann, Y. Graham, B. Haddow, M. Huck, A. J. Yepes, P. Koehn, V. Logacheva, C. Monz, M. Negri, A. Neveol, M. Neves, M. Popel, M. Post, R. Rubino, C. Scarton, L. Specia, M. Turchi, K. Verspoor, M. Zampieri - 2016
1 paper in library cites
S. Shen, Y. Zhao, Zhiyuan Liu, Maosong Sun - 2016
1 paper in library cites
Zichao Yang, Z. Hu, Y. Deng, C. Dyer, A. Smola - 2016
1 paper in library cites
H. Mi, Zhiguo Wang, A. Ittycheriah - 2016
1 paper in library cites
G. L'hostis, D. Grangier, Michael Auli - 2016
1 paper in library cites
Cited by 3 papers in your library
Cites 32 papers in your library
Read on August 23, 2025