2015

Multi-Task Sequence to Sequence Learning

M. T. Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser

Cite Score: 36

AI summary

This paper introduces multi-task sequence-to-sequence learning and applies it to machine translation, constituency parsing, and image caption generation. It establishes a new state-of-the-art result in constituent parsing (93.0 F1) and shows that training on syntactic parsing and image caption data improves translation quality between English and German.

Main Contributions

  • Introduces three multi-task learning (MTL) settings for sequence-to-sequence models: one-to-many, many-to-one, and many-to-many.
  • Demonstrates that training on a small amount of parsing and image caption data improves translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks.
  • Establishes a new state-of-the-art result in constituent parsing with 93.0 F1.
  • Compares the autoencoder and skip-thought unsupervised objectives in the MTL context: the autoencoder helps less on perplexity but more on BLEU than skip-thought.
  • Explores how MTL can be useful for parsing, yielding an improvement of up to +8.9 F1 points over the baseline.

Abstract

Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. To date, most of its applications focused on only one task and not much work explored this framework for multiple tasks. This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting – where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting – useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting – where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation. Our results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks. Furthermore, we have established a new state-of-the-art result in constituent parsing with 93.0 F1. Lastly, we reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context: autoencoder helps less in terms of perplexities but more on BLEU scores compared to skip-thought.
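The three settings in the abstract differ only in which components are shared between tasks. The following is a minimal sketch of that bookkeeping, not the authors' implementation: the `Task` class, the `mtl_setting` and `shared_components` helpers, and component names such as `en_enc` or `tree_dec` are all hypothetical labels chosen for illustration, and the language directions are assumed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A seq2seq task described by the (hypothetical) names of its encoder and decoder."""
    name: str
    encoder: str
    decoder: str


def mtl_setting(tasks):
    """Classify a task set by how many distinct encoders and decoders it uses."""
    encoders = {t.encoder for t in tasks}
    decoders = {t.decoder for t in tasks}
    if len(encoders) == 1 and len(decoders) > 1:
        return "one-to-many"   # one shared encoder, several decoders
    if len(encoders) > 1 and len(decoders) == 1:
        return "many-to-one"   # several encoders, one shared decoder
    if len(encoders) > 1 and len(decoders) > 1:
        return "many-to-many"  # several encoders and several decoders
    return "single-task"


def shared_components(tasks):
    """Return the components used by more than one task, sorted by name."""
    counts = Counter()
    for t in tasks:
        counts[t.encoder] += 1
        counts[t.decoder] += 1
    return sorted(name for name, n in counts.items() if n > 1)


# (a) Translation and parsing read the same English input (assumed direction).
one_to_many = [
    Task("translation", "en_enc", "de_dec"),
    Task("parsing", "en_enc", "tree_dec"),
]

# (b) Translation and captioning both emit English text (assumed direction).
many_to_one = [
    Task("translation", "de_enc", "en_dec"),
    Task("captioning", "img_enc", "en_dec"),
]

# (c) Translation plus an unsupervised autoencoder objective per language.
many_to_many = [
    Task("translation", "en_enc", "de_dec"),
    Task("autoencoder_en", "en_enc", "en_dec"),
    Task("autoencoder_de", "de_enc", "de_dec"),
]

for tasks in (one_to_many, many_to_one, many_to_many):
    print(mtl_setting(tasks), shared_components(tasks))
```

In setting (c) the translation task shares its encoder with one autoencoder and its decoder with the other, which is one way to read "multiple encoders and decoders are shared".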
