2008

A Unified Architecture for Natural Language Processing: Deep Neural Networks With Multitask Learning

Ronan Collobert, Jason Weston

Cite Score: 80

AI summary

This paper introduces a deep convolutional neural network architecture that handles several NLP tasks at once and is trained jointly on them with weight-sharing. It combines multitask learning on labeled data with semi-supervised learning on unlabeled text, achieving state-of-the-art results in semantic role labeling on the PropBank dataset without using POS tags or parse tree features.

Main Contributions

  • Introduces a unified deep convolutional neural network architecture for various NLP tasks.
  • Demonstrates the effectiveness of multitask learning by jointly training the network on multiple tasks, including POS tagging, chunking, named entity recognition, semantic role labeling, and language modeling.
  • Proposes a novel semi-supervised learning approach by incorporating a language model trained on unlabeled data from Wikipedia.
  • Achieves state-of-the-art performance in semantic role labeling on the PropBank dataset without using explicit syntactic features, outperforming existing methods that rely on POS tags or parse trees.
  • Shows that the word features (embeddings) learnt by the lookup-table layer of the language model cluster semantically similar words.
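
The last point, that learnt embeddings place semantically similar words close together, can be illustrated with a nearest-neighbour query over a lookup table. This is a minimal sketch under stated assumptions: the three-dimensional vectors in `toy_table` are hand-made for illustration and are not values from the paper, `nearest_words` is a hypothetical helper, and cosine similarity is assumed as the closeness measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest_words(query, table, k=2):
    """Return the k words whose vectors are most similar to the query word's."""
    q = table[query]
    scored = [(cosine(q, vec), word) for word, vec in table.items() if word != query]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]

# Hypothetical embeddings (assumption): country names share one direction,
# the other two words point elsewhere.
toy_table = {
    "france": [0.90, 0.10, 0.00],
    "spain":  [0.80, 0.20, 0.10],
    "italy":  [0.85, 0.15, 0.05],
    "monday": [0.00, 0.90, 0.30],
    "xbox":   [0.10, 0.00, 0.95],
}

neighbors = nearest_words("france", toy_table, k=2)
print(neighbors)  # → ['italy', 'spain']
```

With a trained lookup table in place of `toy_table`, the same query is one way to inspect which words the model treats as similar.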

Abstract

We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense (grammatically and semantically) using a language model. The entire network is trained jointly on all these tasks using weight-sharing, an instance of multitask learning. All the tasks use labeled data except the language model which is learnt from unlabeled text and represents a novel form of semi-supervised learning for the shared tasks. We show how both multitask learning and semi-supervised learning improve the generalization of the shared tasks, resulting in state-of-the-art performance.
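
The weight-sharing idea in the abstract can be sketched as one shared encoder (lookup table, window convolution, max over time) feeding several task-specific output layers. This is a simplified illustration rather than the authors' implementation: the class names `SharedEncoder` and `TaskHead`, all layer sizes, and the random initialisation are assumptions, and each head here emits one label distribution per sentence, leaving out the per-word tagging machinery and the training loop.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(z):
    peak = max(z)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

class SharedEncoder:
    """Lookup table + window convolution + max over time, shared by all tasks."""
    def __init__(self, vocab_size, embed_dim, window, hidden, rng):
        self.window = window  # assumed odd so padding is symmetric
        self.embed_dim = embed_dim
        self.lookup = make_matrix(vocab_size, embed_dim, rng)  # one row per word
        self.conv = make_matrix(hidden, window * embed_dim, rng)

    def encode(self, word_ids):
        pad = self.window // 2
        zero = [0.0] * self.embed_dim
        vectors = [zero] * pad + [self.lookup[i] for i in word_ids] + [zero] * pad
        feature_maps = []
        for t in range(len(word_ids)):
            flat = [x for vec in vectors[t:t + self.window] for x in vec]
            feature_maps.append(matvec(self.conv, flat))
        # Max over time yields a fixed-size vector for any sentence length.
        return [max(column) for column in zip(*feature_maps)]

class TaskHead:
    """Task-specific output layer on top of the shared encoder."""
    def __init__(self, encoder, n_labels, rng):
        self.encoder = encoder  # a reference, not a copy: weights are shared
        self.out = make_matrix(n_labels, len(encoder.conv), rng)

    def predict(self, word_ids):
        return softmax(matvec(self.out, self.encoder.encode(word_ids)))

rng = random.Random(0)
encoder = SharedEncoder(vocab_size=50, embed_dim=4, window=3, hidden=6, rng=rng)
pos_head = TaskHead(encoder, n_labels=5, rng=rng)  # e.g. a POS-like task
ner_head = TaskHead(encoder, n_labels=3, rng=rng)  # e.g. an NER-like task

sentence = [7, 12, 3, 41]  # hypothetical word indices
pos_probs = pos_head.predict(sentence)
ner_probs = ner_head.predict(sentence)
```

Because both heads hold a reference to the same `encoder`, any gradient update made while training one task would also change the features seen by the other, which is the sense in which the tasks share weights.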
