2017

Semi-Supervised Sequence Tagging With Bidirectional Language Models

M. E. Peters, W. Ammar, C. Bhagavatula, Russell Power

AI summary

This paper introduces TagLM, a semi-supervised sequence tagging model that uses bidirectional language models (LMs). It achieves state-of-the-art results on the CoNLL 2003 NER and CoNLL 2000 Chunking datasets by incorporating pre-trained context embeddings from the LMs into its token representations.

Main Contributions

  • Proposes TagLM, a semi-supervised approach that uses pre-trained bidirectional language models to augment token representations in sequence tagging models.
  • Achieves state-of-the-art results on the CoNLL 2003 NER task, surpassing previous systems that use other forms of transfer or joint learning.
  • Establishes a new state-of-the-art result on the CoNLL 2000 Chunking task.
  • Demonstrates that using both forward and backward LM embeddings boosts performance over using only forward LM embeddings.
  • Shows that domain-specific pre-training is unnecessary by applying an LM trained in the news domain to scientific papers.
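
As a rough illustration of the first and fourth contributions, the sketch below joins a forward and a backward LM state for each token and appends the result to that token's representation. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `lm_embedding` and `augment_tokens`, the use of plain concatenation, and the toy vectors are all hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]


def lm_embedding(forward: Vector, backward: Vector) -> Vector:
    """Join the forward and backward LM states for a single token.

    Assumption: the two directions are combined by concatenation.
    """
    return forward + backward


def augment_tokens(
    token_reprs: List[Vector],
    forward_states: List[Vector],
    backward_states: List[Vector],
) -> List[Vector]:
    """Append the bidirectional LM embedding to each token representation."""
    if not (len(token_reprs) == len(forward_states) == len(backward_states)):
        raise ValueError("all inputs must cover the same number of tokens")
    return [
        tok + lm_embedding(fwd, bwd)
        for tok, fwd, bwd in zip(token_reprs, forward_states, backward_states)
    ]


# Toy two-token sentence with made-up values (hypothetical data).
token_reprs = [[0.1, 0.2], [0.3, 0.4]]   # tagger's own token representations
forward_states = [[1.0], [2.0]]          # states from a forward LM
backward_states = [[-1.0], [-2.0]]       # states from a backward LM

augmented = augment_tokens(token_reprs, forward_states, backward_states)
print(augmented)
```

In this sketch the LM states are treated as fixed inputs, which mirrors the idea of reusing a pre-trained LM; dropping `backward_states` from the join would correspond to the forward-only variant that the fourth contribution compares against.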

Abstract

Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context-sensitive representations is trained on relatively little labeled data. In this paper, we demonstrate a general semi-supervised approach for adding pre-trained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. We evaluate our model on two standard datasets for named entity recognition (NER) and chunking, and in both cases achieve state-of-the-art results, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task-specific gazetteers.
