2007

Biographies, Bollywood, Boom-Boxes and Blenders: Domain Adaptation for Sentiment Classification

John Blitzer, Mark Dredze, Fernando Pereira

AI summary

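This paper extends the structural correspondence learning (SCL) algorithm to sentiment classification, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline. It also identifies the A-distance as a measure of domain similarity that can guide the choice of source domains to annotate, so that the trained classifiers transfer well to other domains. Experiments use Amazon product reviews of four product types (books, DVDs, electronics, kitchen appliances).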

Main Contributions

  • Extended the Structural Correspondence Learning (SCL) algorithm for sentiment classification.
  • Proposed a new pivot selection method for SCL based on mutual information with source labels (SCL-MI), improving adaptation performance.
  • Introduced a method to correct feature misalignments using a small amount of labeled target-domain data, achieving a 46% average relative reduction in error over a supervised baseline.
  • Identified and evaluated the A-distance as a measure of domain similarity that correlates with adaptation loss, which can be used to select optimal source domains for annotation.
  • Constructed and utilized a new dataset of Amazon product reviews across four different product types (books, DVDs, electronics, and kitchen appliances) for sentiment domain adaptation.
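
The pivot selection idea behind SCL-MI can be illustrated with a minimal sketch. It assumes documents are represented as sets of tokens and labels are binary, and it ranks candidate features that occur in both domains by their mutual information with the source labels. The function names, the `min_count` threshold, and the toy reviews are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(docs, labels, feature):
    """Mutual information (in nats) between feature presence and the label."""
    n = len(docs)
    joint = Counter((feature in doc, label) for doc, label in zip(docs, labels))
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (p, _), c in joint.items() if p == present) / n
        p_y = sum(c for (_, l), c in joint.items() if l == label) / n
        mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi


def select_pivots(source_docs, source_labels, target_docs, k, min_count=1):
    """Pick k features seen in both domains, ranked by MI with source labels."""
    source_df = Counter(w for doc in source_docs for w in set(doc))
    target_df = Counter(w for doc in target_docs for w in set(doc))
    candidates = [
        w for w in source_df
        if source_df[w] >= min_count and target_df[w] >= min_count
    ]
    # Sort by descending MI; break ties alphabetically for determinism.
    ranked = sorted(
        candidates,
        key=lambda w: (-mutual_information(source_docs, source_labels, w), w),
    )
    return ranked[:k]


# Toy data (invented): labeled book reviews, unlabeled kitchen reviews.
source_docs = [
    {"excellent", "plot"},
    {"excellent", "read"},
    {"poor", "plot"},
    {"poor", "read"},
]
source_labels = [1, 1, 0, 0]
target_docs = [{"excellent", "sturdy"}, {"poor", "leaky"}]

pivots = select_pivots(source_docs, source_labels, target_docs, k=2)
print(pivots)
```

In this toy setup, "plot" and "read" are excluded because they never appear in the target domain, while "excellent" and "poor" qualify because they occur in both domains and are informative about the source labels.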

Abstract

Automatic sentiment classification has been extensively studied and applied in recent years. However, sentiment is expressed differently in different domains, and annotating corpora for every possible domain of interest is impractical. We investigate domain adaptation for sentiment classifiers, focusing on online reviews for different types of products. First, we extend to sentiment classification the recently-proposed structural correspondence learning (SCL) algorithm, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline. Second, we identify a measure of domain similarity that correlates well with the potential for adaptation of a classifier from one domain to another. This measure could for instance be used to select a small set of domains to annotate whose trained classifiers would transfer well to many other domains.
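
To make the domain-similarity idea concrete, here is a small sketch of how a similarity score could drive source-domain selection. It assumes a proxy commonly used in domain adaptation work: train a classifier to tell two domains apart, measure its held-out error ε, and compute 2(1 − 2ε), so that domains that are hard to distinguish score near 0. The paper's exact computation may differ, the function names are hypothetical, and the error values below are invented for illustration rather than taken from the paper.

```python
def domain_error(predicted, actual):
    """Fraction of examples whose domain label was predicted incorrectly."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal length")
    return sum(p != a for p, a in zip(predicted, actual)) / len(actual)


def proxy_a_distance(error):
    """Proxy distance 2 * (1 - 2 * error); an assumed, commonly used form."""
    return 2.0 * (1.0 - 2.0 * error)


# Invented held-out errors of a domain discriminator (target domain: books).
# A higher error means the two domains are harder to tell apart.
discriminator_error = {"dvd": 0.30, "electronics": 0.10}

distances = {d: proxy_a_distance(e) for d, e in discriminator_error.items()}
closest = min(distances, key=distances.get)
print(closest, distances)
```

Under these made-up numbers the candidate with the smaller distance would be preferred as the source domain to annotate, which is the kind of selection the abstract suggests the measure could support.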
