2014

A SICK Cure for the Evaluation of Compositional Distributional Semantic Models

Marco Marelli, S. Menini, M. Baroni, L. Bentivogli, R. Bernardi, R. Zamparelli

AI summary

This paper introduces SICK, a large English benchmark dataset of ~10,000 sentence pairs, annotated for semantic relatedness and textual entailment using crowdsourcing, to evaluate compositional distributional semantic models.

Main Contributions

  • Introduction of SICK (Sentences Involving Compositional Knowledge), a new large-scale English benchmark dataset for evaluating compositional distributional semantic models (CDSMs).
  • SICK contains approximately 10,000 sentence pairs rich in lexical, syntactic, and semantic phenomena relevant to CDSMs, while avoiding issues like multiword expressions or named entities.
  • Crowdsourcing techniques were used to annotate each sentence pair for semantic relatedness (5-point scale) and textual entailment (entailment, contradiction, neutral).
  • The SICK dataset was utilized in SemEval-2014 Task 1, demonstrating its practical application in evaluation campaigns.
  • The dataset is freely available for research purposes, promoting further development and assessment of CDSMs.

Abstract

Shared and internationally recognized benchmarks are fundamental for the development of any computational system. We aim to help the research community working on compositional distributional semantic models (CDSMs) by providing SICK (Sentences Involving Compositional Knowledge), a large-size English benchmark tailored for them. SICK consists of about 10,000 English sentence pairs that include many examples of the lexical, syntactic and semantic phenomena that CDSMs are expected to account for, but do not require dealing with other aspects of existing sentential data sets (idiomatic multiword expressions, named entities, telegraphic language) that are not within the scope of CDSMs. By means of crowdsourcing techniques, each pair was annotated for two crucial semantic tasks: relatedness in meaning (with a 5-point rating scale as gold score) and entailment relation between the two elements (with three possible gold labels: entailment, contradiction, and neutral). The SICK data set was used in SemEval-2014 Task 1, and it is freely available for research purposes.
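
To make the two annotation layers concrete, here is a minimal Python sketch of how one annotated pair might be represented and how a system's output could be scored against the gold annotations. Everything in it is an assumption for illustration: the `SentencePair` class and its field names are hypothetical and not the dataset's actual file format, the 1-to-5 range is one reading of "5-point rating scale", and the toy sentences and scores are invented rather than taken from SICK. The abstract does not name evaluation measures; Pearson correlation for relatedness and plain accuracy for entailment are used here as natural choices.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence

# The three gold entailment labels named in the abstract (spelling is an assumption).
LABELS = {"ENTAILMENT", "CONTRADICTION", "NEUTRAL"}


@dataclass(frozen=True)
class SentencePair:
    """Hypothetical container for one annotated pair (not the official format)."""
    sentence_a: str
    sentence_b: str
    relatedness: float  # gold relatedness score, assumed to lie in [1, 5]
    entailment: str     # one of LABELS

    def __post_init__(self) -> None:
        if not 1.0 <= self.relatedness <= 5.0:
            raise ValueError(f"relatedness out of range: {self.relatedness}")
        if self.entailment not in LABELS:
            raise ValueError(f"unknown entailment label: {self.entailment}")


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of pairs whose predicted label matches the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two equally long, non-empty label lists")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Invented toy pairs, for illustration only (not drawn from the dataset).
pairs = [
    SentencePair("A man is playing a guitar",
                 "A man is playing an instrument", 4.5, "ENTAILMENT"),
    SentencePair("A dog is running", "No dog is running", 3.5, "CONTRADICTION"),
    SentencePair("A woman is cooking", "A child is riding a bike", 1.2, "NEUTRAL"),
]

# Output of some hypothetical model on the same three pairs.
predicted_scores = [4.0, 3.0, 1.5]
predicted_labels = ["ENTAILMENT", "NEUTRAL", "NEUTRAL"]

relatedness_r = pearson([p.relatedness for p in pairs], predicted_scores)
entailment_acc = accuracy([p.entailment for p in pairs], predicted_labels)
```

Keeping both annotations on the same record reflects the abstract's point that every pair carries a relatedness score and an entailment label, so a single model can be scored on both tasks over the same data.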
