2015

A Neural Conversational Model

Oriol Vinyals, Quoc V. Le

Cite Score: 59

AI summary

This paper introduces a neural conversational model using the sequence-to-sequence framework, demonstrating its ability to generate simple conversations, extract knowledge from domain-specific and open-domain datasets, and perform basic common sense reasoning, though it suffers from consistency issues.

Main Contributions

  • Proposes a simple neural conversational model based on the sequence-to-sequence framework.
  • Demonstrates the model's ability to generate simple conversations from large conversational datasets.
  • Shows the model can extract knowledge from both domain-specific and open-domain datasets.
  • Shows the model can perform simple forms of common-sense reasoning.
  • Highlights the lack of consistency as a common failure mode of the model.
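As a rough illustration of how a sequence-to-sequence model produces a reply, the sketch below runs a greedy decoding loop: at each step it asks for a distribution over the next token given the context and the reply so far, takes the most probable token, and stops at an end-of-sequence marker. The names `greedy_decode`, `toy_model`, and `EOS` are hypothetical, and `toy_model` is a hard-coded stand-in for a trained decoder, not the authors' network.

```python
from typing import Callable, Dict, List

EOS = "<eos>"  # hypothetical end-of-sequence marker

def greedy_decode(
    next_token_probs: Callable[[List[str], List[str]], Dict[str, float]],
    context: List[str],
    max_len: int = 20,
) -> List[str]:
    """Build a reply one token at a time, always taking the most probable token."""
    reply: List[str] = []
    for _ in range(max_len):
        probs = next_token_probs(context, reply)
        token = max(probs, key=probs.get)
        if token == EOS:
            break
        reply.append(token)
    return reply

def toy_model(context: List[str], reply_so_far: List[str]) -> Dict[str, float]:
    """Assumed stand-in for a trained decoder: it follows a fixed script."""
    script = ["i", "am", "fine", EOS]
    target = script[min(len(reply_so_far), len(script) - 1)]
    # Put most of the probability mass on the scripted token.
    return {tok: (0.9 if tok == target else 0.1 / 3) for tok in script}

reply = greedy_decode(toy_model, context=["how", "are", "you", "?"])
print(" ".join(reply))
```

A real model would replace `toy_model` with a network that conditions on an encoded context; beam search is a common alternative to the greedy choice used here.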

Abstract

Conversational modeling is an important task in natural language understanding and machine intelligence. Although previous approaches exist, they are often restricted to specific domains (e.g., booking an airline ticket) and require hand-crafted rules. In this paper, we present a simple approach for this task which uses the recently proposed sequence-to-sequence framework. Our model converses by predicting the next sentence given the previous sentence or sentences in a conversation. The strength of our model is that it can be trained end-to-end and thus requires far fewer hand-crafted rules. We find that this straightforward model can generate simple conversations given a large conversational training dataset. Our preliminary results suggest that, despite optimizing the wrong objective function, the model is able to converse well. It is able to extract knowledge from both a domain-specific dataset and a large, noisy, general-domain dataset of movie subtitles. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common-sense reasoning. As expected, we also find that the lack of consistency is a common failure mode of our model.
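The abstract's framing, predicting the next sentence from the previous sentence or sentences, can be sketched as a data-preparation step that turns a conversation into (context, reply) training pairs. This is a minimal sketch under stated assumptions: the function name `make_training_pairs`, the `max_context_turns` parameter, and the `" <eos> "` separator are hypothetical choices for illustration, not details taken from the paper.

```python
from typing import List, Tuple

def make_training_pairs(
    turns: List[str],
    max_context_turns: int = 1,
    sep: str = " <eos> ",  # assumed separator between context turns
) -> List[Tuple[str, str]]:
    """Turn an ordered list of utterances into (context, reply) pairs.

    Each utterance after the first becomes a reply; its context is the
    concatenation of up to `max_context_turns` preceding utterances.
    """
    if max_context_turns < 1:
        raise ValueError("max_context_turns must be at least 1")
    pairs: List[Tuple[str, str]] = []
    for i in range(1, len(turns)):
        start = max(0, i - max_context_turns)
        context = sep.join(turns[start:i])
        pairs.append((context, turns[i]))
    return pairs

conversation = ["hi", "hello , how can i help ?", "my vpn is down"]
pairs = make_training_pairs(conversation, max_context_turns=2)
for context, reply in pairs:
    print(repr(context), "->", repr(reply))
```

With `max_context_turns=1` each reply is paired only with the immediately preceding sentence; larger values give the model more of the conversation so far.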
