multiwoz  by budzianowski

Dataset for task-oriented dialogue modeling

created 6 years ago
909 stars

Top 40.9% on sourcepulse

Project Summary

This repository provides the MultiWOZ dataset, a large-scale, human-human conversation corpus for task-oriented dialogue systems, along with baseline implementations and evaluation scripts. It is designed for researchers and developers working on dialogue state tracking and response generation.

How It Works

The project centers on the MultiWOZ dataset, which comprises over 10,000 dialogues spanning multiple domains, each annotated with goals, utterances, and belief states. It supports both end-to-end dialogue modeling and dialogue state tracking, and is available in several versions (2.0, 2.1, 2.2), with later versions incorporating corrections and improvements. The code includes preprocessing scripts and baseline models for training and evaluation.
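
To make the annotation layers concrete, here is a minimal sketch of how one dialogue with a goal, utterances, and per-turn belief states might be represented in memory. The class names, field names, and example values are illustrative assumptions, not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Turn:
    speaker: str    # "user" or "system"
    utterance: str  # raw text of the turn
    # Belief state after this turn: domain -> slot -> value (hypothetical layout).
    belief_state: Dict[str, Dict[str, str]] = field(default_factory=dict)


@dataclass
class Dialogue:
    dialogue_id: str
    # User goal: domain -> slot -> requested value (hypothetical layout).
    goal: Dict[str, Dict[str, str]]
    turns: List[Turn] = field(default_factory=list)

    def domains(self) -> List[str]:
        """Return the domains named in the goal, sorted alphabetically."""
        return sorted(self.goal)


# Made-up example for illustration only; not taken from the dataset.
example = Dialogue(
    dialogue_id="example-0001",
    goal={"restaurant": {"food": "italian"}, "hotel": {"area": "north"}},
    turns=[
        Turn("user", "I need an Italian restaurant.",
             {"restaurant": {"food": "italian"}}),
        Turn("system", "Sure, which area do you prefer?",
             {"restaurant": {"food": "italian"}}),
    ],
)
```

Carrying the belief state on every turn mirrors how dialogue state tracking is usually evaluated: the tracker is scored turn by turn rather than once per dialogue.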

Quick Start & Requirements

  • Install: Requires Python 2 with pip.
  • Dependencies: PyTorch 0.4.1.
  • Preprocessing: Run python create_delex_data.py.
  • Training: Run python train.py [--args=value].
  • Testing: Run python test.py [--args=value].
  • Dataset Access: Can be loaded through DialogStudio.
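
The preprocessing script's name suggests delexicalization, which in general means replacing concrete slot values in an utterance with placeholder tokens so a model learns response templates rather than specific entities. The sketch below shows only that general idea; the function name, placeholder format, and slot names are assumptions and do not reflect what create_delex_data.py actually does.

```python
import re


def delexicalize(utterance, slot_values):
    """Replace each known slot value in `utterance` with a [slot] placeholder.

    `slot_values` maps a hypothetical slot name to its surface string.
    """
    # Substitute longer values first so a short value cannot clobber
    # part of a longer one.
    ordered = sorted(slot_values.items(), key=lambda kv: len(kv[1]), reverse=True)
    for slot, value in ordered:
        # Match whole words only, ignoring case.
        pattern = re.compile(r"\b" + re.escape(value) + r"\b", flags=re.IGNORECASE)
        utterance = pattern.sub("[" + slot + "]", utterance)
    return utterance


# Made-up example values for illustration.
delexicalized = delexicalize(
    "Pizza Hut is in the centre.",
    {"restaurant_name": "pizza hut", "restaurant_area": "centre"},
)
```

A real pipeline would also need to normalize text and handle values such as times, phone numbers, and postcodes, which simple string matching does not cover.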

Highlighted Details

  • Comprehensive benchmarks for Dialogue State Tracking (DST) and Response Generation are provided, with results on MultiWOZ 2.0, 2.1, and 2.2.
  • Includes detailed hyperparameter settings for baseline models, enabling reproducibility.
  • Supports both end-to-end models and policy optimization models for response generation.
  • Offers bibtex citations for the dataset and related papers.
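
Dialogue state tracking results are commonly reported as joint goal accuracy: the fraction of turns whose predicted belief state matches the gold state exactly. Assuming that metric, a minimal sketch looks like the following; the function name and the flat slot-to-value state layout are assumptions, and this is not the repository's evaluation script.

```python
def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted state equals the gold state exactly.

    Each state is a dict mapping a slot name to its value (hypothetical layout).
    """
    if len(predicted_states) != len(gold_states):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold_states:
        return 0.0
    # A turn counts as correct only if every slot matches.
    correct = sum(1 for pred, gold in zip(predicted_states, gold_states) if pred == gold)
    return correct / len(gold_states)


# Made-up states for illustration: the second turn misses one slot.
gold = [
    {"restaurant-food": "italian"},
    {"restaurant-food": "italian", "restaurant-area": "centre"},
]
pred = [
    {"restaurant-food": "italian"},
    {"restaurant-food": "italian"},
]
score = joint_goal_accuracy(pred, gold)
```

Because a single wrong or missing slot fails the whole turn, small differences in value normalization between evaluation scripts can shift this score noticeably.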

Maintenance & Community

The project was initiated by Paweł Budzianowski of the Cambridge Dialogue Systems Group. Bug reports can be sent to budzianowski@gmail.com or jianguozhang@salesforce.com.

Licensing & Compatibility

Released under the MIT License, allowing for open-source use and modification.

Limitations & Caveats

The baseline code targets Python 2 and an old PyTorch release (0.4.1), so running it in a modern Python 3 environment may require significant adaptation. Some older benchmark results might not be directly comparable because of inconsistencies between evaluation scripts.

Health Check

  • Last commit: 6 months ago
  • Responsiveness: Inactive
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 18 stars in the last 90 days
