transfer-learning-conv-ai by huggingface

Conversational AI code for transfer learning research

created 6 years ago
1,754 stars

Top 25.0% on sourcepulse

Project Summary

This repository provides code for building state-of-the-art conversational AI agents using transfer learning from OpenAI's GPT and GPT-2 models. It's designed for researchers and developers aiming to reproduce results from the NeurIPS 2018 ConvAI2 competition or fine-tune their own dialogue systems. The project offers clean, commented code for training and inference, with options for distributed training and FP16 precision.

How It Works

The core approach is transfer learning from pre-trained Transformer language models (GPT, GPT-2). The model receives the personality context and the dialogue history as input and generates the next response. The training script supports multi-task learning, combining a language modeling objective with a multiple-choice objective, to improve conversational quality. For decoding, the README highlights nucleus sampling as giving a more compelling, human-like interaction than beam search.
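To make the multi-task idea concrete, here is a minimal sketch of how a language modeling loss and a multiple-choice loss could be combined into one training objective. It is plain Python for illustration only: the function names, the weighted-sum form, and the `lm_weight` / `mc_weight` parameters are assumptions made for this example, not code taken from the repository.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-likelihood of target_index under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target_index]


def multitask_loss(token_logits, token_targets, choice_logits, correct_choice,
                   lm_weight=1.0, mc_weight=1.0):
    """Weighted sum of a language modeling loss and a multiple-choice loss.

    token_logits:   one list of vocabulary logits per position of the reply
    token_targets:  the gold token index at each of those positions
    choice_logits:  one score per candidate reply (gold reply plus distractors)
    correct_choice: index of the gold reply among the candidates
    lm_weight, mc_weight: hypothetical weighting coefficients (assumption)
    """
    # Language modeling: average next-token cross-entropy over the reply.
    lm_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(token_logits, token_targets)
    ) / len(token_targets)
    # Multiple choice: classify which candidate reply is the real one.
    mc_loss = cross_entropy(choice_logits, correct_choice)
    return lm_weight * lm_loss + mc_weight * mc_loss
```

With one reply token over a two-word vocabulary and three candidate replies, all with equal scores, `multitask_loss([[0.0, 0.0]], [0], [0.0, 0.0, 0.0], 1)` evaluates to log 2 + log 3, the sum of the two uniform-guess losses.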

Quick Start & Requirements

  • Install: Clone the repo, cd into it, and run pip install -r requirements.txt. Then download the spaCy English model with python -m spacy download en.
  • Docker: Build with docker build -t convai . (ensure sufficient memory allocation).
  • Pretrained Model: Run python interact.py to automatically download and use a fine-tuned model.
  • Dependencies: Python, PyTorch, spaCy, Apex (for FP16). GPU with CUDA is recommended for training.
  • Resources: Training on 8 V100 GPUs takes about an hour.

Highlighted Details

  • Reproduces state-of-the-art results from the ConvAI2 competition.
  • Offers distributed training and FP16 support for faster training.
  • Includes scripts for training, inference, and ConvAI2 evaluation.
  • Fine-tuned model available for immediate use.

Maintenance & Community

This project is associated with Hugging Face. Specific community channels or active maintenance status are not detailed in the README.

Licensing & Compatibility

The repository is licensed under the MIT License. This license permits commercial use and integration with closed-source projects.

Limitations & Caveats

The README notes that results may be slightly lower than original competition results without additional tweaks like custom position embeddings. It also mentions that beam search, while improving F1, offers a less compelling human experience than the provided nucleus sampling.
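For readers unfamiliar with nucleus (top-p) sampling, the following is a minimal, library-free sketch of the technique: keep the smallest set of most probable tokens whose cumulative probability reaches `top_p`, renormalize, and sample from that set. The function names and the default `top_p=0.9` are assumptions for illustration; the repository's own decoding code may differ in its details.

```python
import math
import random


def nucleus_filter(logits, top_p=0.9):
    """Return {token_index: probability} for the top-p nucleus, renormalized."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Visit tokens from most to least probable until top_p mass is covered.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_nucleus(logits, top_p=0.9, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = nucleus_filter(logits, top_p)
    indices = list(nucleus)
    return rng.choices(indices, weights=[nucleus[i] for i in indices], k=1)[0]
```

Unlike beam search, which deterministically pursues the highest-scoring continuations, this procedure samples at every step while excluding the low-probability tail, which is one plausible reason the resulting replies feel less repetitive.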

Health Check

  • Last commit: 2 years ago
  • Responsiveness: 1 day
  • Pull requests (30d): 0
  • Issues (30d): 0
  • Star history: 7 stars in the last 90 days
