Conversational AI code for transfer learning research
This repository provides code for building state-of-the-art conversational AI agents using transfer learning from OpenAI's GPT and GPT-2 models. It's designed for researchers and developers aiming to reproduce results from the NeurIPS 2018 ConvAI2 competition or fine-tune their own dialogue systems. The project offers clean, commented code for training and inference, with options for distributed training and FP16 precision.
How It Works
The core approach is transfer learning from pre-trained Transformer language models (GPT, GPT-2). Dialogue history and personality context are fed into the model, which then generates a response. The training script includes options for multi-task learning, including language modeling and multiple-choice objectives, to improve conversational quality. For decoding, nucleus sampling is highlighted as giving more compelling, human-like interaction than beam search.
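The multi-task setup described above can be sketched as a weighted sum of two losses. This is a minimal illustration in plain Python, not the repository's training code. It assumes the multiple-choice objective scores a set of candidate replies and is trained with cross-entropy on the index of the correct one. The function names and the `lm_coef` and `mc_coef` weights are hypothetical names chosen for this example.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(lm_logits, lm_targets, mc_logits, mc_target,
                   lm_coef=1.0, mc_coef=1.0):
    """Weighted sum of a language modeling loss and a multiple-choice loss.

    lm_logits: one list of vocabulary logits per token position.
    lm_targets: the correct next-token index at each position.
    mc_logits: one score per candidate reply (assumed setup).
    mc_target: index of the correct candidate.
    lm_coef, mc_coef: hypothetical weights for the two objectives.
    """
    lm_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(lm_logits, lm_targets)
    ) / len(lm_targets)
    mc_loss = cross_entropy(mc_logits, mc_target)
    return lm_coef * lm_loss + mc_coef * mc_loss
```

With uniform logits, each loss reduces to the log of the number of options, which makes the sketch easy to sanity-check by hand.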
Quick Start & Requirements
git clone the repo, cd into it, and run pip install -r requirements.txt. Also requires python -m spacy download en.
Alternatively, build a Docker image with docker build -t convai . (ensure sufficient memory allocation).
Run python interact.py to automatically download and use a fine-tuned model.
Highlighted Details
Maintenance & Community
This project is associated with Hugging Face. Specific community channels or active maintenance status are not detailed in the README.
Licensing & Compatibility
The repository is licensed under the MIT License. This license permits commercial use and integration with closed-source projects.
Limitations & Caveats
The README notes that results may be slightly lower than the original competition results without additional tweaks such as custom position embeddings. It also mentions that beam search, while improving F1, gives human users a less compelling experience than the nucleus sampling decoder the repository provides.
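To make the decoding trade-off concrete, here is a minimal sketch of nucleus (top-p) sampling over a single next-token distribution. It is a plain Python illustration of the general technique, not the repository's decoder. The function names, the `top_p` default, and the toy distribution are assumptions made for this example.

```python
import random


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most-probable tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_nucleus(probs, top_p=0.9, rng=random):
    """Draw one token index from the renormalized nucleus."""
    filtered = top_p_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token distribution over a four-token vocabulary (illustrative only).
next_token_probs = [0.5, 0.3, 0.15, 0.05]
token = sample_nucleus(next_token_probs, top_p=0.9, rng=random.Random(0))
```

Unlike beam search, which deterministically follows the highest-scoring continuations, this procedure samples at random but discards the low-probability tail, so here the least likely token can never be drawn.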