GPT-2 Chinese chitchat bot, fine-tuned for dialogue
Top 16.2% on sourcepulse
This repository provides a GPT-2 based model for Chinese chit-chat, inspired by Microsoft's DialoGPT. It offers pre-trained models and code for training, preprocessing, and interacting with a Chinese conversational AI, targeting developers and researchers interested in Chinese dialogue systems.
How It Works
The model is built upon HuggingFace's transformers
library, utilizing a GPT-2 architecture. For training, multi-turn dialogue data is concatenated with special separator tokens ([CLS]
, [SEP]
) and fed into the model for autoregressive training. Generation employs Temperature, Top-k, and Nucleus Sampling techniques to control output diversity and quality. The implementation simplifies DialoGPT's MMI (Man-Machine Interaction) generation method for faster inference.
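As an illustration, the multi-turn concatenation described above can be sketched as follows. This is a minimal example, not the repository's own preprocessing code: it assumes a BERT-style Chinese tokenizer whose vocabulary contains [CLS] and [SEP] (the stand-in bert-base-chinese is used here; the repo ships its own vocabulary), and that turns are separated by [SEP] as described.

```python
from transformers import BertTokenizerFast

# Stand-in tokenizer; the repository provides its own Chinese vocabulary file.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")

def build_training_input(dialogue_turns):
    """Concatenate a multi-turn dialogue into one token sequence:
    [CLS] turn_1 [SEP] turn_2 [SEP] ... turn_n [SEP]
    The same sequence serves as both input and target for autoregressive training."""
    input_ids = [tokenizer.cls_token_id]
    for turn in dialogue_turns:
        input_ids.extend(tokenizer.encode(turn, add_special_tokens=False))
        input_ids.append(tokenizer.sep_token_id)
    return input_ids

# Example multi-turn dialogue (speakers alternate turn by turn).
ids = build_training_input(["你好", "你好，很高兴认识你", "我也是"])
print(tokenizer.decode(ids))
```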
Quick Start & Requirements
Install dependencies with pip install -r requirements.txt (the requirements are not explicitly listed, but transformers==4.2.0 and pytorch==1.7.0 are mentioned). Start an interactive chat with python interact.py --model_path <path_to_model>. CPU inference is supported via --no_cuda.
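For inference outside interact.py, a checkpoint that is compatible with transformers can be loaded and decoded with the sampling strategies mentioned above. The snippet below is a hedged sketch, not the repository's interact.py: the checkpoint path, tokenizer, and token layout are assumptions, and generate()'s temperature / top_k / top_p arguments stand in for the project's own sampling loop.

```python
import torch
from transformers import BertTokenizerFast, GPT2LMHeadModel

# Assumed paths: a downloaded checkpoint directory containing the Chinese vocab; adjust to the repo's files.
model_path = "model_epoch40_50w"
tokenizer = BertTokenizerFast.from_pretrained(model_path)
model = GPT2LMHeadModel.from_pretrained(model_path)
model.eval()  # runs on CPU; no .cuda() call needed

history = ["今天天气怎么样"]  # single-turn history for illustration

# Build the [CLS] ... [SEP] context described under "How It Works".
input_ids = [tokenizer.cls_token_id]
for turn in history:
    input_ids += tokenizer.encode(turn, add_special_tokens=False) + [tokenizer.sep_token_id]
input_ids = torch.tensor([input_ids])

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=input_ids.shape[1] + 25,   # cap the reply length
        do_sample=True,                       # enable sampling-based decoding
        temperature=0.7,                      # soften or sharpen the distribution
        top_k=8,                              # keep only the 8 most likely tokens
        top_p=0.9,                            # nucleus sampling threshold
        pad_token_id=tokenizer.sep_token_id,  # avoid a warning when no pad token is set
    )

reply_ids = output[0][input_ids.shape[1]:]
print(tokenizer.decode(reply_ids, skip_special_tokens=True))
```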
Highlighted Details
A pre-trained dialogue checkpoint (model_epoch40_50w) is available for download.
Maintenance & Community
The project author has released several related Chinese NLP models, including Firefly (a conversational LLM), LLMPruner, OFA-Chinese, CLIP-Chinese, ClipCap-Chinese, and CPM Chinese text generation. No specific community links (Discord, Slack) are provided.
Licensing & Compatibility
The repository does not explicitly state a license. It references GPT2-Chinese and DialoGPT, which have their own licenses. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project mentions a TODO for fixing imbalanced load in multi-GPU parallel training ("多卡并行训练负载不均衡的问题"). The specified dependencies (transformers==4.2.0, pytorch==1.7.0) are relatively old, which might lead to compatibility issues with newer libraries or hardware.
Last updated about 1 year ago; the project appears inactive.