yangjianxin1: GPT2 for Chinese chitchat bot, fine-tuned for dialogue
Top 15.8% on SourcePulse
This repository provides a GPT-2 based model for Chinese chit-chat, inspired by Microsoft's DialoGPT. It offers pre-trained models and code for training, preprocessing, and interacting with a Chinese conversational AI, targeting developers and researchers interested in Chinese dialogue systems.
How It Works
The model is built on HuggingFace's transformers library and uses a GPT-2 architecture. For training, the utterances of a multi-turn dialogue are concatenated into a single sequence with the special tokens [CLS] and [SEP], and the model is trained autoregressively on that sequence. Generation uses temperature, top-k, and nucleus (top-p) sampling to control output diversity and quality. The implementation simplifies DialoGPT's MMI (Maximum Mutual Information) generation method for faster inference.
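The data format and the sampling filters described above can be sketched in plain Python. This is an illustrative sketch, not the repository's code: the function names `build_input`, `filter_probs`, and `sample` are hypothetical, the exact token placement (one leading [CLS], one [SEP] after every utterance) is an assumption, and the real project operates on token IDs and tensors rather than strings and dicts.

```python
import math
import random


def build_input(turns, cls="[CLS]", sep="[SEP]"):
    """Join the utterances of one dialogue into a single training sequence.

    Assumed layout: [CLS]utt1[SEP]utt2[SEP]...
    """
    return cls + "".join(turn + sep for turn in turns)


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature, then top-k, then nucleus filtering.

    `logits` maps each candidate token to its raw score.
    Returns a dict mapping the surviving tokens to renormalized probabilities.
    """
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda item: item[1], reverse=True)

    # Top-k: keep the k most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]
        mass = sum(p for _, p in ranked)
        ranked = [(tok, p / mass) for tok, p in ranked]

    # Nucleus: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


def sample(probs, rng=random):
    """Draw one token from the filtered distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
```

Lower temperatures and smaller `top_k` or `top_p` values concentrate probability on the most likely tokens, which makes replies safer but more repetitive; larger values trade coherence for variety.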
Quick Start & Requirements
pip install -r requirements.txt (requirements are not explicitly listed, but transformers==4.2.0 and pytorch==1.7.0 are mentioned).
python interact.py --model_path <path_to_model>. CPU inference is supported via --no_cuda.
Highlighted Details
model_epoch40_50w).
Maintenance & Community
The project author has released several related Chinese NLP models, including Firefly (a conversational LLM), LLMPruner, OFA-Chinese, CLIP-Chinese, ClipCap-Chinese, and CPM Chinese text generation. No specific community links (Discord, Slack) are provided.
Licensing & Compatibility
The repository does not explicitly state a license. It references GPT2-Chinese and DialoGPT, which have their own licenses. Compatibility for commercial use or closed-source linking is not specified.
Limitations & Caveats
The project mentions a TODO for "多卡并行训练负载不均衡的问题" (imbalanced load in multi-GPU training). The specified dependencies (transformers==4.2.0, pytorch==1.7.0) are relatively old, which might lead to compatibility issues with newer libraries or hardware.