GPT2-chitchat by yangjianxin1

GPT-2 model for a Chinese chitchat bot, fine-tuned for dialogue

Created 5 years ago
3,013 stars

Top 15.9% on SourcePulse

Project Summary

This repository provides a GPT-2 based model for Chinese chit-chat, inspired by Microsoft's DialoGPT. It offers pre-trained models and code for training, preprocessing, and interacting with a Chinese conversational AI, targeting developers and researchers interested in Chinese dialogue systems.

How It Works

The model is built on Hugging Face's transformers library and uses the GPT-2 architecture. For training, the utterances of each multi-turn dialogue are concatenated into a single sequence delimited by the special tokens [CLS] and [SEP], and the model is trained autoregressively on that sequence. Generation uses temperature scaling, top-k sampling, and nucleus (top-p) sampling to control output diversity and quality. The implementation simplifies DialoGPT's MMI (Maximum Mutual Information) generation method for faster inference.
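
The concatenation step can be illustrated with a minimal sketch. It assumes a layout of the form [CLS] u1 [SEP] u2 [SEP] ... un [SEP] and uses a toy character-level vocabulary; the function names (build_vocab, encode_dialogue, shift_for_lm) and the token IDs are hypothetical and are not taken from the repository, which would normally rely on a tokenizer's vocabulary file.

```python
from typing import Dict, List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def build_vocab(dialogues: List[List[str]]) -> Dict[str, int]:
    """Toy character-level vocabulary (assumption: a real setup loads a tokenizer vocab)."""
    vocab = {CLS: 0, SEP: 1}
    for dialogue in dialogues:
        for utterance in dialogue:
            for ch in utterance:
                vocab.setdefault(ch, len(vocab))
    return vocab


def encode_dialogue(dialogue: List[str], vocab: Dict[str, int]) -> List[int]:
    """Concatenate all turns as [CLS] u1 [SEP] u2 [SEP] ... un [SEP]."""
    ids = [vocab[CLS]]
    for utterance in dialogue:
        ids.extend(vocab[ch] for ch in utterance)
        ids.append(vocab[SEP])
    return ids


def shift_for_lm(ids: List[int]) -> Tuple[List[int], List[int]]:
    """Autoregressive training pairs: the token at position t predicts token t + 1."""
    return ids[:-1], ids[1:]


dialogue = ["你好", "你好呀", "在吗"]
vocab = build_vocab([dialogue])
ids = encode_dialogue(dialogue, vocab)
inputs, labels = shift_for_lm(ids)
print(ids)
```

Because the whole dialogue is one sequence, every reply is conditioned on all earlier turns without any extra bookkeeping; the [SEP] token doubles as the end-of-utterance signal during generation.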

Quick Start & Requirements

  • Install: Clone the repository and install dependencies with pip install -r requirements.txt. The full requirements are not listed here; the versions mentioned are transformers==4.2.0 and pytorch==1.7.0.
  • Prerequisites: Python 3.6+, PyTorch 1.7.0+.
  • Usage: Download pre-trained models from the "Model Sharing" ("模型分享") section. Run an interactive chat with python interact.py --model_path <path_to_model>. CPU inference is supported via --no_cuda.
  • Resources: Pre-trained models are available via Baidu Netdisk or Google Drive. Training requires significant computational resources.
  • Links: Model Sharing (password: ju6m), 50w Corpus (password: 4g5e), 100w Corpus (password: s908).

Highlighted Details

  • Implements MMI ideas from DialoGPT, simplified for speed.
  • Supports various generation sampling methods (Temperature, Top-k, Nucleus).
  • Includes detailed Chinese comments within the code.
  • Offers pre-processed datasets of multi-turn dialogues: 50w (500,000) and 100w (1,000,000), where "w" abbreviates 万 (10,000).
  • Provides pre-trained model weights (e.g., model_epoch40_50w).
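
The sampling methods listed above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the repository's code: filter_logits and sample_next are hypothetical names, the logits are made up, and a real implementation would operate on PyTorch tensors rather than lists.

```python
import math
import random
from typing import List, Optional

NEG_INF = float("-inf")


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax; entries equal to -inf get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_logits(logits: List[float], top_k: int = 0, top_p: float = 0.0) -> List[float]:
    """Mask logits outside the top-k set and/or the nucleus (top-p) set with -inf."""
    filtered = list(logits)
    if top_k > 0:
        k = min(top_k, len(filtered))
        threshold = sorted(filtered, reverse=True)[k - 1]
        filtered = [x if x >= threshold else NEG_INF for x in filtered]
    if top_p > 0.0:
        probs = softmax(filtered)
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        keep, cumulative = set(), 0.0
        for i in order:
            # Keep the smallest set of tokens whose total probability reaches top_p.
            keep.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        filtered = [x if i in keep else NEG_INF for i, x in enumerate(filtered)]
    return filtered


def sample_next(logits: List[float], temperature: float = 1.0, top_k: int = 0,
                top_p: float = 0.0, rng: Optional[random.Random] = None) -> int:
    """Scale by temperature, filter, then sample one token index."""
    rng = rng or random.Random()
    scaled = [x / temperature for x in logits]
    probs = softmax(filter_logits(scaled, top_k, top_p))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
token = sample_next(logits, temperature=0.8, top_k=3, top_p=0.9, rng=random.Random(0))
```

Lower temperatures sharpen the distribution before filtering, while top-k and top-p bound how far into the tail the sampler may reach; the two filters can be combined, as in the final call.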

Maintenance & Community

The project author has released several related Chinese NLP models, including Firefly (a conversational LLM), LLMPruner, OFA-Chinese, CLIP-Chinese, ClipCap-Chinese, and CPM Chinese text generation. No specific community links (Discord, Slack) are provided.

Licensing & Compatibility

The repository does not explicitly state a license. It references GPT2-Chinese and DialoGPT, which have their own licenses. Compatibility for commercial use or closed-source linking is not specified.

Limitations & Caveats

The project mentions a TODO for "多卡并行训练负载不均衡的问题" (imbalanced load in multi-GPU training). The specified dependencies (transformers==4.2.0, pytorch==1.7.0) are relatively old, which might lead to compatibility issues with newer libraries or hardware.

Health Check

  • Last Commit: 1 year ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0
  • Star History: 1 star in the last 30 days
