News title generator with GPT-2, plus a Streamlit web UI
This project provides a Chinese news title generation system built on a fine-tuned GPT-2 model. It is aimed at developers and researchers interested in text generation, offering a complete workflow from data preparation and model training to deployment via a web interface, with the goal of demystifying the end-to-end process of building a GPT-2 generation model.
How It Works
The project leverages HuggingFace's transformers library to implement and train a GPT-2 model. A custom GPT-2 model class modifies the loss calculation so that the loss is computed only over the title tokens rather than the full article. Training uses a custom dataset of Chinese news articles paired with their titles, and a 6-layer GPT-2 model is trained from random initialization.
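To illustrate that title-only loss, the sketch below subclasses transformers' GPT2LMHeadModel and masks the cross-entropy so that only title positions contribute. The `title_mask` tensor and the exact forward signature are assumptions made for this example, not the repository's actual code.

```python
import torch.nn.functional as F
from transformers import GPT2Config, GPT2LMHeadModel


class TitleGPT2(GPT2LMHeadModel):
    """GPT-2 LM head whose loss is computed only on title tokens.

    `title_mask` is a hypothetical tensor (1 at title positions, 0 at
    article/prompt positions) supplied by the data pipeline.
    """

    def forward(self, input_ids, attention_mask=None, labels=None, title_mask=None, **kwargs):
        outputs = super().forward(input_ids=input_ids, attention_mask=attention_mask, **kwargs)
        # Handle both tuple (older transformers) and ModelOutput return types.
        logits = outputs[0] if isinstance(outputs, tuple) else outputs.logits
        loss = None
        if labels is not None and title_mask is not None:
            # Standard causal shift: position t predicts token t+1.
            shift_logits = logits[:, :-1, :].contiguous()
            shift_labels = labels[:, 1:].contiguous()
            shift_mask = title_mask[:, 1:].contiguous().float()
            per_token = F.cross_entropy(
                shift_logits.view(-1, shift_logits.size(-1)),
                shift_labels.view(-1),
                reduction="none",
            )
            # Average the loss over title positions only, ignoring the article body.
            loss = (per_token * shift_mask.view(-1)).sum() / shift_mask.sum().clamp(min=1.0)
        return loss, logits


# A 6-layer GPT-2 built from a fresh config, i.e. random initialization.
model = TitleGPT2(GPT2Config(n_layer=6))
```

Masking the loss this way lets the article body serve purely as conditioning context while gradient updates come only from the title the model is asked to produce.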
Quick Start & Requirements
pip install -r requirements.txt
streamlit run app.py
Key dependencies: transformers==3.0.2, Flask==0.12.2, gevent==1.3a1. Datasets and model weights must be downloaded from the provided Baidu Netdisk links.
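For a concrete picture of the web UI, here is a minimal Streamlit sketch that loads a fine-tuned checkpoint and generates a title from a pasted article. The `output_dir/` path, the use of BertTokenizer for the Chinese vocabulary, and the sampling settings are assumptions for illustration; the repository's own app.py and generation script may differ.

```python
# app.py -- minimal sketch of the Streamlit front end
import streamlit as st
import torch
from transformers import BertTokenizer, GPT2LMHeadModel

MODEL_DIR = "output_dir/"  # hypothetical path; real weights come from the Baidu Netdisk links

# Load the fine-tuned checkpoint and vocabulary.
tokenizer = BertTokenizer.from_pretrained(MODEL_DIR)
model = GPT2LMHeadModel.from_pretrained(MODEL_DIR)
model.eval()

st.title("GPT-2 news title generator")
article = st.text_area("Paste the news body here")

if st.button("Generate title") and article:
    # Encode the article as the prompt, truncated to fit the context window.
    input_ids = tokenizer.encode(article, return_tensors="pt",
                                 truncation=True, max_length=512)
    with torch.no_grad():
        output = model.generate(input_ids,
                                max_length=input_ids.size(1) + 32,  # room for the title
                                do_sample=True, top_k=5, top_p=0.95,
                                repetition_penalty=1.2)
    # Keep only the tokens generated after the article prompt.
    title = tokenizer.decode(output[0, input_ids.size(1):], skip_special_tokens=True)
    st.write(title.replace(" ", ""))  # drop the spaces a BERT-style tokenizer inserts between Chinese characters
```

In a real deployment the model load would be cached (for example with Streamlit's resource caching) so it is not repeated on every interaction; it is kept inline here for brevity.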
Maintenance & Community
The project was last updated on February 19, 2022. The author (Cong Liu) can be reached by email and on Zhihu.
Licensing & Compatibility
The repository does not explicitly state a license. The project uses components from other open-source projects, and users should verify compatibility for commercial or closed-source applications.
Limitations & Caveats
The provided GPT-2 model is small and undertrained (5 epochs from random initialization), leading to generally average performance. The project relies on external Baidu Netdisk links for datasets and model weights, which may be subject to availability.