GPT2-NewsTitle by liucongg

News title generator with GPT2, plus Streamlit web UI

created 4 years ago
1,115 stars

Top 35.0% on sourcepulse

Project Summary

This project provides a Chinese news title generation system built around a GPT-2 model trained for the task. It targets developers and researchers interested in text generation, offering a complete workflow from data preparation and model training through deployment via a web interface, with the aim of demystifying how an end-to-end GPT-2 generation system is built.

How It Works

The project builds on HuggingFace's transformers library to implement and train a GPT-2 model. A customized GPT-2 model class modifies the loss calculation so that only the title portion of each article-plus-title sequence contributes to the loss, rather than the full text. Training uses a custom dataset of Chinese news articles paired with their titles, with a 6-layer GPT-2 model trained from random initialization.
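The repository implements this with its own model class; as a rough illustration only, the sketch below shows one common way to restrict the language-modeling loss to the title span using the stock GPT2LMHeadModel: article positions are set to -100 in labels, which PyTorch's cross-entropy ignores. The 6-layer config mirrors the model described above; the sequence layout and index names are hypothetical, not the project's exact code.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Rough sketch (not the repository's exact implementation): train so that
# only the title tokens contribute to the loss. Label positions set to -100
# are ignored by the cross-entropy loss inside GPT2LMHeadModel.
config = GPT2Config(n_layer=6)                 # the provided checkpoint uses 6 layers
model = GPT2LMHeadModel(config)                # random initialization, as in the project

# Hypothetical batch: [article tokens ... title tokens] packed into one sequence.
input_ids = torch.randint(0, config.vocab_size, (2, 64))
title_start = 48                               # hypothetical index where the title begins

labels = input_ids.clone()
labels[:, :title_start] = -100                 # mask the article part out of the loss

outputs = model(input_ids=input_ids, labels=labels)
loss = outputs[0]                              # first element is the LM loss when labels are given
loss.backward()
```

Masking rather than truncating keeps the full article as conditioning context while the gradient only rewards reproducing the title.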

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Run Web UI: streamlit run app.py
  • Prerequisites: Python 3.6+, transformers==3.0.2, Flask==0.12.2, gevent==1.3a1. Requires downloading datasets and model weights from the provided Baidu Netdisk links; a minimal generation sketch follows this list.
  • Setup Time: Downloading datasets and model weights can take significant time. Training from scratch requires substantial GPU resources and time.
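For programmatic use outside the Streamlit UI, the hedged sketch below loads a downloaded checkpoint and samples a title with the transformers API. The checkpoint path, tokenizer choice (many Chinese GPT-2 projects reuse a BERT word-piece vocabulary), and decoding settings are assumptions, not the repository's documented interface; its own scripts drive generation with their own arguments.

```python
import torch
from transformers import BertTokenizer, GPT2LMHeadModel

# Hypothetical usage sketch: paths and decoding settings are illustrative,
# not the project's documented CLI.
model_dir = "path/to/downloaded_checkpoint"            # placeholder for the Baidu Netdisk weights
tokenizer = BertTokenizer.from_pretrained(model_dir)   # assumes a BERT-style vocab.txt ships with the weights
model = GPT2LMHeadModel.from_pretrained(model_dir).eval()

article = "……新闻正文……"                                # the news article to turn into a title
input_ids = tokenizer.encode(article, return_tensors="pt",
                             truncation=True, max_length=480)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=input_ids.size(1) + 32,             # leave room for a short title
        do_sample=True, top_k=5, top_p=0.95,           # sampling settings are illustrative
        pad_token_id=tokenizer.pad_token_id,
    )

# Decode only the newly generated tokens and drop word-piece spacing in the Chinese output.
title = tokenizer.decode(output[0, input_ids.size(1):].tolist(), skip_special_tokens=True)
print(title.replace(" ", ""))
```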

Highlighted Details

  • Includes a Streamlit-based web UI for easy deployment and interaction.
  • Provides a comprehensive dataset collection for Chinese news summarization.
  • Offers detailed code comments and explanations for the training and generation pipeline.
  • The provided model is a small, 6-layer GPT-2 trained for only 5 epochs from random initialization, and the author describes the resulting quality as mediocre.

Maintenance & Community

The project was last updated on February 19, 2022. Contact information for the author (Cong Liu) is provided via email and Zhihu.

Licensing & Compatibility

The repository does not explicitly state a license. The project uses components from other open-source projects, and users should verify compatibility for commercial or closed-source applications.

Limitations & Caveats

The provided GPT-2 model is small and undertrained (6 layers, 5 epochs from random initialization), so generated titles are of only modest quality. The project also relies on external Baidu Netdisk links for datasets and model weights, which may become unavailable over time.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 90 days

Explore Similar Projects

Starred by George Hotz (author of tinygrad; founder of the tiny corp and comma.ai) and Ross Taylor (cofounder of General Reasoning; creator of Papers with Code).

GPT2 by ConnorJL

GPT2 training implementation, supporting TPUs and GPUs

created 6 years ago, updated 2 years ago
1k stars