GPT2-NewsTitle by liucongg

News title generator with GPT2, plus Streamlit web UI

created 4 years ago
1,115 stars

Top 35.0% on sourcepulse

Project Summary

This project provides a Chinese news title generation system built around a GPT-2 model trained for the task. It targets developers and researchers interested in text generation, offering a complete workflow from data preparation and model training through deployment via a web interface, with the aim of demystifying how an end-to-end GPT-2 generation system is built.

How It Works

The project builds on HuggingFace's transformers library to implement and train a GPT-2 model. A customized GPT-2 model class modifies the loss calculation so that only the title portion of each article-plus-title sequence contributes to the loss, rather than the full text. Training uses a custom dataset of Chinese news articles paired with their titles, with a 6-layer GPT-2 model trained from random initialization.
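The repository implements this with its own model class; as a rough illustration only, the sketch below shows one common way to restrict the language-modeling loss to the title span using the stock GPT2LMHeadModel: article positions are set to -100 in labels, which PyTorch's cross-entropy ignores. The 6-layer config mirrors the model described above; the sequence layout and index names are hypothetical, not the project's exact code.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Rough sketch (not the repository's exact implementation): train so that
# only the title tokens contribute to the loss. Label positions set to -100
# are ignored by the cross-entropy loss inside GPT2LMHeadModel.
config = GPT2Config(n_layer=6)                 # the provided checkpoint uses 6 layers
model = GPT2LMHeadModel(config)                # random initialization, as in the project

# Hypothetical batch: [article tokens ... title tokens] packed into one sequence.
input_ids = torch.randint(0, config.vocab_size, (2, 64))
title_start = 48                               # hypothetical index where the title begins

labels = input_ids.clone()
labels[:, :title_start] = -100                 # mask the article part out of the loss

outputs = model(input_ids=input_ids, labels=labels)
loss = outputs[0]                              # first element is the LM loss when labels are given
loss.backward()
```

Masking rather than truncating keeps the full article as conditioning context while the gradient only rewards reproducing the title.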

Quick Start & Requirements

  • Install: pip install -r requirements.txt
  • Run Web UI: streamlit run app.py
  • Prerequisites: Python 3.6+, transformers==3.0.2, Flask==0.12.2, gevent==1.3a1. Requires downloading datasets and model weights from the provided Baidu Netdisk links; a minimal generation sketch follows this list.
  • Setup Time: Downloading datasets and model weights can take significant time. Training from scratch requires substantial GPU resources and time.
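For programmatic use outside the Streamlit UI, the hedged sketch below loads a downloaded checkpoint and samples a title with the transformers API. The checkpoint path, tokenizer choice (many Chinese GPT-2 projects reuse a BERT word-piece vocabulary), and decoding settings are assumptions, not the repository's documented interface; its own scripts drive generation with their own arguments.

```python
import torch
from transformers import BertTokenizer, GPT2LMHeadModel

# Hypothetical usage sketch: paths and decoding settings are illustrative,
# not the project's documented CLI.
model_dir = "path/to/downloaded_checkpoint"            # placeholder for the Baidu Netdisk weights
tokenizer = BertTokenizer.from_pretrained(model_dir)   # assumes a BERT-style vocab.txt ships with the weights
model = GPT2LMHeadModel.from_pretrained(model_dir).eval()

article = "……新闻正文……"                                # the news article to turn into a title
input_ids = tokenizer.encode(article, return_tensors="pt",
                             truncation=True, max_length=480)

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=input_ids.size(1) + 32,             # leave room for a short title
        do_sample=True, top_k=5, top_p=0.95,           # sampling settings are illustrative
        pad_token_id=tokenizer.pad_token_id,
    )

# Decode only the newly generated tokens and drop word-piece spacing in the Chinese output.
title = tokenizer.decode(output[0, input_ids.size(1):].tolist(), skip_special_tokens=True)
print(title.replace(" ", ""))
```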

Highlighted Details

  • Includes a Streamlit-based web UI for easy deployment and interaction.
  • Provides a comprehensive dataset collection for Chinese news summarization.
  • Offers detailed code comments and explanations for the training and generation pipeline.
  • The provided model is a small, 6-layer GPT-2 trained for only 5 epochs from random initialization, and the author describes the resulting quality as mediocre.

Maintenance & Community

The project was last updated on February 19, 2022. Contact information for the author (Cong Liu) is provided via email and Zhihu.

Licensing & Compatibility

The repository does not explicitly state a license. The project uses components from other open-source projects, and users should verify compatibility for commercial or closed-source applications.

Limitations & Caveats

The provided GPT-2 model is small and undertrained (6 layers, 5 epochs from random initialization), so generated titles are of only modest quality. The project also relies on external Baidu Netdisk links for datasets and model weights, which may become unavailable over time.

Health Check

  • Last commit: 3 years ago
  • Responsiveness: 1 week
  • Pull Requests (30d): 0
  • Issues (30d): 0

Star History

  • 8 stars in the last 90 days

Explore Similar Projects

Starred by George Hotz (author of tinygrad; founder of the tiny corp and comma.ai) and Ross Taylor (cofounder of General Reasoning; creator of Papers with Code).

GPT2 by ConnorJL

GPT2 training implementation, supporting TPUs and GPUs

created 6 years ago, updated 2 years ago
1k stars