GPT2-Summary by qingkongzhiqian

Chinese summary generation model based on GPT2

Created 5 years ago

405 stars

Top 71.8% on SourcePulse

Project Summary

This project provides a Chinese text summarization model based on the GPT-2 architecture. It targets researchers and developers working on Chinese NLP tasks who need a pre-trained model for generating concise summaries of longer texts. Its main benefit is a readily available, fine-tuned GPT-2 model for Chinese summarization.

How It Works

The model builds on the GPT-2 architecture, specifically by fine-tuning the GPT2-Chinese and GPT2-chitchat models. Each training example is a single sequence formed by concatenating a source text with its summary, and the model is trained on these sequences. This adapts GPT-2's generative capabilities to the specific task of Chinese text summarization.
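The concatenation step can be illustrated with a minimal sketch. The `[CLS]`/`[SEP]` layout, the character-level vocabulary, and the helper names `build_vocab` and `encode_pair` are all assumptions made for illustration; the project's actual tokenizer and sequence format may differ.

```python
from typing import Dict, List

# Hypothetical special tokens; the project's real vocabulary may differ.
CLS, SEP, UNK = "[CLS]", "[SEP]", "[UNK]"


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Build a toy character-level vocabulary (illustrative only)."""
    vocab = {CLS: 0, SEP: 1, UNK: 2}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode_pair(text: str, summary: str, vocab: Dict[str, int],
                max_len: int = 64) -> List[int]:
    """Encode one example as [CLS] text [SEP] summary [SEP].

    The article is truncated so the summary stays intact. This assumes
    the summary itself fits within max_len - 3 tokens.
    """
    text_ids = [vocab.get(ch, vocab[UNK]) for ch in text]
    summary_ids = [vocab.get(ch, vocab[UNK]) for ch in summary]
    budget = max_len - len(summary_ids) - 3  # room for CLS and two SEPs
    text_ids = text_ids[:max(budget, 0)]
    return [vocab[CLS]] + text_ids + [vocab[SEP]] + summary_ids + [vocab[SEP]]


if __name__ == "__main__":
    article, summary = "今天天气很好", "天气好"
    vocab = build_vocab([article, summary])
    print(encode_pair(article, summary, vocab))
```

Under this layout, a language model trained on such sequences learns to continue an article with its summary, so at inference time the article plus the first separator serves as the prompt.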

Quick Start & Requirements

Install: pip install transformers==2.1.1 torch==1.3.1
Prerequisites: Python 3.6, PyTorch 1.3.1, Transformers 2.1.1.
Model Download: Requires downloading pre-trained weights from Baidu Netdisk links provided in the README.
Usage: Place downloaded model parameters into the summary_model folder and run interact.py.
Links:
- GPT2-Chinese: https://github.com/Morizeyao/GPT2-Chinese
- GPT2-chitchat: https://github.com/yangjianxin1/GPT2-chitchat
- Project Workflow: https://zhuanlan.zhihu.com/p/113869509

Highlighted Details

Fine-tuned on NLPCC summarization data (news corpus).
Uses a "sequential" concatenation method for training data.
Offers two pre-trained models: GPT2-nlpcc-summary and GPT2-wiki.
Interactive prediction allows pasting text directly into the console.

Maintenance & Community

The project appears to be a personal or academic endeavor with no explicit mention of active maintenance, community channels (like Discord/Slack), or a roadmap. It acknowledges contributions from the GPT2-Chinese and GPT2-chitchat projects.

Licensing & Compatibility

The README does not explicitly state a license. Given its reliance on GPT2-Chinese and GPT2-chitchat, users should verify the licenses of those underlying projects for compatibility, especially for commercial use.

Limitations & Caveats

The project is presented as an exploration and may not be suitable for production without retraining both the general model and the summary model. The README notes that summarization quality might be better in vertical domains than on general news. Because decoding is sampling-based, generated summaries can vary between runs on the same input.
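To show why sampling-based decoding yields varying output, here is a minimal sketch of one sampling step. Top-k sampling with a temperature is assumed here purely for illustration; the source does not state which sampling strategy or parameters the project uses, and `top_k_sample` is a hypothetical helper.

```python
import math
import random
from typing import List


def top_k_sample(logits: List[float], k: int, temperature: float,
                 rng: random.Random) -> int:
    """Sample one token index from the k highest-scoring logits."""
    scaled = [x / temperature for x in logits]
    # Indices of the k largest scaled logits.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # Softmax weights over the kept logits (shifted by the max for stability).
    m = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.5, 0.1, -1.0]  # toy scores for a 4-token vocabulary
    rng = random.Random()
    print([top_k_sample(logits, k=2, temperature=1.0, rng=rng)
           for _ in range(10)])
```

With `k=1` this reduces to greedy decoding and the output is deterministic; with a larger `k`, repeated runs on the same input can pick different tokens, which is the source of the variation noted above.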

Health Check

Last Commit

2 years ago

Responsiveness

Inactive

Star History

0 stars in the last 30 days